CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image

Wonseok Roh, Hwanhee Jung, Jong Wook Kim, Seunggwan Lee, Innfarn Yoo, Andreas Lugmayr, Seunggeun Chi, Karthik Ramani, Sangpil Kim

Recently, generalizable feed-forward methods based on 3D Gaussian Splatting have gained significant attention for their potential to reconstruct 3D scenes using finite resources. These approaches create a 3D radiance field, parameterized by...

First-Person View Hand Segmentation of Multi-Modal Hand Activity Video Dataset

Sangpil Kim, Hyung-gun Chi, Xiao Hu, Anirudh Vegesana, Karthik Ramani
In Proceedings of the 31st British Machine Vision Conference (BMVC)

Abstract: First-person-view videos of hands interacting with tools are widely used in the computer vision industry. However, creating a dataset with pixel-wise segmentation of hands is challenging since most videos are captured with fingertips...

A Large-scale Annotated Mechanical Components Benchmark for Classification and Retrieval Tasks with Deep Neural Networks

Sangpil Kim*, Hyung-gun Chi*, Xiao Hu, Qixing Huang, Karthik Ramani
In Proceedings of the 16th European Conference on Computer Vision (ECCV)

We introduce a large-scale annotated mechanical components benchmark for classification and retrieval tasks named Mechanical Components Benchmark (MCB): a large-scale dataset of 3D objects of mechanical components. The dataset enables data-driven...

Latent transformations neural network for object view synthesis

Sangpil Kim, Nick Winovich, Hyung-Gun Chi, Guang Lin, Karthik Ramani
The Visual Computer (2019): 1-15.

We propose a fully convolutional conditional generative neural network, the latent transformation neural network, capable of rigid and non-rigid object view synthesis using a lightweight architecture suited for real-time applications and embedded...

Learning Hand Articulations by Hallucinating Heat Distribution

Chiho Choi, Sangpil Kim, Karthik Ramani
Proceedings of the IEEE International Conference on Computer Vision, 3104-3113

We propose a robust hand pose estimation method by learning hand articulations from depth features and auxiliary modality features. As an additional modality to depth data, we present a function of geometric properties on the surface of the hand...