M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models

Seunggeun Chi*, Hyung-gun Chi*, Hengbo Ma, Nakul Agarwal, Faizan Siddiqui, Karthik Ramani, Kwonjoon Lee
In Proceedings of the European Conference on Computer Vision (ECCV), 2024.

We introduce Multi-Motion Discrete Diffusion Models (M2D2M), a novel approach for generating human motion from textual descriptions of multiple actions that leverages the strengths of discrete diffusion models. This approach adeptly addresses the...
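As a rough illustration of the general technique the paper builds on, the sketch below shows the sampling loop of an absorbing-state discrete diffusion model over a codebook of motion tokens. It is a minimal, hypothetical rendering, not the paper's implementation; the tiny denoiser, vocabulary size, and masking schedule are all placeholder assumptions.

```python
import torch
import torch.nn as nn

VOCAB = 512   # size of the motion-token codebook (assumed value)
MASK = VOCAB  # extra absorbing [MASK] token id
T = 20        # number of diffusion steps (assumed)
SEQ = 64      # motion length in tokens (assumed)

class TinyDenoiser(nn.Module):
    """Placeholder for the text-conditioned denoising network."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB + 1, 128)
        self.head = nn.Linear(128, VOCAB)

    def forward(self, tokens, text_emb):
        h = self.emb(tokens) + text_emb.unsqueeze(1)  # broadcast text over time
        return self.head(h)  # logits over the real (non-mask) tokens

@torch.no_grad()
def sample(denoiser, text_emb):
    x = torch.full((1, SEQ), MASK, dtype=torch.long)  # fully masked start
    for t in reversed(range(T)):
        logits = denoiser(x, text_emb)
        pred = torch.distributions.Categorical(logits=logits).sample()
        # Re-mask a shrinking random fraction so tokens are revealed gradually.
        still_masked = torch.rand(1, SEQ) < t / T
        x = torch.where(still_masked, torch.full_like(pred, MASK), pred)
    return x

motion_tokens = sample(TinyDenoiser(), torch.randn(1, 128))
print(motion_tokens.shape)  # torch.Size([1, 64])
```

At step t = 0 nothing is re-masked, so the loop always terminates with a fully realized token sequence that a pretrained decoder could map back to motion.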

AdamsFormer for Spatial Action Localization in the Future

Hyung-Gun Chi, Kwonjoon Lee, Nakul Agarwal, Yi Xu, Karthik Ramani, Chiho Choi
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

Predicting future action locations is vital for applications like human-robot collaboration. While computer vision methods have made progress in predicting human actions, accurately localizing those actions in future frames remains an area with...

Pose Relation Transformer: Refine Occlusions for Human Pose Estimation

Hyung-gun Chi, Seunggeun Chi, Stanley Chan, Karthik Ramani
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2023.

Accurately estimating the human pose is an essential task for many applications in robotics. However, existing pose estimation methods suffer from poor performance when occlusion occurs. Recent advances in NLP have been very successful in...
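As an illustration of the idea of borrowing NLP-style masked-token modeling for occluded joints, the following sketch replaces occluded keypoints with a learned mask embedding and lets a transformer encoder regress their coordinates from the visible context. The shapes and architecture are assumptions for exposition, not the paper's model.

```python
import torch
import torch.nn as nn

class JointRefiner(nn.Module):
    def __init__(self, n_joints=17, dim=64):
        super().__init__()
        self.proj = nn.Linear(2, dim)                 # embed (x, y) per joint
        self.mask_tok = nn.Parameter(torch.zeros(dim))
        self.joint_emb = nn.Embedding(n_joints, dim)  # which joint each token is
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, 2)                 # refined (x, y)

    def forward(self, joints, occluded):
        # joints: (B, J, 2); occluded: (B, J) boolean visibility mask
        h = self.proj(joints)
        h = torch.where(occluded.unsqueeze(-1), self.mask_tok.expand_as(h), h)
        h = h + self.joint_emb.weight.unsqueeze(0)    # joint-identity encoding
        return self.head(self.encoder(h))

model = JointRefiner()
joints = torch.rand(2, 17, 2)
occ = torch.rand(2, 17) < 0.3                         # ~30% joints occluded
print(model(joints, occ).shape)  # torch.Size([2, 17, 2])
```

Self-attention lets every occluded joint attend to all visible ones, which is why this formulation degrades gracefully as occlusion grows.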

InfoGCN: Representation Learning for Human Skeleton-based Action Recognition

Hyung-gun Chi, Myoung Hoon Ha, Seunggeun Chi, Sang Wan Lee, Qixing Huang, and Karthik Ramani
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

Human skeleton-based action recognition offers a valuable means to understand the intricacies of human behavior because it can handle the complex relationships between physical constraints and intention. Although several studies have focused on...
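For readers unfamiliar with skeleton-based graph convolutions, here is a minimal single-layer sketch of the general mechanism (not InfoGCN itself): joint features are mixed along bones via a normalized adjacency matrix, then transformed per joint. The toy five-joint chain is an assumed example.

```python
import torch
import torch.nn as nn

class SkeletonGCNLayer(nn.Module):
    def __init__(self, adj, in_dim, out_dim):
        super().__init__()
        a = adj + torch.eye(adj.size(0))      # add self-loops
        d = a.sum(1).pow(-0.5)                # symmetric degree normalization
        self.register_buffer("A", d[:, None] * a * d[None, :])
        self.fc = nn.Linear(in_dim, out_dim)

    def forward(self, x):                     # x: (B, J, C) joint features
        return torch.relu(self.fc(self.A @ x))

# Toy 5-joint chain (e.g., a single limb) with 3-D joint coordinates.
adj = torch.zeros(5, 5)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[i, j] = adj[j, i] = 1
layer = SkeletonGCNLayer(adj, in_dim=3, out_dim=16)
print(layer(torch.rand(2, 5, 3)).shape)  # torch.Size([2, 5, 16])
```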

First-Person View Hand Segmentation of Multi-Modal Hand Activity Video Dataset

Sangpil Kim, Hyung-gun Chi, Xiao Hu, Anirudh Vegesana, Karthik Ramani
In Proceedings of the 31st British Machine Vision Conference (BMVC), 2020.

First-person-view videos of hands interacting with tools are widely used in the computer vision industry. However, creating a dataset with pixel-wise segmentation of hands is challenging since most videos are captured with fingertips...

A Large-scale Annotated Mechanical Components Benchmark for Classification and Retrieval Tasks with Deep Neural Networks

Sangpil Kim*, Hyung-gun Chi*, Xiao Hu, Qixing Huang, Karthik Ramani
In Proceedings of the 16th European Conference on Computer Vision (ECCV), 2020.

We introduce the Mechanical Components Benchmark (MCB), a large-scale annotated dataset of 3D mechanical component models for classification and retrieval tasks. The dataset enables data-driven...
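To make the retrieval task concrete, the sketch below shows the generic protocol such a benchmark supports: embed each 3D component with an encoder, L2-normalize, and rank the gallery by cosine similarity to a query. The flatten-and-project encoder and the point-cloud input format are placeholder assumptions, not the paper's baselines.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(              # stand-in for a real 3-D shape encoder
    nn.Flatten(), nn.Linear(1024 * 3, 128)
)

def embed(point_clouds):              # (N, 1024, 3) sampled surface points
    z = encoder(point_clouds)
    return nn.functional.normalize(z, dim=1)  # unit-norm embeddings

gallery = embed(torch.rand(100, 1024, 3))
query = embed(torch.rand(1, 1024, 3))
scores = query @ gallery.T            # cosine similarity, since normalized
print(scores.argsort(dim=1, descending=True)[0, :5])  # top-5 retrieved ids
```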

Latent transformations neural network for object view synthesis

Sangpil Kim, Nick Winovich, Hyung-Gun Chi, Guang Lin, Karthik Ramani
The Visual Computer, 2019, pp. 1-15.

We propose a fully convolutional conditional generative neural network, the latent transformation neural network, capable of rigid and non-rigid object view synthesis using a lightweight architecture suited for real-time applications and embedded...
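The sketch below is a generic, assumed rendering of a fully convolutional conditional generator for view synthesis: encode the source view, tile a target-viewpoint code across the latent feature map, and decode. It is not the paper's architecture; the layer sizes and the 4-D viewpoint code are illustrative choices.

```python
import torch
import torch.nn as nn

class ViewSynthesisNet(nn.Module):
    def __init__(self, view_dim=4):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.cond = nn.Conv2d(64 + view_dim, 64, 1)   # fuse viewpoint code
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, img, view):
        # img: (B, 3, H, W) source view; view: (B, view_dim) target-view code
        z = self.enc(img)
        v = view[:, :, None, None].expand(-1, -1, z.size(2), z.size(3))
        return self.dec(torch.relu(self.cond(torch.cat([z, v], dim=1))))

net = ViewSynthesisNet()
out = net(torch.rand(2, 3, 64, 64), torch.rand(2, 4))
print(out.shape)  # torch.Size([2, 3, 64, 64])
```

Because every layer is convolutional, the same weights apply at any input resolution, which is what makes such designs attractive for real-time and embedded use.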