CARING-AI: Towards Authoring Context-aware Augmented Reality INstruction through Generative Artificial Intelligence

Jingyu Shi*, Rahul Jain*, Seung-gun Chi*, Hyungjun Doh, Hyung-gun Chi, Alexander J. Quinn and Karthik Ramani
CHI Conference on Human Factors in Computing Systems (CHI ’25)

Context-aware AR instruction enables adaptive and in-situ learning experiences. However, hardware limitations and expertise requirements constrain the creation of such instructions. With recent developments in Generative Artificial Intelligence...

Multi-Modal Representation Learning with Tactile Data

Hyung-Gun Chi, Jose Barreiros, Jean Mercat, Karthik Ramani, Thomas Kollar
In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Advancements in embodied language models like PALM-E and RT-2 have significantly enhanced language-conditioned robotic manipulation. However, these advances remain predominantly focused on vision and language, often overlooking the pivotal role of...

M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models

Seunggeun Chi*, Hyung-gun Chi*, Hengbo Ma, Nakul Agarwal, Faizan Siddiqui, Karthik Ramani, Kwonjoon Lee
In European Conference on Computer Vision, 2024.

We introduce the Multi-Motion Discrete Diffusion Models (M2D2M), a novel approach for human motion generation from textual descriptions of multiple actions, utilizing the strengths of discrete diffusion models. This approach adeptly addresses the...

AdamsFormer for Spatial Action Localization in the Future

Hyung-Gun Chi, Kwonjoon Lee, Nakul Agarwal, Yi Xu, Karthik Ramani, Chiho Choi
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Predicting future action locations is vital for applications like human-robot collaboration. While some computer vision tasks have made progress in predicting human actions, accurately localizing these actions in future frames remains an area with...

Pose Relation Transformer: Refine Occlusions for Human Pose Estimation

Hyung-gun Chi, Seung-gun Chi, Stanley Chan, Karthik Ramani
In 2023 IEEE International Conference on Robotics and Automation (ICRA)

Accurately estimating the human pose is an essential task for many applications in robotics. However, existing pose estimation methods suffer from poor performance when occlusion occurs. Recent advances in NLP have been very successful in...

InfoGCN: Representation Learning for Human Skeleton-based Action Recognition

Hyung-gun Chi, Myoung Hoon Ha, Seunggeun Chi, Sang Wan Lee, Qixing Huang, and Karthik Ramani
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

Human skeleton-based action recognition offers a valuable means to understand the intricacies of human behavior because it can handle the complex relationships between physical constraints and intention. Although several studies have focused on...

First-Person View Hand Segmentation of Multi-Modal Hand Activity Video Dataset

Sangpil Kim, Hyung-gun Chi, Xiao Hu, Anirudh Vegesana, Karthik Ramani
In Proceedings of the 31st British Machine Vision Conference (BMVC)

First-person-view videos of hands interacting with tools are widely used in the computer vision industry. However, creating a dataset with pixel-wise segmentation of hands is challenging since most videos are captured with fingertips...