We introduce the Multi-Motion Discrete Diffusion Models (M2D2M), a novel approach for human motion generation from textual descriptions of multiple actions, utilizing the strengths of discrete diffusion models. This approach adeptly addresses the...
InfoGCN++: Learning Representation by Predicting the Future for Online Human Skeleton-based Action Recognition
Skeleton-based action recognition has made significant advancements recently, with models like InfoGCN showcasing remarkable accuracy. However, these models exhibit a key limitation: they necessitate complete action observation prior to...
AdamsFormer for Spatial Action Localization in the Future
Predicting future action locations is vital for applications like human-robot collaboration. While some computer vision tasks have made progress in predicting human actions, accurately localizing these actions in future frames remains an area with...
Simplification of 3D CAD Model in Voxel Form for Mechanical Parts Using Generative Adversarial Networks
Pose Relation Transformer: Refine Occlusions for Human Pose Estimation
Accurately estimating the human pose is an essential task for many applications in robotics. However, existing pose estimation methods suffer from poor performance when occlusion occurs. Recent advances in NLP have been very successful in...
InfoGCN: Representation Learning for Human Skeleton-based Action Recognition
Human skeleton-based action recognition offers a valuable means to understand the intricacies of human behavior because it can handle the complex relationships between physical constraints and intention. Although several studies have focused on...
Object Synthesis by Learning Part Geometry with Surface and Volumetric Representations
First-Person View Hand Segmentation of Multi-Modal Hand Activity Video Dataset
Abstract: First-person-view videos of hands interacting with tools are widely used in the computer vision industry. However, creating a dataset with pixel-wise segmentation of hands is challenging since most videos are captured with fingertips...
A Large-scale Annotated Mechanical Components Benchmark for Classification and Retrieval Tasks with Deep Neural Networks
We introduce a large-scale annotated mechanical components benchmark for classification and retrieval tasks named Mechanical Components Benchmark (MCB): a large-scale dataset of 3D objects of mechanical components. The dataset enables data-driven...