ARnnotate: An Augmented Reality Interface for Collecting Custom Dataset of 3D Hand-Object Interaction Pose Estimation

by Xun Qian | Nov 10, 2022

Authors: Xun Qian, Fengming He, Xiyun Hu, Tianyi Wang, Karthik Ramani

In Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology (pp. 1-14)

https://doi.org/10.1145/3526113.3545663

Vision-based 3D pose estimation has substantial potential in hand-object interaction applications and requires user-specified datasets to achieve robust performance. We propose ARnnotate, an Augmented Reality (AR) interface enabling end-users to create custom data using a hand-tracking-capable AR device. Unlike other dataset collection strategies, ARnnotate first guides a user to manipulate a virtual bounding box and records its poses and the user’s hand joint positions as the labels. Then, leveraging the spatial awareness of AR, the user manipulates the corresponding physical object while following the in-situ AR animation of the bounding box and hand model, and ARnnotate captures the user’s first-person view as the images of the dataset. A 12-participant user study was conducted, and the results demonstrated the system’s usability in terms of the spatial accuracy of the labels, the satisfactory performance of the deep neural networks trained with the data collected by ARnnotate, and the users’ subjective feedback.
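
The two-stage idea in the abstract (labels recorded first, images captured later while the user follows the replayed animation) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `LabelFrame`, `Sample`, and `pair_images_with_labels`, the quaternion layout, and the nearest-timestamp matching rule are all assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # assumed (w, x, y, z) order


@dataclass
class LabelFrame:
    """Hypothetical label recorded while the user moves the virtual box."""
    timestamp: float          # seconds on the animation timeline
    box_position: Vec3        # bounding-box position
    box_rotation: Quat        # bounding-box orientation
    hand_joints: List[Vec3]   # tracked hand joint positions


@dataclass
class Sample:
    """Hypothetical dataset entry: one first-person image plus its label."""
    image_path: str
    label: LabelFrame


def nearest_label(labels: List[LabelFrame], t: float) -> LabelFrame:
    # Assumed matching rule: pick the label closest in time to the image.
    return min(labels, key=lambda frame: abs(frame.timestamp - t))


def pair_images_with_labels(
    images: List[Tuple[str, float]], labels: List[LabelFrame]
) -> List[Sample]:
    # Each image is (path, time on the replayed animation timeline).
    return [Sample(path, nearest_label(labels, t)) for path, t in images]


if __name__ == "__main__":
    identity = (1.0, 0.0, 0.0, 0.0)
    labels = [
        LabelFrame(0.0, (0.0, 0.0, 0.5), identity, [(0.0, 0.0, 0.4)]),
        LabelFrame(0.5, (0.1, 0.0, 0.5), identity, [(0.1, 0.0, 0.4)]),
        LabelFrame(1.0, (0.2, 0.0, 0.5), identity, [(0.2, 0.0, 0.4)]),
    ]
    images = [("img_000.png", 0.1), ("img_001.png", 0.6), ("img_002.png", 0.95)]
    for sample in pair_images_with_labels(images, labels):
        print(sample.image_path, sample.label.timestamp)
```

Matching by timestamp reflects only the general point that the labels exist before the images are taken; how ARnnotate actually synchronizes the two is described in the paper.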

Xun Qian

Xun Qian has been a Ph.D. student in the School of Mechanical Engineering at Purdue University since Fall 2018. Before joining the C Design Lab, he received his Master's degree in Mechanical Engineering from Cornell University and his Bachelor's degree in Mechanical Engineering from the University of Science and Technology Beijing. His current research interests lie in the development of novel human-computer interactions leveraging AR/VR/MR, Deep Learning, and Cloud Computing. For more details, please visit his personal website at xun-qian.com