Canvas3D: Empowering Precise Spatial Control for Image Generation with Constraints from a 3D Virtual Canvas

by Yuzhao Chen | Mar 23, 2026

Authors: Yuzhao Chen, Runlin Duan, Rahul Jain, Yichen Hu, Chenfei Zhu, Jingyu Shi, Karthik Ramani

In Proceedings of the 31st International Conference on Intelligent User Interfaces

https://doi.org/10.1145/3742413.3789142

Generative AI (GenAI) has significantly advanced the ease and flexibility of image creation. However, precisely controlling spatial composition, including object arrangement and scene conditions, remains a challenge. To bridge this gap, we propose Canvas3D, an interactive system that leverages a 3D engine to enable precise spatial manipulation for image generation. Given a user prompt, Canvas3D automatically converts the textual description into interactive objects within a virtual canvas driven by a 3D engine, so that users can configure the scene directly and precisely. These user-defined arrangements generate explicit spatial constraints that guide generative models to accurately reflect user intentions in the resulting images. We conducted a closed-ended comparative study between Canvas3D and a baseline system, and an open-ended, free-form study to assess overall system usability. The results indicate that Canvas3D outperforms the baseline on spatial control, interactivity, and overall user experience.
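
To make the idea of turning a 3D arrangement into explicit spatial constraints more concrete, here is a minimal sketch. It is not the Canvas3D implementation: the abstract does not say which form the constraints take, so this example assumes the simplest one, a labeled 2D bounding box per object obtained by projecting its 3D box through a pinhole camera. The names `Box3D`, `project_box`, and `layout_constraints`, along with the camera parameters, are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import Tuple


@dataclass
class Box3D:
    """Axis-aligned box in camera space (hypothetical scene object)."""
    label: str
    center: Tuple[float, float, float]  # x right, y up, z forward from the camera
    size: Tuple[float, float, float]    # full extent along each axis


def _clamp(value, upper):
    """Clamp a pixel coordinate to the range [0, upper]."""
    return max(0.0, min(float(upper), value))


def project_box(box, focal=1.0, width=512, height=512):
    """Project the 8 box corners through a pinhole camera and return
    a pixel-space bounding box (x_min, y_min, x_max, y_max)."""
    us, vs = [], []
    for sx, sy, sz in product((-0.5, 0.5), repeat=3):
        x = box.center[0] + sx * box.size[0]
        y = box.center[1] + sy * box.size[1]
        z = box.center[2] + sz * box.size[2]
        if z <= 0:
            raise ValueError("box must lie fully in front of the camera")
        # Perspective divide, then map normalized coords [-1, 1] to pixels.
        us.append((focal * x / z * 0.5 + 0.5) * width)
        vs.append((0.5 - focal * y / z * 0.5) * height)  # image y grows downward
    return (_clamp(min(us), width), _clamp(min(vs), height),
            _clamp(max(us), width), _clamp(max(vs), height))


def layout_constraints(boxes, **camera):
    """Return one labeled 2D region per object, ordered far to near,
    so a consumer can treat later entries as occluding earlier ones."""
    ordered = sorted(boxes, key=lambda b: b.center[2], reverse=True)
    return [{"label": b.label, "bbox": project_box(b, **camera)} for b in ordered]


if __name__ == "__main__":
    scene = [
        Box3D("cat", (0.0, 0.0, 2.0), (1.0, 1.0, 1.0)),
        Box3D("tree", (1.0, 0.0, 6.0), (1.0, 2.0, 1.0)),
    ]
    for region in layout_constraints(scene):
        print(region)
```

Under this assumption, moving or resizing an object on the canvas changes its projected region, and that region is what a layout-conditioned image generator would receive. A real system could just as well export depth maps, segmentation masks, or lighting parameters instead.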