Ang Cao
I am a final-year Ph.D. student (2020-) in Computer Science and Engineering (CSE) at the University of Michigan, Ann Arbor, supervised by Prof. Justin Johnson and Prof. JJ (Jeong Joon) Park.
I was very fortunate to be advised by Prof. Andrew Owens.
I work on 3D vision: primarily how to represent, understand and generate the 3D/4D world.
Before that, I was an M.S. student at UMich ECE (2018-2020), and I received my Bachelor's degree from Wuhan University in China (2014-2018).
I am actively looking for full-time/postdoc positions in 2025!
Email  / 
CV  / 
Google Scholar  / 
Twitter  / 
Github
|
|
Publications
Equal contribution is indicated by *; equal contributors are listed in alphabetical order.
|
|
LIFT-GS: Cross-Scene Render-Supervised Distillation for 3D Language Grounding
Ang Cao,
Sergio Arnaud, ...
Justin Johnson,
JJ Park,
Alexander (Sasha) Sax
ArXiv, 2025
ArXiv/
website
We (pre-)train a 3D Visual Language Grounding (3D VLG) model with only 2D supervision, distilling language from 2D foundation models through render supervision.
|
|
Probing Visual Language Priors in VLMs
Ang Cao*,
Tiange Luo*,
Gunhee Lee,
Justin Johnson*,
Honglak Lee*
ArXiv, 2025
ArXiv
How could a generator help the agent? We explore the visual language priors in VLMs by constructing novel question-image-answer triplets from image diffusion models. We also propose Image-DPO to encourage the model to rely more on its visual inputs.
|
|
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Jianing "Jed" Yang,
Alexander Sax,
Kevin J. Liang,
Mikael Henaff,
Hao Tang,
Ang Cao,
Joyce Chai,
Franziska Meier,
Matt Feiszli
CVPR, 2025
ArXiv/
website/
code/
Demo
We present a neat transformer-based pipeline for 3D reconstruction and camera pose estimation, which can reconstruct 3D scenes from 1000+ images in one forward pass at very high speed.
|
|
Meta 3D Gen
Meta GenAI 3DGen Team
Tech report, 2024
Paper
A new state-of-the-art, fast pipeline for text-to-3D asset generation.
|
|
Lightplane: Highly-Scalable Components for Neural 3D Fields
Ang Cao,
Justin Johnson,
Andrea Vedaldi,
David Novotny
3DV, 2025
project page/
ArXiv/
Docs/
code
We investigate "FlashAttention" for NeRF: we present Lightplane Splatter and Lightplane Renderer, a pair of extremely memory-efficient modules that can lift 2D images to 3D and render from, in principle, any 3D hash representation with 4-5 orders of magnitude memory savings.
We show their usage in a set of 3D reconstruction and generation tasks.
|
|
DreamGaussian4D: Generative 4D Gaussian Splatting
Jiawei Ren*,
Liang Pan*,
Jiaxiang Tang,
Chi Zhang,
Ang Cao,
Gang Zeng,
Ziwei Liu
ArXiv, 2023
ArXiv/
Website/
Code
We perform 4D generation with Gaussian Splatting by distilling motion from video diffusion models.
|
|
Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models
Ang Cao*,
Lukas Höllein*,
Andrew Owens,
Justin Johnson,
Matthias Nießner
ICCV (Oral), 2023
project page/
ArXiv/
video/
code
We generate meshes of full 3D rooms using text-to-image models.
|
|
HexPlane: A Fast Representation for Dynamic Scenes
Ang Cao,
Justin Johnson
CVPR, 2023
project page/
ArXiv/
video/
code
An elegant representation for dynamic 3D scenes using six feature planes.
|
|
FWD: Real-time Novel View Synthesis with Forward Warping and Depth
Ang Cao,
Chris Rockwell,
Justin Johnson
CVPR, 2022
project page /
ArXiv /
video /
code
We show that point rasterization can be really fast for sparse-view novel view synthesis.
|
|
Inverting and Understanding Object Detectors
Ang Cao,
Justin Johnson
Tech report, 2021
ArXiv /
code
Revealing intriguing properties of detectors by applying our layout inversion technique.
|
|