Ang Cao

I am a final-year Ph.D. student (2020-) in Computer Science and Engineering (CSE) at the University of Michigan, Ann Arbor, supervised by Prof. Justin Johnson and Prof. JJ (Jeong Joon) Park. I was very fortunate to be advised by Prof. Andrew Owens. I work on 3D vision: primarily how to represent, understand and generate the 3D/4D world.

Before that, I was an M.S. student at UMich ECE (2018-2020), and I received my Bachelor's degree from Wuhan University in China (2014-2018).

I am actively looking for full-time/postdoc positions in 2025!

Email  /  CV  /  Google Scholar  /  Twitter  /  Github

Work Experience

Research Scientist Intern, Meta FAIR, MPK, USA

Worked with Sasha Sax and Franziska Meier

May 2024 - December 2024


Research Scientist Intern, Meta GenAI, London, UK

Worked with David Novotny, Andrea Vedaldi, and Natalia Neverova

May 2023 - November 2023


Publications

Equal contribution is indicated by *; equal contributors are listed in alphabetical order.

LIFT-GS: Cross-Scene Render-Supervised Distillation for 3D Language Grounding
Ang Cao, Sergio Arnaud, ... Justin Johnson, JJ Park, Alexander (Sasha) Sax
ArXiv, 2025
ArXiv / website

We (pre-)train a 3D Visual Language Grounding (3D VLG) model with only 2D supervision, distilling language from 2D foundation models via render supervision.

Probing Visual Language Priors in VLMs
Ang Cao*, Tiange Luo*, Gunhee Lee, Justin Johnson*, Honglak Lee*
ArXiv, 2025
ArXiv

How could a generator help the agent? We explore the visual language priors in VLMs by constructing novel question-image-answer triplets from image diffusion models. We also propose Image-DPO to encourage the model to rely more on visual inputs.

Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Jianing "Jed" Yang, Alexander Sax, Kevin J. Liang, Mikael Henaff, Hao Tang, Ang Cao, Joyce Chai, Franziska Meier, Matt Feiszli
CVPR, 2025
ArXiv / website / code / Demo

We present a neat transformer-based pipeline for 3D reconstruction and camera pose estimation, which can reconstruct 3D scenes from 1000+ images in a single forward pass at very high speed.

Meta 3D Gen
Meta GenAI 3DGen Team
Tech report, 2024
Paper

A new state-of-the-art, fast pipeline for text-to-3D asset generation.

Lightplane: Highly-Scalable Components for Neural 3D Fields
Ang Cao, Justin Johnson, Andrea Vedaldi, David Novotny
3DV, 2025
project page / ArXiv / Docs / code

We investigate "FlashAttention" for NeRF: we present Lightplane Splatter and Lightplane Renderer, a pair of extremely memory-efficient modules that can lift 2D images to 3D and render from theoretically any 3D hash representation with 4-5 orders of magnitude memory savings. We show their usage in a set of 3D reconstruction and generation tasks.

DreamGaussian4D: Generative 4D Gaussian Splatting
Jiawei Ren*, Liang Pan*, Jiaxiang Tang, Chi Zhang, Ang Cao, Gang Zeng, Ziwei Liu
ArXiv, 2023
ArXiv / Website / Code

We perform 4D generation with Gaussian Splatting by distilling motion from video diffusion models.

Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models
Ang Cao*, Lukas Höllein*, Andrew Owens, Justin Johnson, Matthias Nießner
ICCV, Oral, 2023
project page / ArXiv / video / code

We generate meshes of full 3D rooms using text-to-image models.

HexPlane: A Fast Representation for Dynamic Scenes
Ang Cao, Justin Johnson
CVPR, 2023
project page / ArXiv / video / code

An elegant representation for dynamic 3D scenes using six feature planes.

FWD: Real-time Novel View Synthesis with Forward Warping and Depth
Ang Cao, Chris Rockwell, Justin Johnson
CVPR, 2022
project page / ArXiv / video / code

We show that point rasterization can be really fast for sparse-view novel view synthesis.

Inverting and Understanding Object Detectors
Ang Cao, Justin Johnson
Tech report, 2021
ArXiv / code

Revealing intriguing properties of detectors by applying our layout inversion technique.

Teaching Experience
EECS 498/598: Computer Graphics and Generative Models (Fall 2024)

Teaching assistant (GSI/Head TA), working with JJ Park


Huge thanks to Jon Barron for the awesome template.