Yunhao (Andy) Ge

[2025/03/18] We have released Cosmos Transfer1, a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments!

[2025/02/07] One paper, DreamDistribution for generating diverse customized images, is accepted by ICLR 2025. Code released.

[2025/01/06] We have released Cosmos, a world foundation model platform for physical AI!

[2024/11/11] We have released Edify 3D, a cutting-edge technology for 3D generation!

[2024/11/11] We have released Edify Image, a family of diffusion models for photorealistic image generation!

[2024/7/30] GenUSD is proudly featured in the Real-Time Live! at SIGGRAPH 2024 .

[2024/5/15] One paper, BEHAVIOR Vision Suite for Customizable Dataset Generation, is accepted by CVPR 2024 (Highlight). Code released.

[2024/5/1] One paper, LLM Agent Tool Use for 2D/3D captioning, is accepted by CVPR 2024.

[2023/12/21] We release the paper and code of DreamDistribution for personalized 2D/3D generation.

[2023/12/18] Starting a new journey at NVIDIA Research as a Research Scientist.

[2023/09/23] One paper on 3D Copy-Paste is accepted by NeurIPS 2023.

[2023/07/26] One paper on Lifelong (Continual) Learning is accepted by ICCV 2023.

[2023/05/09] One paper on Shared Knowledge Lifelong Learning, a new Lifelong Learning paradigm, is accepted by TMLR.

[2023/02/27] One paper on Multi-modal models' Robustness and Generalization is accepted by CVPR 2023.

[2022/12/01] Starting a new journey at Stanford Vision and Learning (SVL) Lab as a Visiting Student Researcher, advised by Prof. Jiajun Wu.

[2022/08/16] I was awarded the Amazon ML Fellowship (2022-2023), and will be an Amazon Fellow at USC + Amazon Center on Secure & Trusted Machine Learning. Thank you Amazon!

[2022/08/16] One paper on Disentangled and Convex Representation learning is accepted by WACV 2023, code is coming soon.

[2022/07/03] Two papers on NeRF and Humanoid Neural Network are accepted by ECCV 2022, code are released.

[2022/05/31] I will be joining Google Research as a student researcher, advised by Dr. Jiaping Zhao , Dr. Jie Ren , Dr. Balaji Lakshminarayanan and Prof. Ming-Hsuan Yang.

[2022/01/20] Finally passed my qual exam and officially became a PhD Candidate now.

[2021/08/23] I will be joining Google Cloud AI as a student researcher, advised by Dr. Sercan Arik and Dr. Jinsung Yoon

[2021/07/15] USC News , Tech Xplore , Technology Networks and other media pressed our ICLR 2021 paper: Group-Supervised Learning (Enabling the 'imagination' of artificial intelligence)

[2021/05/17] I will be joining Computer Vision Group at Microsoft Research Redmond as a research intern in summer 2021, advised by Dr. Vibhav Vineet and Dr. Neel Joshi

[2021/04/07] Releasing Img2SceneGraph, a pipeline that transfers images to scene graphs with node attributes! Welcome to Download and try!

[2021/04/02] One paper (Graph Autoencoder for Graph Compression and Representation Learning) was accepted by Neural Compression Workshop @ICLR 2021 as Spotlight!

[2021/02/28] One paper (A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts) was accepted by CVPR 2021!

[2021/01/16] One paper (Beneficial Perturbation Network for designing general adaptive artificial intelligence systems) was accepted by TNNLS!

[2021/01/12] One paper (Zero-shot Synthesis with Group-Supervised Learning) was accepted by ICLR 2021!

[2020/09/14] Fonts dataset was proposed for fast testing and idea iteration on disentangled representation learning and zero-shot synthesis. Welcome to Download and try!

[2020/07/02] One paper (Pose Augmentation: Class-agnostic Object Pose Transformation) was accepted by ECCV 2020!

[2020/05/12] I will be joining UII America as a research intern in summer 2020, advised by Dr. Ziyan Wu and Dr. Srikrishna Karanam

[2019/08/12] I will be joining USC CS Ph.D. Program in fall 2019, advised by Prof. Laurent Itti.

[2019/07/01] One paper (Synthesis and inpainting-based MR-CT registration) was accepted by MICCAI 2019.

[2019/03/01] One paper (Unpaired Whole-Body Mr to CT Synthesis) was accepted by ISBI 2019.

	Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control Core contributor White Paper. [paper] [project page] [code] [huggingface] [video]
	Cosmos: World Foundation Model Platform for Physical AI Core contributor White Paper. [paper] [project page] [code] [huggingface] [video]
	Edify 3D: Scalable High-Quality 3D Asset Generation Core contributor White Paper. [paper] [project page] [video]
	Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models Core contributor White Paper. [paper] [project page] [video]
	GenUSD: 3D scene generation made easy Core contributor SIGGRAPH 2024. [paper] [project page]
	BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation Yunhao Ge^, Yihe Tang^, Jiashu Xu^, Cem Gokmen^, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu (=equal contribution) CVPR 2024* (IEEE Conference on Computer Vision and Pattern Recognition). [paper] [code] [project page] [tools] Highlight
	Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Yunhao Ge, Xiaohui Zeng, Jacob Samuel Huffman, Tsung-Yi Lin, Ming-Yu Liu, Yin Cui CVPR 2024 (IEEE Conference on Computer Vision and Pattern Recognition). [paper] [video] [project page]
	DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models Brian Nlong Zhao, Yuhang Xiao^, Jiashu Xu^, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet^†, Yunhao Ge^† (=co-second author, †=equal contribution) ICLR 2025*. [paper] [code] [project page]
	3D Copy-Paste: Physically-Plausible Object Insertion for Monocular 3D Detection Yunhao Ge, Hong-Xing Yu, Cheng Zhao, Yuliang Guo, Xinyu Huang, Liu Ren, Laurent Itti, Jiajun Wu NeurIPS 2023 (Advances in Neural Information Processing Systems). [paper] [code] [project page]
	DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation Yunhao Ge^, Jiashu Xu^, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet (=equal contribution) arXiv:2206.09592, 2022.* [paper(Beyond Generation)] [paper(DALL-E for Detection)] [code]
	CLR: Channel-wise Lightweight Reprogramming for Continual Learning Yunhao Ge, Yuecheng Li^, Shuo Ni^, Jiaping Zhao, Ming-Hsuan Yang, Laurent Itti (=equal contribution as second author) ICCV 2023* (International Conference on Computer Vision). [paper] [code] [SKILL-102 Dataset]
	Lightweight Learner for Shared Knowledge Lifelong Learning Yunhao Ge, Yuecheng Li^, Di Wu^, Ao Xu^, Adam M. Jones, Amanda Sofie Rios, Iordanis Fostiropoulos, Shixian wen, Po-Hsuan Huang, Zachary William Murdock, Gozde Sahin, Shuo Ni, Kiran Lekkala, Sumedh Anand Sontakke, Laurent Itti (=equal contribution as second author) TMLR (Transactions on Machine Learning Research). [paper] [code] [project page] [SKILL-102 Dataset] [USC Viterbi Press]
	Improving Zero-shot Generalization and Robustness of Multi-modal Models Yunhao Ge^, Jie Ren^, Andrew Gallagher, Yuxiao Wang, Ming-Hsuan Yang, Hartwig Adam, Laurent Itti, Balaji Lakshminarayanan, and Jiaping Zhao (=equal contribution) CVPR 2023* (IEEE/ CVF International Conference on Computer Vision and Pattern Recognition). [paper] [code] [project page]
	Neural-Sim: Learning to Generate Training Data with NeRF Yunhao Ge, Harkirat Behl^, Jiashu Xu^, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, and Vibhav Vineet (=equal contribution as second author) ECCV 2022* (European Conference on Computer Vision). [paper] [code]
	Contributions of Shape, Texture, and Color in Visual Recognition Yunhao Ge^, Yao Xiao^, Zhi Xu, Xingrui Wang, Laurent Itti (=equal contribution) ECCV 2022* (European Conference on Computer Vision). [paper] [code]
	A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts Yunhao Ge, Yao Xiao, Zhi Xu, Meng Zheng, Srikrishna Karanam, Terrence Chen, Laurent Itti and Ziyan Wu CVPR 2021 (IEEE/ CVF International Conference on Computer Vision and Pattern Recognition). [paper] [github] [project page] [video] [知乎] [机器之心] [AI科技评论]
	Zero-shot Synthesis with Group-Supervised Learning Yunhao Ge, Sami Abu-El-Haija, Gan Xin and Laurent Itti ICLR 2021 (International Conference on Learning Representations). [paper] [code] [project page] [Fonts Dataset] [USC Viterbi Press] [知乎] [AI科技评论] [ USC News ] [ Tech Xplore ] [ Technology Networks ]
	Pose Augmentation: Class-agnostic Object Pose Transformation for Object Recognition Yunhao Ge, Jiaping Zhao, Laurent Itti ECCV 2020 (European Conference on Computer Vision). [paper] [github] [video-1min] [video-10min]

Yunhao (Andy) Ge 葛云皓

Research Scientist @ NVIDIA

About Me

News & Updates

Selected Publications [Google Scholar]

Intern & Work Experience

Google Research, Los Angeles, USA (May. 2022 - Dec. 2022)

Google Cloud AI, Mountain View, USA (Aug. 2021 - May 2022)

Microsoft Research, Redmond, USA (May. 2021 - Aug. 2021)

UII America, Inc, Boston, USA (May. 2020 - Aug. 2020)

Flexiv Ltd, Shanghai, China (May. 2019 - Aug. 2019)

United Imaging Intelligence Co., Ltd, Shanghai, China (Jun. 2018 - Apr. 2019)

Scholarships

Academic Service