Yunhao (Andy) Ge 葛云皓
Research Scientist @ NVIDIA
Email: yunhaog at nvidia dot com
I am a Research Scientist at NVIDIA Research, working on Generative AI. I received my Ph.D. in Computer Science from the University of Southern California, advised by Prof. Laurent Itti, and was honored with the Amazon ML Fellowship. I was a Visiting Ph.D. Student at the Stanford Vision and Learning Lab (SVL), advised by Prof. Jiajun Wu. I have broad research interests in Computer Vision and Natural Language Processing, with a recent focus on building foundation models for Multimodal Large Language Models and text-guided 2D/3D generation.
                                             
[2024/05/15] One paper, BEHAVIOR Vision Suite for Customizable Dataset Generation, was accepted to CVPR 2024 (Highlight). Code released.
[2024/05/01] One paper, LLM Agent Tool Use for 2D/3D captioning, was accepted to CVPR 2024.
[2023/12/21] We released the paper and code for DreamDistribution, for personalized 2D/3D generation.
[2023/12/18] Starting a new journey at NVIDIA Research as a Research Scientist.
[2023/09/23] One paper on 3D Copy-Paste was accepted to NeurIPS 2023.
[2023/07/26] One paper on Lifelong (Continual) Learning was accepted to ICCV 2023.
[2023/05/09] One paper on Shared Knowledge Lifelong Learning, a new Lifelong Learning paradigm, was accepted to TMLR.
[2023/02/27] One paper on the Robustness and Generalization of Multi-modal Models was accepted to CVPR 2023.
[2022/12/01] Starting a new journey at the Stanford Vision and Learning Lab (SVL) as a Visiting Student Researcher, advised by Prof. Jiajun Wu.
[2022/08/16] I was awarded the Amazon ML Fellowship (2022-2023) and will be an Amazon Fellow at the USC + Amazon Center on Secure & Trusted Machine Learning. Thank you, Amazon!
[2022/08/16] One paper on Disentangled and Convex Representation Learning was accepted to WACV 2023. Code is coming soon.
[2022/07/03] Two papers, on NeRF and on Humanoid Neural Networks, were accepted to ECCV 2022. Code is released.
[2022/05/31] I will be joining Google Research as a student researcher, advised by Dr. Jiaping Zhao, Dr. Jie Ren, Dr. Balaji Lakshminarayanan, and Prof. Ming-Hsuan Yang.
[2022/01/20] Finally passed my qualifying exam and officially became a Ph.D. Candidate.
[2021/08/23] I will be joining Google Cloud AI as a student researcher, advised by Dr. Sercan Arik and Dr. Jinsung Yoon.
[2021/07/15] USC News, Tech Xplore, Technology Networks, and other media outlets covered our ICLR 2021 paper: Group-Supervised Learning (Enabling the 'imagination' of artificial intelligence).
[2021/05/17] I will be joining the Computer Vision Group at Microsoft Research Redmond as a research intern in summer 2021, advised by Dr. Vibhav Vineet and Dr. Neel Joshi.
[2021/04/07] Releasing Img2SceneGraph, a pipeline that converts images to scene graphs with node attributes! Feel free to download and try it!
[2021/04/02] One paper (Graph Autoencoder for Graph Compression and Representation Learning) was accepted by the Neural Compression Workshop @ ICLR 2021 as a Spotlight!
[2021/02/28] One paper (A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts) was accepted by CVPR 2021!
[2021/01/16] One paper (Beneficial Perturbation Network for designing general adaptive artificial intelligence systems) was accepted by TNNLS!
[2021/01/12] One paper (Zero-shot Synthesis with Group-Supervised Learning) was accepted by ICLR 2021!
[2020/09/14] The Fonts dataset was released for fast testing and idea iteration on disentangled representation learning and zero-shot synthesis. Feel free to download and try it!
[2020/07/02] One paper (Pose Augmentation: Class-agnostic Object Pose Transformation) was accepted by ECCV 2020!
[2020/05/12] I will be joining UII America as a research intern in summer 2020, advised by Dr. Ziyan Wu and Dr. Srikrishna Karanam.
[2019/08/12] I will be joining the USC CS Ph.D. program in fall 2019, advised by Prof. Laurent Itti.
[2019/07/01] One paper (Synthesis and inpainting-based MR-CT registration) was accepted by MICCAI 2019.
[2019/03/01] One paper (Unpaired Whole-Body MR to CT Synthesis) was accepted by ISBI 2019.
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
Yunhao Ge*, Yihe Tang*, Jiashu Xu*, Cem Gokmen*, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu (*=equal contribution)
CVPR 2024 (IEEE Conference on Computer Vision and Pattern Recognition). Highlight
[paper] [code] [project page] [tools]

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
Yunhao Ge, Xiaohui Zeng, Jacob Samuel Huffman, Tsung-Yi Lin, Ming-Yu Liu, Yin Cui
CVPR 2024 (IEEE Conference on Computer Vision and Pattern Recognition).
[paper] [video] [project page]

DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models
Brian Nlong Zhao, Yuhang Xiao*, Jiashu Xu*, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet†, Yunhao Ge† (*=co-second author, †=equal contribution)
arXiv:2312.14216, 2023.
[paper] [code] [project page]

3D Copy-Paste: Physically-Plausible Object Insertion for Monocular 3D Detection
Yunhao Ge, Hong-Xing Yu, Cheng Zhao, Yuliang Guo, Xinyu Huang, Liu Ren, Laurent Itti, Jiajun Wu
NeurIPS 2023 (Advances in Neural Information Processing Systems).
[paper] [code] [project page]

DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection
Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation
Yunhao Ge*, Jiashu Xu*, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet (*=equal contribution)
arXiv:2206.09592, 2022.
[paper (Beyond Generation)] [paper (DALL-E for Detection)] [code]

CLR: Channel-wise Lightweight Reprogramming for Continual Learning
Yunhao Ge, Yuecheng Li*, Shuo Ni*, Jiaping Zhao, Ming-Hsuan Yang, Laurent Itti (*=equal contribution as second author)
ICCV 2023 (International Conference on Computer Vision).
[paper] [code] [SKILL-102 Dataset]

Lightweight Learner for Shared Knowledge Lifelong Learning
Yunhao Ge, Yuecheng Li*, Di Wu*, Ao Xu*, Adam M. Jones, Amanda Sofie Rios, Iordanis Fostiropoulos, Shixian Wen, Po-Hsuan Huang, Zachary William Murdock, Gozde Sahin, Shuo Ni, Kiran Lekkala, Sumedh Anand Sontakke, Laurent Itti (*=equal contribution as second author)
TMLR (Transactions on Machine Learning Research).
[paper] [code] [project page] [SKILL-102 Dataset] [USC Viterbi Press]

Building One-class Detector for Anything: Open-vocabulary Zero-shot OOD Detection Using Text-image Models
Yunhao Ge*, Jie Ren*, Jiaping Zhao, Kaifeng Chen, Andrew Gallagher, Laurent Itti, and Balaji Lakshminarayanan (*=equal contribution)
Knowledge and Logical Reasoning Workshop @ ICML 2023.
[paper]

Improving Zero-shot Generalization and Robustness of Multi-modal Models
Yunhao Ge*, Jie Ren*, Andrew Gallagher, Yuxiao Wang, Ming-Hsuan Yang, Hartwig Adam, Laurent Itti, Balaji Lakshminarayanan, and Jiaping Zhao (*=equal contribution)
CVPR 2023 (IEEE/CVF Conference on Computer Vision and Pattern Recognition).
[paper] [code] [project page]

Neural-Sim: Learning to Generate Training Data with NeRF
Yunhao Ge, Harkirat Behl*, Jiashu Xu*, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, and Vibhav Vineet (*=equal contribution as second author)
ECCV 2022 (European Conference on Computer Vision).

Contributions of Shape, Texture, and Color in Visual Recognition
Yunhao Ge*, Yao Xiao*, Zhi Xu, Xingrui Wang, Laurent Itti (*=equal contribution)
ECCV 2022 (European Conference on Computer Vision).

A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts
Yunhao Ge, Yao Xiao, Zhi Xu, Meng Zheng, Srikrishna Karanam, Terrence Chen, Laurent Itti, and Ziyan Wu
CVPR 2021 (IEEE/CVF Conference on Computer Vision and Pattern Recognition).
[paper] [github] [project page] [video] [知乎] [机器之心] [AI科技评论]

Zero-shot Synthesis with Group-Supervised Learning
Yunhao Ge, Sami Abu-El-Haija, Gan Xin, and Laurent Itti
ICLR 2021 (International Conference on Learning Representations).
[paper] [code] [project page] [Fonts Dataset] [USC Viterbi Press] [知乎] [AI科技评论]

Pose Augmentation: Class-agnostic Object Pose Transformation for Object Recognition
Yunhao Ge, Jiaping Zhao, Laurent Itti
ECCV 2020 (European Conference on Computer Vision).
[paper] [github] [video-1min] [video-10min]

Unpaired MR to CT Synthesis with Explicit Structural Constrained Adversarial Learning
Yunhao Ge*, Dongming Wei*, Zhong Xue, Yiqiang Zhan, Xiang Zhou, Qian Wang, and Shu Liao (*=equal contribution)
ISBI 2019 (IEEE International Symposium on Biomedical Imaging).

Unpaired Whole-body MR to CT Synthesis with Correlation Coefficient Constrained Adversarial Learning
Yunhao Ge, Zhong Xue, Yiqiang Zhan, Xiang Zhou, and Shu Liao
SPIE 2019 (SPIE-Medical Imaging). Oral Presentation

Last update: May 15, 2024