
Ruohan Gao
Assistant Professor
Department of Computer Science, University of Maryland, College Park
Office: IRB-4248
Email: rhgao[AT]umd.edu
I am an assistant professor in the Department of Computer Science at the University of Maryland, College Park, where I lead the UMD Multisensory Machine Intelligence Lab. I am also affiliated with the University of Maryland Institute for Advanced Computer Studies (UMIACS), the Maryland Robotics Center (MRC), and the Artificial Intelligence Interdisciplinary Institute at Maryland (AIM).
I received my Ph.D. in Computer Science from The University of Texas at Austin, advised by Kristen Grauman, and then spent two years as a postdoc at the Stanford Vision and Learning Lab working with Fei-Fei Li, Jiajun Wu, and Silvio Savarese.
My research primarily focuses on computer vision and machine learning with a particular emphasis on multisensory machine intelligence involving sight, sound, and touch. The overarching goal of my research is to empower machines to emulate and enhance human capabilities in seeing, hearing, and feeling, ultimately enabling them to comprehensively perceive, understand, and interact with the multisensory world.
Prospective Students: I am always seeking self-motivated students to join my group. If you are interested, here is some more information.
News
Selected for AAAI New Faculty Highlights 2025.
I will be joining the Department of Computer Science at the University of Maryland, College Park (UMD) as an Assistant Professor in late 2024.
I serve as an Area Chair for ICCV 2023 and 3DV 2025, and as an SPC for AAAI 2023, 2024, and 2025.
We are organizing the Sight and Sound Workshop at CVPR 2024.
We are organizing the AV4D Workshop at ICCV 2023.
We are organizing the Creative AI Across Modalities Workshop at AAAI 2023.
We are organizing the Embodied Multimodal Learning Workshop at ICLR 2021.
I am very honored to have received the Michael H. Granof Award that recognizes UT Austin’s Top 1 Doctoral Dissertation of 2021.
Selected Publications [full list]
2025
- Hearing Anywhere in Any Environment. Conference on Computer Vision and Pattern Recognition (CVPR), 2025
- Learning to Highlight Audio by Watching Movies. Conference on Computer Vision and Pattern Recognition (CVPR), 2025
2024
2023
- The ObjectFolder Benchmark: Multisensory Object-Centric Learning with Neural and Real Objects. Conference on Computer Vision and Pattern Recognition (CVPR), 2023
2022
- See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation. Conference on Robot Learning (CoRL), 2022
- ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer. Conference on Computer Vision and Pattern Recognition (CVPR), 2022
- Visual Acoustic Matching. Conference on Computer Vision and Pattern Recognition (CVPR), 2022
2021
- Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video. British Machine Vision Conference (BMVC), 2021
- Look and Listen: From Semantic to Spatial Audio-Visual Perception. Ph.D. Dissertation, 2021
2019
- 2.5D Visual Sound. Conference on Computer Vision and Pattern Recognition (CVPR), 2019