
Ruohan Gao
Assistant Professor
Department of Computer Science, University of Maryland, College Park
Office: IRB-4248
Email: rhgao[AT]umd.edu
I am an assistant professor in the Department of Computer Science at the University of Maryland, College Park, where I lead the UMD Multisensory Machine Intelligence Lab. I am also affiliated with the University of Maryland Institute for Advanced Computer Studies (UMIACS), the Maryland Robotics Center (MRC), and the Artificial Intelligence Interdisciplinary Institute at Maryland (AIM).
I received my Ph.D. in Computer Science from The University of Texas at Austin, advised by Kristen Grauman, and then spent two years as a postdoc at the Stanford Vision and Learning Lab working with Fei-Fei Li, Jiajun Wu, and Silvio Savarese.
My research primarily focuses on computer vision and machine learning with a particular emphasis on multisensory machine intelligence involving sight, sound, and touch. The overarching goal of my research is to empower machines to emulate and enhance human capabilities in seeing, hearing, and feeling, ultimately enabling them to comprehensively perceive, understand, and interact with the multisensory world.
Prospective Students: I am always seeking self-motivated students to join my group. If you are interested, here is some more information.
Selected Publications [full list]
2025
- Differentiable Room Acoustic Rendering with Multi-View Vision Priors. arXiv, 2025
- Learning to Highlight Audio by Watching Movies. Conference on Computer Vision and Pattern Recognition (CVPR), 2025
2024
2023
- The ObjectFolder Benchmark: Multisensory Object-Centric Learning with Neural and Real Objects. Conference on Computer Vision and Pattern Recognition (CVPR), 2023
2022
- See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation. Conference on Robot Learning (CoRL), 2022
- ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer. Conference on Computer Vision and Pattern Recognition (CVPR), 2022
- Visual Acoustic Matching. Conference on Computer Vision and Pattern Recognition (CVPR), 2022
2021
- Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video. British Machine Vision Conference (BMVC), 2021
- Look and Listen: From Semantic to Spatial Audio-Visual Perception. Ph.D. Dissertation, 2021
2019
- 2.5D Visual Sound. Conference on Computer Vision and Pattern Recognition (CVPR), 2019