Ruohan Gao

Assistant Professor
Department of Computer Science, University of Maryland, College Park

Office: IRB-4248
Email: rhgao[AT]umd.edu

I am an assistant professor in the Department of Computer Science at the University of Maryland, College Park, where I lead the UMD Multisensory Machine Intelligence Lab. I am also affiliated with the University of Maryland Institute for Advanced Computer Studies (UMIACS), the Maryland Robotics Center (MRC), and the Artificial Intelligence Interdisciplinary Institute at Maryland (AIM).

I received my Ph.D. in Computer Science from The University of Texas at Austin, advised by Kristen Grauman, and then spent two years as a postdoc at the Stanford Vision and Learning Lab working with Fei-Fei Li, Jiajun Wu, and Silvio Savarese.

My research primarily focuses on computer vision and machine learning with a particular emphasis on multisensory machine intelligence involving sight, sound, and touch. The overarching goal of my research is to empower machines to emulate and enhance human capabilities in seeing, hearing, and feeling, ultimately enabling them to comprehensively perceive, understand, and interact with the multisensory world.

Prospective Students: I am always seeking self-motivated students to join my group. If you are interested, here is some more information.

Selected Publications [full list]

2025

  1. Differentiable Room Acoustic Rendering with Multi-View Vision Priors
    Derong Jin and Ruohan Gao
    arXiv, 2025
  2. Hearing Anywhere in Any Environment
    Xiulong Liu, Anurag Kumar, Paul Calamia, Sebastià V. Amengual Garí, Calvin Murdock, Ishwarya Ananthabhotla, Philip Robinson, Eli Shlizerman, Vamsi Krishna Ithapu, and Ruohan Gao
    Conference on Computer Vision and Pattern Recognition (CVPR), 2025
  3. Learning to Highlight Audio by Watching Movies
    Chao Huang, Ruohan Gao, J. M. F. Tsang, Jan Kurcius, Cagdas Bilen, Chenliang Xu, Anurag Kumar, and Sanjeel Parekh
    Conference on Computer Vision and Pattern Recognition (CVPR), 2025

2024

  1. Hearing Anything Anywhere
    Mason L. Wang*, Ryosuke Sawata*, Samuel Clarke, Ruohan Gao, Shangzhe Wu, and Jiajun Wu
    Conference on Computer Vision and Pattern Recognition (CVPR), 2024

2023

  1. The ObjectFolder Benchmark: Multisensory Object-Centric Learning with Neural and Real Objects
    Conference on Computer Vision and Pattern Recognition (CVPR), 2023
  2. RealImpact: A Dataset of Impact Sound Fields for Real Objects
    Samuel Clarke, Ruohan Gao, Mason Wang, Mark Rau, Julia Xu, Jui-Hsien Wang, Doug James, and Jiajun Wu
    Conference on Computer Vision and Pattern Recognition (CVPR), 2023

2022

  1. See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation
    Hao Li*, Yizhi Zhang*, Junzhe Zhu, Shaoxiong Wang, Michelle A. Lee, Huazhe Xu, Edward Adelson, Li Fei-Fei, Ruohan Gao†, and Jiajun Wu†
    Conference on Robot Learning (CoRL), 2022
  2. ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer
    Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  3. Visual Acoustic Matching
    Changan Chen, Ruohan Gao, Paul Calamia, and Kristen Grauman
    Conference on Computer Vision and Pattern Recognition (CVPR), 2022

2021

  1. Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video
    Rishabh Garg, Ruohan Gao, and Kristen Grauman
    British Machine Vision Conference (BMVC), 2021
  2. Look and Listen: From Semantic to Spatial Audio-Visual Perception
    Ruohan Gao
    Ph.D. Dissertation, 2021

2019

  1. 2.5D Visual Sound
    Ruohan Gao and Kristen Grauman
    Conference on Computer Vision and Pattern Recognition (CVPR), 2019

2018

  1. Learning to Separate Object Sounds by Watching Unlabeled Video
    Ruohan Gao, Rogerio Feris, and Kristen Grauman
    European Conference on Computer Vision (ECCV), 2018