Yinda Zhang

I am a research scientist and manager at Google. My research interests lie at the intersection of computer vision, computer graphics, and machine learning. I have worked on a broad range of topics, including 3D reconstruction, scene understanding, neural rendering, digital humans, and large generative models. Recently, I have been particularly interested in infusing human-centric and 3D knowledge into large generative models for consistent, realistic, and immersive content generation. At Google, I lead a team delivering cutting-edge immersive communication and perception technologies.

I received my Ph.D. in Computer Science from Princeton University, advised by Prof. Thomas Funkhouser. Before that, I received a Bachelor's degree from the Department of Automation at Tsinghua University, and a Master's degree from the Department of ECE at the National University of Singapore, co-supervised by Prof. Ping Tan and Prof. Shuicheng Yan.

Intern and full-time opportunities are available! Feel free to reach out!

Email:	yindanospamz (at) gmail (dot) com
Find me on:	Google Scholar, GitHub, LinkedIn


2025.12	Likeness, Google's first photorealistic avatar solution, is launched on Android XR.
2025.11	Eight papers are accepted by CVPR, ICCV, TVCG, TPAMI, AAAI, and ICLR in 2025.
2024.12.31	Six papers are accepted by CVPR, ECCV, and SIGGRAPH in 2024.
2024.12.5	Thrilled to see the public launch of Android XR (blog post), the platform on which we have been tirelessly building cutting-edge perception features over the past years.
2024.10.1	TC-GEN, a generative model for hurricanes, is accepted by a top-tier journal in earth science (JAMES). So happy to see the success of bringing computer vision to environmental science. My first publication in this domain :)
2023.08	One paper is accepted by SIGGRAPH Asia.
2023.08	One paper is accepted by TPAMI.
2023.07	Five papers are accepted by ICCV2023.
2023.04	One paper is accepted by CHI 2023.
2023.02	Four papers accepted by CVPR2023.
2022.07	Four papers accepted by ECCV2022.
2022.05	Debut of AR technology from our team at Google I/O 2022.
2022.05	We released the Portrait Depth API and a 3D Photo live demo at Google I/O 2022. See more in the TensorFlow blog post.
2022.04	Two papers are accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
2022.03	Two papers are accepted by ACM SIGGRAPH 2022.
2022.03	Two papers are accepted by CVPR2022.
2021.12	I am thrilled to announce the arrival of a wonderful baby boy to our home.
2021.11	One paper is accepted by AAAI2022.
2021.07	Five papers are accepted by ICCV2021.
2021.03	Five papers (2 Orals + 3 Posters) are accepted by CVPR2021.
2020.07	Three papers (2 Orals + 1 Poster) are accepted by ECCV2020.
2020.04	Our deep learning depth refinement solution has been integrated into the Pixel4 front-facing camera to support computational photography and AR applications. Learn more here: Google AI blogpost.
2020.02	Pixel2Mesh is accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence.
2020.02	Four papers are accepted by CVPR2020.
2019.12	Our deep-learning-based solution for image-based depth estimation has been deployed on Google Pixel4 for Portrait Mode. Check this Google AI blogpost for more details.
2019.07	Pixel2Mesh++ is accepted by ICCV2019. Check here for the paper.
2019.03	DeepLidar paper is accepted by CVPR2019.
2018.12	I started working at Google as a Research Scientist.
2018.11	I obtained my Ph.D. degree from Princeton University.
2018.11	I am named a Siebel Scholar, Class of 2019.
2018.07	ActiveStereoNet is accepted as an oral presentation by ECCV 2018.
2018.07	Pixel2Mesh is accepted by ECCV 2018.

Sensible Agent: A Framework for Unobtrusive Interaction with Proactive AR Agent

G. Lee, M. Xia, N. Numan, X. Qian, D. Li, Y. Chen, A. Kulshrestha, I. Chatterjee, Y. Zhang, D. Manocha, D. Kim

ACM Symposium on User Interface Software and Technology (UIST 2025)

[Paper] [Project Webpage] [Video]

InstructPipe: Building Visual Programming Pipelines in Visual Blocks with Human Instructions Using LLMs

Z. Zhou, J. Jin, V. Phadnis, X. Yuan, J. Jiang, X. Qian, J. Zhou, Y. Huang, Z. Xu, Y. Zhang, K. Wright, J. Mayes, M. Sherwood, J. Lee, A. Olwal, D. Kim, R. Iyengar, N. Li

ACM CHI Conference on Human Factors in Computing Systems (CHI 2025), Honorable Mention Award

[Paper] [Project Webpage] [Demo]

EVER: Exact Volumetric Ellipsoid Rendering for Real-time View Synthesis

A. Mai, P. Hedman, G. Kopanas, D. Verbin, D. Futschik, Q. Xu, F. Kuester, J. Barron, Yinda Zhang

International Conference on Computer Vision (ICCV 2025, Oral)

[Paper] [Project Webpage] [Codes]

IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular Videos

Y. Li*, Z. Bai*, F. Tan, Z. Cui, S. Fanello, Yinda Zhang

Computer Vision and Pattern Recognition (CVPR 2025)

[Paper] [Project Webpage]

SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix

P. Dai, F. Tan, Q. Xu, D. Futschik, R. Du, S. Fanello, X. Qi, Y. Zhang

The International Conference on Learning Representations (ICLR 2025)

[Paper] [Project Webpage]

GO-NeRF: Generating Objects in Neural Radiance Fields for Virtual Reality Content Creation

P. Dai, F. Tan, X. Yu, Y. Peng, Y. Zhang, X. Qi

IEEE Transactions on Visualization and Computer Graphics (TVCG)

[Paper] [Project Webpage]

TC‐GEN: Data‐Driven Tropical Cyclone Downscaling Using Machine Learning‐Based High‐Resolution Weather Model

R. Jing, J. Gao, Y. Cai, D. Xi, Y. Zhang, Y. Fu, K. Emanuel, N. Diffenbaugh, E. Bendavid

Journal of Advances in Modeling Earth Systems (JAMES)

[Paper]

DiffGrasp: Whole-Body Grasping Synthesis Guided by Object Motion Using a Diffusion Model

Y. Zhang, Q. He, Y. Wan, Y. Zhang, X. Deng, C. Ma, H. Wang

AAAI Conference on Artificial Intelligence (AAAI2025)

HOGSA: Bimanual Hand-Object Interaction Understanding with 3D Gaussian Splatting Based Data Augmentation

W. Qu, J. Li, J. Cheng, J. Shi, C. Meng, C. Ma, H. Wang, X. Deng, Y. Zhang

AAAI Conference on Artificial Intelligence (AAAI2025)

EgoAvatar: Egocentric View-Driven and Photorealistic Full-body Avatars

J. Chen, J. Wang, Y. Zhang, R. Pandey, T. Beeler, M. Habermann, C. Theobalt

ACM SIGGRAPH ASIA 2024

[Paper] [Project Webpage]

Dynamic 3D Gaussian Prediction for Motion Extrapolation and Free View Synthesis

B. Zhao*, Y. Li*, Z. Sun, L. Zeng, Y. Shen, R. Ma, Y. Zhang, H. Bao, Z. Cui

ACM SIGGRAPH 2024

[Paper] [Project Webpage]

Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing

Y. Lan, F. Tan, D. Qiu, Q. Xu, K. Genova, Z. Huang, S. Fanello, R. Pandey, T. Funkhouser, C. Loy, Y. Zhang

European Conference on Computer Vision (ECCV 2024)

[Paper] [Project Webpage]

MVDD: Multi-View Depth Diffusion Models

Z. Wang, Q. Xu, F. Tan, M. Chai, S. Liu, R. Pandey, S. Fanello, A. Kadambi, Y. Zhang

European Conference on Computer Vision (ECCV 2024)

[Paper] [Project Webpage]

GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image

C. Bao*, Y. Zhang*, Y. Li*, X. Zhang, B. Yang, H. Bao, M. Pollefeys, G. Zhang, Z. Cui

Computer Vision and Pattern Recognition (CVPR 2024)

[Paper] [Project Webpage] [Codes]

MonoAvatar++: Efficient 3D Implicit Head Avatar with Mesh-anchored Hash Table Blendshapes

Z. Bai, F. Tan, S. Fanello, R. Pandey, M. Dou, S. Liu, P. Tan, Y. Zhang

Computer Vision and Pattern Recognition (CVPR 2024)

[Paper] [Project Webpage]

LitNeRF: Intrinsic Radiance Decomposition for High-Quality View Synthesis and Relighting of Faces

K. Sarkar, M. Bühler, G. Li, D. Wang, D. Vicini, J. Riviere, Y. Zhang, S. Orts-Escolano, P. Gotardo, T. Beeler, A. Meka

ACM SIGGRAPH ASIA 2023

[Paper] [Project Webpage]

DeepSFM: Robust Deep Iterative Refinement for Structure From Motion

X. Wei, Y. Zhang, X. Ren, Z. Li, Y. Fu, X. Xue

IEEE Transactions on Pattern Analysis and Machine Intelligence, 10.1109/TPAMI.2023.3307567

[Paper] [Project Webpage]

Spectral Graphormer: Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images

THE. Tse, F. Mueller, Z. Shen, D. Tang, T. Beeler, M. Dou, Y. Zhang, S. Petrovic, HJ. Chang, J. Taylor, B. Doosti