PhD (CUHK), MS (UCAS), BEng (Xidian)
Senior researcher
Huawei Noah's Ark Lab
Email: zhensongzhang[at]hotmail[dot]com
I joined Huawei Noah's Ark Lab after obtaining my Ph.D. degree from The Chinese University of Hong Kong in 2018. Before that, I received a BEng degree from Xidian University in 2011 and an M.S. degree from the University of Chinese Academy of Sciences in 2014. I am currently working on Visual Language Model/AIGC-based image editing.
We are looking for self-motivated interns and full-time researchers. If you are interested in working on cool MLLM/AIGC projects, you are welcome to join us; please drop me an email.
[07/2025] One paper is accepted to ICCV 2025, congrats to all coauthors!
[06/2025] We won the second place award in HD-EPIC VQA Challenges 2025, congrats to all coauthors!
[04/2025] One paper is accepted to IJCAI 2025, congrats to all coauthors!
[03/2025] One paper is accepted to CVPR 2025, congrats to all coauthors!
[01/2025] One paper is accepted to ICASSP 2025, congrats to all coauthors!
[09/2024] One paper is accepted to NeurIPS 2024, congrats to all coauthors!
[02/2024] Three papers are accepted to CVPR 2024, congrats to all coauthors!
[12/2023] One paper is accepted to ICASSP 2024.
[10/2023] We won the Reproducibility Award in GENEA Challenge 2023.
[09/2023] Our work on robust monocular depth estimation is accepted to IJCV.
[07/2023] Our UnifiedGesture is accepted to ACM MM 2023.
[05/2023] Our DiffuseStyleGesture is accepted to IJCAI 2023.
[05/2023] Our QPGesture is accepted to CVPR 2023 as a highlight.
[11/2022] Our sign language avatar appeared at HDC 2022 / HC 2022 and helped translate Chinese keynotes into CSL. Coverage: 量子位 (QbitAI), 华为人 (Huawei People), 华为开发者联盟服务 (Huawei Developer Alliance Service).
[10/2022] Our joint team Megatron_RVC won the RVC 2022 single image depth prediction challenge. [news]
[08/2022] Our human pose and shape estimation paper CLIFF is accepted to ECCV 2022 as an oral presentation. [news]
[Preprint] ViDAR: Video Diffusion-Aware 4D Reconstruction From Monocular Inputs
Michal Nazarczuk, Sibi Catley-Chandar, Thomas Tanay, Zhensong Zhang, Gregory Slabaugh, Eduardo Pérez-Pellitero
[Preprint] GASPACHO: Gaussian Splatting for Controllable Humans and Objects
Aymen Mir, Arthur Moreau, Helisa Dhamo, Zhensong Zhang, Eduardo Pérez-Pellitero
[Preprint] Better Together: Unified Motion Capture and 3D Avatar Reconstruction
Arthur Moreau, Mohammed Brahimi, Richard Shaw, Athanasios Papaioannou, Thomas Tanay, Zhensong Zhang, Eduardo Pérez-Pellitero
[Preprint] Human Motion Video Generation: A Survey
Haiwei Xue, Xiangyang Luo, Zhanghao Hu, Xin Zhang, Xunzhi Xiang, Yuqin Dai, Jianzhuang Liu, Zhensong Zhang, Minglei Li, Jian Yang, Fei Ma, Zhiyong Wu, Changpeng Yang, Zonghong Dai, Fei Richard Yu
[Website]
[ICCV] Frequency-Guided Diffusion for Training-Free Text-Driven Image Translation
Zheng Gao, Jifei Song, Zhensong Zhang, Jiankang Deng, Ioannis Patras
In ICCV 2025.
[CVPR] CaricatureBooth: Data-Free Interactive Caricature Generation in a Photo Booth
Zhiyu Qu, Yunqi Miao, Zhensong Zhang, Jifei Song, Jiankang Deng, Yi-Zhe Song
In CVPR 2025.
[IJCAI] VideoHumanMIB: Unlocking Appearance Decoupling for Video Human Motion In-betweening
Haiwei Xue, Zhensong Zhang, Minglei Li, Zonghong Dai, Fei Yu, Fei Ma, Zhiyong Wu
In IJCAI 2025.
[ICASSP] Identity-Preserving Audio-Driven Holistic Human Motion Video Generation
Haiwei Xue, Zhensong Zhang, Minglei Li, Zonghong Dai
In ICASSP 2025.
[NeurIPS] SCRREAM: SCan, Register, REnder And Map: A Framework for Annotating Accurate and Dense 3D Indoor Scenes with a Benchmark
HyunJun Jung, Weihang Li, Shun-Cheng Wu, William Bittner, Nikolas Brasch, Jifei Song, Eduardo Pérez-Pellitero, Zhensong Zhang, Arthur Moreau, Nassir Navab, Benjamin Busam
In NeurIPS 2024.
[Dataset]
[CVPR] Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model
Xu He, Qiaochu Huang, Zhensong Zhang, Zhiwei Lin, Zhiyong Wu, Sicheng Yang, Minglei Li, Zhiyi Chen, Songcen Xu, Xiaofei Wu
In CVPR 2024.
[Code]
[CVPR] Semantics-aware Motion Retargeting with Vision-Language Models
Haodong Zhang, Zhike Chen, Haocheng Xu, Lei Hao, Xiaofei Wu, Songcen Xu, Zhensong Zhang, Yue Wang, Rong Xiong
In CVPR 2024.
[Code]
[CVPR] Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning
Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Haoze Sun, Xueyi Zou, Youliang Yan, Zhensong Zhang, Lei Zhu
In CVPR 2024.
[Website]
[ICASSP] Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models
Haiwei Xue, Sicheng Yang, Zhensong Zhang, Zhiyong Wu, Minglei Li, Zonghong Dai, Helen Meng
In ICASSP 2024.
[IJCV] Towards A Unified Network for Robust Monocular Depth Estimation: Network Architecture, Training Strategy and Dataset
Mochu Xiang, Yuchao Dai, Feiyu Zhang, Jiawei Shi, Xinyu Tian, Zhensong Zhang
In IJCV.
[ICMI] The DiffuseStyleGesture+ Entry to the GENEA Challenge 2023
Sicheng Yang, Haiwei Xue, Zhensong Zhang, Minglei Li, Zhiyong Wu, Xiaofei Wu, Songcen Xu, Zonghong Dai
In ICMI 2023.
[Code]
[Reproducibility Award]
[ACM MM] UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons
Sicheng Yang, Zilin Wang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Qiaochu Huang, Lei Hao, Songcen Xu, Xiaofei Wu, Changpeng Yang, Zonghong Dai
In ACM MM 2023.
[Code]
[Demo]
[Presentation]
[IJCAI] DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models
Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Ming Cheng, Long Xiao
In IJCAI 2023.
[Code]
[Demo]
[CVPR Highlight] QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation
Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Haolin Zhuang
In CVPR 2023.
[Code]
[Demo]
[Presentation]
[ECCV Oral] CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation
Zhihao Li, Jianzhuang Liu, Zhensong Zhang, Songcen Xu, Youliang Yan
In ECCV 2022.
[Code]
[TIP] Multi-View Video Synopsis via Simultaneous Object-Shifting and View-Switching Optimization
Zhensong Zhang, Yongwei Nie, Hanqiu Sun, Qing Zhang, Qiuxia Lai, Guiqing Li, Mingyu Xiao
In IEEE Transactions on Image Processing, 2020.
[TIP] Video Synopsis Incorporating Object Speed and Size Changes
Yongwei Nie, Zhenkai Li, Zhensong Zhang, Qing Zhang, Tiezheng Ma, Hanqiu Sun
In IEEE Transactions on Image Processing, 2020.
[TVCG] Effective Video Stabilization via Joint Trajectory Smoothing and Frame Warping
Tiezheng Ma, Yongwei Nie, Qing Zhang, Zhensong Zhang, Hanqiu Sun, Guiqing Li
In IEEE Transactions on Visualization and Computer Graphics, 2020.
[TCSVT] Interactive Contour Extraction via Sketch-Alike Dense-Validation Optimization
Yongwei Nie, Xu Cao, Ping Li, Qing Zhang, Zhensong Zhang, Guiqing Li, Hanqiu Sun
In IEEE Transactions on Circuits and Systems for Video Technology, 2020.
[PR] Video super-resolution via pre-frame constrained and deep-feature enhanced sparse reconstruction
Qiuxia Lai, Yongwei Nie, Hanqiu Sun, Qiang Xu, Zhensong Zhang, Mingyu Xiao
In Pattern Recognition, 2020.
[TIP] Dynamic Video Stitching via Shakiness Removing
Yongwei Nie, Tan Su, Zhensong Zhang, Hanqiu Sun and Guiqing Li
In IEEE Transactions on Image Processing, 2018.
[Code] [Demo]
[SIGGRAPH Asia] Multi-video Object Synopsis Integrating Optimal View Switching
Zhensong Zhang, Yongwei Nie, Hanqiu Sun, Qiuxia Lai and Guiqing Li
In SIGGRAPH Asia 2017 Technical Briefs, Bangkok, Thailand, 2017.
[TVCG] Homography Propagation and Optimization for Wide-Baseline Street Image Interpolation
Yongwei Nie, Zhensong Zhang, Hanqiu Sun, Tan Su and Guiqing Li
In IEEE Transactions on Visualization and Computer Graphics, 2017.
[Code] [Demo]
[SIGGRAPH Asia] Video Stitching for Handheld Inputs via Combined Video Stabilization
Tan Su, Yongwei Nie, Zhensong Zhang, Hanqiu Sun and Guiqing Li
In SIGGRAPH Asia 2016 Technical Briefs, Macau, China, 2016.
Winner, ECCV 2022 RVC monocular depth prediction challenge
Paper Lists
Computer Graphics Papers, ECCV Papers, CVPR/ICCV Papers, NeurIPS Papers