Shaoting Zhu (朱少廷)

Hey, I'm Shaoting, a second-year PhD student at the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University, majoring in Computer Science. I am very fortunate to be advised by Prof. Hang Zhao in the MARS Lab. Before that, I received my bachelor's degree from Zhejiang University, where I was advised by Prof. Yong Liu in the APRIL Lab (CSE).

I like to make handicrafts and listen to music in my spare time. I also love outdoor activities, especially traveling and road trips.

Email  /  Scholar  /  Twitter  /  Github

profile photo

Recent News

  • We published and open-sourced Project Instinct! Check the project page for more details.
  • Two papers MoELoco and RoboEngine accepted by IROS 2025!
  • One paper VR-Robo accepted by RA-L 2025!
  • Two papers SARO and RRW accepted by ICRA 2025!
  • I'm currently pursuing my PhD at Tsinghua University! (since Fall 2024)

Research

My research interests include Robotics, Computer Vision, and Artificial Intelligence. (* indicates equal contribution)

TTT-Parkour: Rapid Test-Time Training for Perceptive Robot Parkour
Shaoting Zhu*, Baijun Ye*, Jiaxuan Wang†, Jiakang Chen†, Ziwen Zhuang, Linzhan Mou, Runhan Huang, Hang Zhao
arXiv, 2026
project page / video / arXiv

We propose a real-to-sim-to-real framework that leverages rapid test-time training (TTT) on novel terrains, significantly enhancing the robot's capability to traverse extremely difficult geometries.

Hiking in the Wild: A Scalable Perceptive Parkour Framework for Humanoids
Shaoting Zhu*, Ziwen Zhuang*, Mengjie Zhao, Kun-Ying Lee, Hang Zhao
arXiv, 2026
project page / video / arXiv / code

We present a scalable, end-to-end perceptive parkour framework designed for robust humanoid hiking. Our policy enables robust traversal of complex terrains at speeds up to 2.5 m/s.

Deep Whole-body Parkour
Ziwen Zhuang, Shaoting Zhu, Mengjie Zhao, Hang Zhao
arXiv, 2026
project page / video / arXiv / code

We present a framework where exteroceptive sensing is integrated into whole-body motion tracking, permitting a humanoid to perform highly dynamic, non-locomotion tasks on uneven terrain.

VR-Robo: A Real-to-Sim-to-Real Framework for Visual Robot Navigation and Locomotion
Shaoting Zhu*, Linzhan Mou*, Derun Li, Baijun Ye, Runhan Huang, Hang Zhao
RA-L, 2025
project page / video / arXiv / code

VR-Robo introduces a digital twin framework using 3D Gaussian Splatting for photorealistic simulation, enabling RGB-based sim-to-real transfer for robot navigation and locomotion.

MoELoco: Mixture of Experts for Multitask Locomotion
Runhan Huang*, Shaoting Zhu*, Yilun Du, Hang Zhao
IROS, 2025
project page / video / arXiv

MoELoco introduces a multitask locomotion framework that employs a mixture-of-experts strategy to enhance reinforcement learning across diverse tasks while leveraging compositionality to generate new skills.

RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation
Chengbo Yuan*, Suraj Joshi*, Shaoting Zhu*, Hang Su, Hang Zhao, Yang Gao
IROS, 2025
project page / video / arXiv / code

RoboEngine is the first plug-and-play visual robot data augmentation toolkit. Users can effortlessly generate physics-aware robot scenes with a few lines of code. This enables training in only one scene while generalizing visually to almost arbitrary scenes.

Robust Robot Walker: Learning Agile Locomotion over Tiny Traps
Shaoting Zhu, Runhan Huang, Linzhan Mou, Hang Zhao
ICRA, 2025
project page / video / arXiv / code

We propose a proprioception-only, two-stage training framework with goal command and a dedicated tiny trap benchmark, enabling quadruped robots to robustly traverse small obstacles.

SARO: Space-Aware Robot System for Terrain Crossing via Vision-Language Model
Shaoting Zhu*, Derun Li*, Linzhan Mou, Yong Liu, Ningyi Xu, Hang Zhao
ICRA, 2025
project page / video / arXiv

SARO is an innovative system composed of a high-level reasoning module, a closed-loop sub-task execution module, and a low-level control policy. It enables the robot to navigate across 3D terrains and reach the goal position.

T5-ARC: Test-Time Training for Transductive Transformer Models in ARC-AGI Challenge
Shaoting Zhu*, Shuangyue Geng*, Un Lok Chen*
Course Project, Advanced Machine Learning by Jie Tang
paper

We focus on Test-Time Training (TTT) for transductive models and develop our pipeline following SOTA methods, consisting of three steps: Base Model Training, TTT, and Active Inference.

UniFace++: Revisiting a Unified Framework for Face Reenactment and Swapping via 3D Priors
Chao Xu, Yijie Qian, Shaoting Zhu, Baigui Sun, Jian Zhao, Yong Liu, Xuelong Li
IJCV, 2025
paper

UniFace++ combines the advantages of each, i.e., the stability of reconstruction training from reenactment and the simplicity and effectiveness of target-oriented processing from swapping, redefining both as target-oriented reconstruction tasks.

Multimodal-driven talking face generation via a unified diffusion-based generator
Chao Xu, Shaoting Zhu, Junwei Zhu, Tianxin Huang, Jiangning Zhang, Ying Tai, Yong Liu
arXiv, 2023
paper

Given a textured face as the source and the rendered face projected from the desired 3DMM coefficients as the target, our proposed Texture-Geometry-aware Diffusion Model decomposes the complex transfer problem into a multi-conditional denoising process.

Miscellanea

Academic Service

Reviewer, IROS 2025
Reviewer, RA-M
Reviewer, RA-L

Teaching

Teaching Assistant, Advances in Autonomous Driving and Intelligent Vehicles, Fall 2024

Education

2020-2024: B.S., Zhejiang University, Hangzhou, China.
Honored with the Chu Kochen Scholarship in 2023.
2024-Now: Ph.D., Tsinghua University, Beijing, China.

Template borrowed from jonbarron.