Hanchen Cui

I am a Ph.D. student in computer science at the University of Minnesota Twin Cities, working with Prof. Karthik Desingh. I received my master's and bachelor's degrees from Shanghai Jiao Tong University and Xidian University, respectively. Previously, I was a researcher at Shanghai Qi Zhi Institute, where I worked closely with Prof. Yang Gao at IIIS, Tsinghua University. Afterwards, I had a short visit at NUS, collaborating with Prof. Lin Shao.

My research interest lies in robot learning, and my long-term goal is to build a generalist robot that can accomplish long-horizon, complex tasks. More specifically, I work on vision-language-action models, world models, representation learning, and legged locomotion.

Email: hanchen.cui147[at]gmail.com

I am actively seeking internship opportunities for Summer 2025. If you're interested in collaborating, feel free to reach out!

Email  /  Scholar


Publications

(representative papers are highlighted)

Fine-Tuning Hard-to-Simulate Objectives for Quadruped Locomotion: A Case Study on Total Power Saving
Ruiqian Nai, Jiacheng You, Liu Cao, Hanchen Cui, Shiyuan Zhang, Huazhe Xu, Yang Gao
ICRA 2025
project page / pdf

We propose a data-driven framework for fine-tuning locomotion policies, targeting these hard-to-simulate objectives. Our framework leverages real-world data to model these objectives and incorporates the learned model into simulation for policy improvement.

A Universal World Model Learned from Large Scale and Diverse Videos
Hanchen Cui, Yang Gao
NeurIPS 2023, Foundation Models for Decision Making Workshop
pdf / poster

We propose a generalizable world model pre-trained on a large-scale and diverse video dataset, which is then fine-tuned to obtain an accurate dynamics function in a sample-efficient manner.

A Universal Semantic-Geometric Representation for Robotic Manipulation
Tong Zhang*, Yingdong Hu*, Hanchen Cui, Hang Zhao, Yang Gao
CoRL 2023
project page / arXiv / code

We present Semantic-Geometric Representation (SGR), a universal perception module for robotics that leverages the rich semantic information of large-scale pre-trained 2D models and inherits the merits of 3D spatial reasoning.

Research projects

A high-speed vision-language intelligent manipulation system

(1) Build a high-speed vision-language manipulation deployment system on a Franka Panda robot to accomplish a variety of tasks.

(2) Enable robust remote control with low latency.

(3) Fine-tune the vision-language manipulation model (CLIPORT), achieving a 90% success rate in real-world experiments.

Legged robots learn from the physical world

(1) Establish a sim2real locomotion deployment codebase for legged robots.

(2) Build a real-world learning framework that collects data in the real world and trains the policy model on a remote server.

(3) Design and deploy a simple yet effective reward function, where all the reward terms are acquired from the robot itself without extra sensors.

Visual planning for legged robots

(1) Leverage the strong image understanding and high-level planning abilities of large multimodal models (such as GPT-4 Vision) to enable legged robots to accomplish multi-step, complex tasks, such as delivering packages and going out to find an object.

Foundation policy model for robotic manipulation (ongoing)

(1) Build a foundation policy model from a large-scale expert dataset such as Open X-Embodiment.

(2) Extract and label action sequences using large vision-language models.

(3) Train and run inference on action-aware sequence data instead of an MDP formulation.


Modified version of the template from here.