
Zhehui Huang

I am a fourth-year Computer Science Ph.D. student at the University of Southern California (USC), advised by Prof. Gaurav Sukhatme.

Previously, I received a master's degree in Computer Science from USC and completed my bachelor's degree in Computer Science at Harbin Institute of Technology.

Email  |  πŸ“„ CV  |  πŸ“ Bio  |  πŸŽ“ Google Scholar
Twitter  |  Github  |  Linkedin  |  Blog


News

[July 2025] DSBench was selected as an evaluation benchmark for OpenAI's most advanced model, o3, and their first agent, ChatGPT Agent, to evaluate their reasoning and coding abilities.
[June 2025] Co-organized the Resource Constrained Robotics Workshop at RSS 2025.
[Jan 2025] DSBench was accepted to ICLR 2025.
[Dec 2024] HRT-ML was accepted to WMAC @ AAAI 2025.
[Dec 2024] MonTA was accepted to LM4Plan @ AAAI 2025.
[Sept 2024] LLMs for Robot Routing was accepted to ISRR 2024.
[May 2024] Started internship at Tencent America.
[Mar 2024] Received AI Research Grant from Cohere.
[Jan 2024] Two papers were accepted to ICRA 2024: [Paper #1] and [Paper #2]
[Nov 2023] Gave a talk at the USC Robotics Seminar (URoS).
[Apr 2023] QuadSwarm was accepted to the ICRA 2023 Workshop: The Role of Robotics Simulators for Unmanned Aerial Vehicles.
[Mar 2023] Passed my qualifying exam.
[Dec 2022] Received $70,000 in AWS cloud credits for research.
[May 2022] Started internship at NVIDIA.
[Sept 2021] Decentralized Control of Quadrotor Swarms was accepted to CoRL 2021.
[Sept 2021] Received $43,000 in AWS cloud credits for research.
[Aug 2021] Started my Ph.D. at USC.

Research

My research aims to develop intelligent agents that can robustly and safely perform complex tasks in unstructured environments by autonomously adapting to new situations through unsupervised and continual learning. To achieve this, my research spans three interconnected areas: (1) reinforcement learning, (2) robot learning, and (3) foundation models.

Reinforcement Learning (RL):

Robot Learning:

Foundation Models:

Publications

Compositional Coordination for Multi-Robot Teams with Large Language Models

In submission

TL;DR: A novel framework for multi-robot coordination using large language models to enable compositional coordination strategies for complex multi-robot tasks.

Guaranteed Trust Region Optimization via Two-Phase KL Penalization

TL;DR: FixPO combines the guarantees of trust region methods with the computational efficiency of proximal methods, enforcing trust regions via flexible KL penalization.

Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning

TL;DR: The fastest open-source single-machine RL implementation, reaching 100,000 FPS to significantly accelerate experimentation and improve performance through massive sample collection.

DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

TL;DR: Comprehensive benchmark for evaluating data science agents with realistic tasks, bridging the gap between simplified settings and real-world data science applications.

πŸ† Selected as evaluation benchmark for OpenAI's o3 model and ChatGPT Agent. Source: OpenAI Blog

Effect of Adaptive Communication Support on Human-AI Collaboration

TL;DR: Investigates how agent feedback frequency affects team performance in human-AI collaborative scenarios, with a focus on optimizing communication support.

Benchmarking Real-time Adaptation and Communication Capabilities of Embodied Agents in Collaborative Scenarios

TL;DR: Focuses on leveraging LLMs to enable agents with real-time adaptation capabilities in collaborative scenarios, establishing new benchmarks for embodied AI.



Modified version of template from this and this.