I am a fourth-year Computer Science Ph.D. student at the University of Southern California (USC), advised by Prof. Gaurav Sukhatme.
Previously, I received a master's degree in Computer Science from USC and completed my bachelor's degree in Computer Science at Harbin Institute of Technology.
DSBench was selected as an evaluation benchmark for OpenAI's advanced reasoning model o3 and their first agent product, ChatGPT Agent, to evaluate their reasoning and coding abilities.
My research aims to develop intelligent agents that can robustly and safely perform complex tasks in unstructured environments by autonomously adapting to new situations through unsupervised and continual learning.
To achieve this, my research spans three interconnected areas: (1) reinforcement learning, (2) robot learning, and (3) foundation models.
TL;DR: A novel framework for multi-robot coordination using large language models to enable compositional coordination strategies for complex multi-robot tasks.
TL;DR: FixPO combines the guarantees of trust region methods with the computational efficiency of proximal methods, enforcing trust regions via flexible KL penalization.
TL;DR: Comprehensive benchmark for evaluating data science agents with realistic tasks, bridging the gap between simplified settings and real-world data science applications.
★ Selected as an evaluation benchmark for OpenAI's o3 model and ChatGPT Agent. Source: OpenAI Blog
TL;DR: Investigates how agent feedback frequency affects team performance in human-AI collaborative scenarios, focusing on communication support optimization.
TL;DR: Focuses on leveraging LLMs to enable agents with real-time adaptation capabilities in collaborative scenarios, establishing new benchmarks for embodied AI.
TL;DR: Pioneering work on decentralized quadrotor swarm control using deep RL with successful sim-to-real transfer, including the open-source quad-swarm-rl simulator.