Hang Yu | Tongji University

About Me

I am a first-year master’s student at Tongji University, supervised by Prof. Junqiao Zhao. I am interested in reinforcement learning, LLM reasoning, and embodied intelligence, with a focus on improving agent generalization in dynamic environments. Ultimately, I aim to build generalizable action models that enable intelligent and versatile robot behaviors.

Research Interests

Reinforcement Learning: model-based RL, offline RL
Large Language Models (LLMs): reasoning, RLHF
Embodied AI: Visual Prompt, DAgger

Experience

Spirit AI Research Intern 「Aug. 2025 – Present」
- Supervised by Prof. Yang Gao and Junyuan Xie
- Involved in algorithm design and optimization for Vision-Language-Action (VLA) models, with deployment on real-world robotic systems.
- Currently working on Visual Prompt and DAgger research.
Graz University of Technology Exchange Researcher (Austria) 「Mar. 2025 – Jul. 2024」
- Funded by Erasmus+; worked with Prof. Eduardo Veas on RL applications.
TAL MathGPT Group Research Intern 「Mar. 2023 – Jul. 2023」
- Contributed to the RLHF pipeline design, including reward modeling and PPO/DPO alignment, for TAL’s MathGPT, a math-focused large model used in education scenarios.
Master’s Degree 「Sep. 2024 – Present」
- Dept. of Computer Science and Technology, Tongji University
Bachelor’s Degree 「Sep. 2020 – Jul. 2024」
- Dept. of Computer Science and Technology, Tongji University
- Overall Ranking: No. 1/113 in the college
- Outstanding Graduate of Shanghai 2024
- First Prize, Tongji University Excellent Student Scholarship (2020, 2021, 2022, 2023)
- First Prize in the National Finals of the 17th Undergraduate Intelligent Vehicle Competition

Publications

ASTRO: Adaptive Stitching via Dynamics-Guided Trajectory Rollouts

Hang Yu, Di Zhang, Qiwei Du, Yanping Zhao, Hai Zhang, Guang Chen, Junqiao Zhao†, Eduardo E. Veas†

Under review

Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains

Juncheng Wu*, Sheng Liu*, Haoqin Tu*, Hang Yu*, Xiaoke Huang, Cihang Xie, Yuyin Zhou†

PDF Code Project Page (under review)

Focus On What Matters: Separated Models For Visual-Based RL Generalization

Di Zhang, Bowen Lv, Hai Zhang, Feifan Yang, Junqiao Zhao†, Hang Yu, Chang Huang, Hongtu Zhou, Chen Ye, Changjun Jiang

Annual Conference on Neural Information Processing Systems (NeurIPS) 2024

PDF Code

How to Fine-tune the Model: Unified Model Shift and Model Bias Policy Optimization

Hai Zhang, Hang Yu, Junqiao Zhao†, Di Zhang, Chang Huang, Hongtu Zhou, Xiao Zhang, Chen Ye

Annual Conference on Neural Information Processing Systems (NeurIPS) 2023

PDF Code