Ph.D. Candidate
School of Science and Engineering (SSE), CUHK-Shenzhen
Email | Google Scholar | GitHub | Twitter
I am a Ph.D. candidate at The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), advised by Prof. Benyou Wang. I've had the privilege of working with amazing teams at Alibaba (Qwen Team), Microsoft Research Asia (MSRA), and Tencent.
Research: My research focuses on developing intelligent agents capable of complex reasoning and self-improvement. In my work on CoRT (NeurIPS'25) and STORM, I pioneer agentic frameworks that leverage reinforcement learning (RL) for tool-integrated tasks. To enable self-improvement, my SCRIT framework (COLM'25) introduces a self-evolving critique model, a form of generative reward model, for scalable oversight without external supervision.
My research program also establishes the foundations for these advanced models. I have designed novel instruction tuning frameworks, namely MathScale (ICML'24), GLAN (TMLR'25), and ALAN (ACL'25), to generate high-quality training data at scale. Additionally, my work on DPTDR (COLING'22) helps these agents access information efficiently and has achieved top rankings on competitive benchmarks.
I am actively seeking full-time research or engineering roles starting around July 2026. Feel free to reach out!