Research
|
My research interests lie in (multimodal) large language models and their applications in digital and embodied agents. I hope to improve the performance of LLMs on complex tasks by enhancing their reasoning ability and interaction skills.
I am also working to strengthen my skills in machine learning systems.
|
Publications
|
|
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
Yiheng Xu*,
Zekun Wang*,
Junli Wang*,
Dunjie Lu,
Tianbao Xie,
Amrita Saha,
Doyen Sahoo,
Tao Yu,
Caiming Xiong
|
|
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials
Yiheng Xu*,
Dunjie Lu*,
Zhennan Shen*,
Junli Wang,
Zekun Wang,
Yuchen Mao,
Caiming Xiong,
Tao Yu,
ICLR 2025
|
Internships
|
2024.11 - now: Qwen Team, Alibaba Group.
|
The source code is stolen from Jon Barron. Thanks to him for sharing! 🙏🏻