I am currently a third-year Ph.D. student at the NExT++ Research Center, advised by Prof. Tat-Seng Chua in the School of Computing at the National University of Singapore. Before that, I obtained my M.S. and bachelor's degrees from Wuhan University.
My research revolves around Multimodal Understanding and Multimodal Generation, with a major focus on Multimodal Large Language Models and Scene Graph Modeling. I am also interested in Natural Language Processing.
Some of my representative work:
NExT-GPT:
The first unified any-to-any multimodal LLM, capable of understanding and generating across any modality or combination of modalities (e.g., text, image, video, audio). [PDF] [Github] [Huggingface] [Video] (ICML'24 Oral, selected as a Most Influential Paper by Paper Digest, WAIC Youth Outstanding Paper Award,
Any2Caption:
A SoTA framework for controllable video generation from any condition, and the first to leverage MLLMs to interpret diverse inputs into dense, structured captions. [PDF] [Github] [Huggingface] [Video] (Preprint, 2025)
Setok:
The first to propose a general dynamic semantic-equivalent vision tokenizer, addressing a fundamental performance bottleneck of existing MLLMs. [PDF] [Github] (ICLR'25)
USG:
The first to propose a Universal Scene Graph representation framework that unifies structured semantic scene graphs across modalities including images, text, videos, and 3D. [PDF] [Github] (CVPR'25 Highlight)
Conference Reviewer
NeurIPS-23/24, ICLR-24/25, ICML-24/25, CVPR-24/25, ACM MM-23/24, IJCAI-23/24, AAAI-24/25, ACL-23/24, WSDM-23,
Journal Reviewer
TOMM, IEEE/ACM TALLIP, IPM, KBS, Neurocomputing
Dec. 2024 - Now, Kuaishou, Remote
VGI Research Intern
Advisors: Weicai Ye, Xintao Wang

Nov. 2023 - Jun. 2024, Kunlun Skywork AI, Singapore
2050 Research Intern
Advisor: Shuicheng Yan, Director

Jul. 2019 - Aug. 2019, China Merchants Bank, Wuhan, China
Information Technology Department Intern
Advisor: Hua Pan, General Manager

Jul. 2018 - Nov. 2018, YITU, Shanghai, China
Solution Engineer Intern
Advisor: Ze Deng