I am a second-year Ph.D. student in the College of Computing and Data Science (CCDS) at Nanyang Technological University (NTU). I am co-advised by Prof. Aixin Sun and Prof. Yixin Cao. Before that, I received my Bachelor of Computing degree (with Honours) from the National University of Singapore (NUS) in 2021, and worked as a project officer at S-Lab, NTU until 2023.
My research interests include retrieval-augmented generation (RAG) and long document understanding. I have published papers at top international conferences, including ACL and NeurIPS, with 300+ total Google Scholar citations.
🔥 News
- 2025.07: 🎉🎉 Gave a tutorial at SIGIR 2025 in Padova, Italy!
- 2024.09: 🎉🎉 One paper accepted to NeurIPS 2024 as a spotlight!
- 2024.05: 🎉🎉 Two papers accepted to ACL 2024 (Findings)!
- 2023.05: 🎉🎉 One paper accepted to ACL 2023 (Findings)!
📒 Preprint
- Long Context vs. RAG for LLMs: An Evaluation and Revisits, Xinze Li, Yixin Cao, Yubo Ma, Aixin Sun. 2024.
- Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks, Yixin Cao, Shibo Hong, Xinze Li, Jiahao Ying, Yubo Ma, Haiyuan Liang, Yantao Liu, Zijun Yao, Xiaozhi Wang, Dan Huang, Wenxuan Zhang, Lifu Huang, Muhao Chen, Lei Hou, Qianru Sun, Xingjun Ma, Zuxuan Wu, Min-Yen Kan, David Lo, Qi Zhang, Heng Ji, Jing Jiang, Juanzi Li, Aixin Sun, Xuanjing Huang, Tat-Seng Chua, Yu-Gang Jiang. 2025.
📝 Publications
- Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution, Xinze Li, Yixin Cao, Liangming Pan, Yubo Ma, Aixin Sun. ACL 2024 (Findings)
- Take a Break in the Middle: Investigating Subgoals towards Hierarchical Script Generation, Xinze Li, Yixin Cao, Liangming Pan, Yubo Ma, Aixin Sun. ACL 2023 (Findings)
- MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations, Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun. NeurIPS 2024 spotlight (dataset track)
- Data Augmentation using LLMs: Data Perspectives, Learning Paradigms and Challenges, Bosheng Ding, Chengwei Qin, Ruochen Zhao, Tianze Luo, Xinze Li, Guizhen Chen, Wenhan Xia, Junjie Hu, Anh Tuan Luu, Shafiq Joty. ACL 2024 (Findings)
- MMEKG: Multi-modal Event Knowledge Graph towards Universal Representation across Modalities, Yubo Ma, Zehao Wang, Mukai Li, Yixin Cao, Meiqi Chen, Xinze Li, Wenqi Sun, Kunquan Deng, Kun Wang, Aixin Sun, Jing Shao. ACL 2022 (Demo Track)
📖 Education
- 2023.08 - now, Ph.D. student, College of Computing and Data Science (CCDS), Nanyang Technological University (NTU).
- 2017.08 - 2021.06, Bachelor of Computing, School of Computing (SoC), National University of Singapore (NUS)
💬 Invited Talks
- 2025.07, Tutorial [Long Context vs. RAG: Strategies for Processing Long Documents in LLMs] given at SIGIR 2025, Padova, Italy.
💻 Internships
- 2025.07 - now, Research Intern, Shanghai AI Laboratory, China.
- 2021.11 - 2023.07, Project Officer, S-Lab, NTU, Singapore.
📚 Academic Services
- Conference Reviewer: NeurIPS 2025, ACL ARR (ACL 2024, EMNLP 2024, NAACL 2024, ACL 2025, EMNLP 2025)
- Journal Reviewer: TOIS