I am currently a third-year Ph.D. student at the Gaoling School of Artificial Intelligence (GSAI), Renmin University of China, supervised by Prof. Rui Yan. I am dedicated to creating more powerful foundation language models.
News Within a Year
- 2025.01: We propose Autonomy-of-Experts Models, a new MoE paradigm.
- 2025.01: One paper is accepted at WWW 2025 as an oral presentation.
- 2025.01: One paper is accepted at NAACL 2025 main conference (short paper).
- 2024.11: We propose Cog Attention, which enables negative attention weights for enhanced expressiveness.
- 2024.11: I am awarded the 2024 CIE-Tencent Doctoral Student Research Incentive Program (HunYuan Large Language Model Special Project).
- 2024.09: One paper is accepted at NeurIPS 2024.
- 2024.09: Two papers are accepted at EMNLP 2024 main conference.
- 2024.09: We propose that the development of context-copying capacities in LLMs is a special case of grokking.
- 2024.05: I am awarded the 2024 CCF-Tencent Rhino-Bird Elite Talent Program, mentored by Ruobing Xie.
- 2024.05: Two papers are accepted at ACL 2024 main conference. One paper is accepted to ACL 2024 Findings.
Publications (First Author and First Co-author)
- PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead. Ang Lv*, Tao Tan*, Yining Qian*, Hongzhan Lin, Songhao Wu, Yongbo Wang, Feng Wang, Jingtong Wu, Xin Lu, Rui Yan. The Web Conference (WWW’25 oral). Link
- Language Models “Grok” to Copy. Ang Lv, Ruobing Xie, Xingwu Sun, Zhanhui Kang, Rui Yan. Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL’25 short). Link
- An Analysis and Mitigation of the Reversal Curse. Ang Lv*, Kaiyi Zhang*, Shufang Xie, Quan Tu, Yuhan Chen, Ji-Rong Wen, Rui Yan. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP’24). Link
- Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules. Ang Lv*, Zhuocheng Gong*, Jian Guan, Junxi Yan, Wei Wu, Huishuai Zhang, Minlie Huang, Dongyan Zhao, Rui Yan. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP’24). Link
- Mixture of In-Context Experts Enhance LLMs’ Long Context Awareness. Ang Lv*, Hongzhan Lin*, Yuhan Chen*, Chen Zhu, Yang Song, Hengshu Zhu, Rui Yan. Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS’24).
- Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use. Ang Lv*, Yuhan Chen*, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL’24). Link
- Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning. Ang Lv*, Kaiyi Zhang*, Yuhan Chen, Hansen Ha, Tao Xu, Rui Yan. Findings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL’24 Findings). Link
- Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation. Ang Lv, Xu Tan, Tao Qin, Tie-Yan Liu, Rui Yan. The 33rd International Joint Conference on Artificial Intelligence (IJCAI’24). Link
- DialoGPS: Dialogue Path Sampling in Continuous Semantic Space for Data Augmentation in Multi-Turn Conversations. Ang Lv*, Jinpeng Li*, Yuhan Chen, Gao Xing, Ji Zhang, Rui Yan. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23). Link
- Envisioning Future from the Past: Hierarchical Duality Learning for Multi-Turn Dialogue Generation. Ang Lv*, Jinpeng Li*, Shufang Xie, Rui Yan. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23 oral). Link
- Target-Side Input Augmentation for Sequence to Sequence Generation. Ang Lv*, Shufang Xie*, Yingce Xia, Lijun Wu, Tao Qin, Tie-Yan Liu, Rui Yan. The 10th International Conference on Learning Representations (ICLR’22). Link
Other Publications
- Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models. Changyu Chen, Xiting Wang, Ting-En Lin, Ang Lv, Yuchuan Wu, Xin Gao, Ji-Rong Wen, Rui Yan, Yongbin Li. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL’24). Link
Preprint Papers
- Autonomy-of-Experts Models. Ang Lv, Ruobing Xie, Yining Qian, Songhao Wu, Xingwu Sun, Zhanhui Kang, Di Wang, Rui Yan. arXiv. Link
- More Expressive Attention with Negative Weights. Ang Lv, Ruobing Xie, Shuaipeng Li, Jiayi Liao, Xingwu Sun, Zhanhui Kang, Di Wang, Rui Yan. arXiv. Link
- Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models. Ang Lv*, Yuhan Chen*, Kaiyi Zhang*, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan. arXiv. Link
Honors and Awards
- CIE-Tencent Doctoral Student Research Incentive Program (HunYuan Large Language Model Special Project), 2024, 1 of 17 selected individuals nationwide (中国电子学会-腾讯博士生科研激励计划 混元大模型专项)
- CCF-Tencent Rhino-Bird Elite Talent Program, 2024, 1 of 50 selected individuals nationwide (中国计算机学会-腾讯犀牛鸟精英人才计划)
- Outstanding Innovative Talents Cultivation Funded Program of Renmin University of China, 2023 (中国人民大学拔尖创新人才)
Academic Services
- Conference Reviewer: ACL (Area Chair), EMNLP, ICLR, WWW, KDD
- Journal Reviewer: ACM TIST
Internships
- 2022.09 - 2023.03, Alibaba Damo Academy, Hangzhou.
- 2023.03 - 2023.09, Microsoft Research, Machine Learning Area, mentored by Xu Tan. We worked together on the Muzic project, which currently has 4k stars on GitHub.
- 2023.09 - 2024.05, Tongyi Lab, Alibaba, Beijing.
- 2024.05 - now, Tencent, Beijing.