I am a second-year Ph.D. student at the Gaoling School of Artificial Intelligence (GSAI), Renmin University of China, supervised by Prof. Rui Yan. I focus on the long-context capabilities and mechanistic interpretability of LLMs, and I am dedicated to building more powerful foundation models.
News Within a Year
- 2024.09: One paper is accepted by NeurIPS 2024.
- 2024.09: Two papers are accepted by the EMNLP 2024 main conference.
- 2024.09: We propose that the development of context-copying capabilities in LLMs is a distinct form of grokking.
- 2024.05: I was selected for the 2024 CCF-Tencent Rhino-Bird Elite Talent Program, mentored by Ruobing Xie.
- 2024.05: Two papers are accepted by the ACL 2024 main conference, and one paper is accepted to ACL 2024 Findings.
- 2024.04: One paper is accepted by IJCAI 2024.
- 2024.03: We present a thorough study of the mechanisms of factual recall in Transformer-based language models; I hope you find the findings engaging!
Publications (First Author or Co-First Author)
- Ang Lv*, Kaiyi Zhang*, Shufang Xie, Quan Tu, Yuhan Chen, Ji-Rong Wen, Rui Yan. An Analysis and Mitigation of the Reversal Curse. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP'24). Link
- Ang Lv*, Zhuocheng Gong*, Jian Guan, Junxi Yan, Wei Wu, Huishuai Zhang, Minlie Huang, Dongyan Zhao, Rui Yan. Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP'24). Link
- Ang Lv*, Hongzhan Lin*, Yuhan Chen*, Chen Zhu, Yang Song, Hengshu Zhu, Rui Yan. Mixture of In-Context Experts Enhance LLMs' Long Context Awareness. Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS'24).
- Ang Lv*, Yuhan Chen*, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan. Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL'24). Link
- Ang Lv*, Kaiyi Zhang*, Yuhan Chen, Hansen Ha, Tao Xu, Rui Yan. Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning. Findings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL'24 Findings). Link
- Ang Lv, Xu Tan, Tao Qin, Tie-Yan Liu, Rui Yan. Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation. The 33rd International Joint Conference on Artificial Intelligence (IJCAI'24). Link
- Ang Lv*, Jinpeng Li*, Yuhan Chen, Gao Xing, Ji Zhang, Rui Yan. DialoGPS: Dialogue Path Sampling in Continuous Semantic Space for Data Augmentation in Multi-Turn Conversations. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL'23). Link
- Ang Lv*, Jinpeng Li*, Shufang Xie, Rui Yan. Envisioning Future from the Past: Hierarchical Duality Learning for Multi-Turn Dialogue Generation. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL'23, oral). Link
- Ang Lv*, Shufang Xie*, Yingce Xia, Lijun Wu, Tao Qin, Tie-Yan Liu, Rui Yan. Target-Side Input Augmentation for Sequence to Sequence Generation. The 10th International Conference on Learning Representations (ICLR'22). Link
Other Publications
- Changyu Chen, Xiting Wang, Ting-En Lin, Ang Lv, Yuchuan Wu, Xin Gao, Ji-Rong Wen, Rui Yan, Yongbin Li. Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models, Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL’24). Link
Preprint Papers
- Ang Lv*, Tao Tan*, Yining Qian*, Hongzhan Lin, Songhao Wu, Yongbo Wang, Feng Wang, Jingtong Wu, Xin Lu, Rui Yan. PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead. arXiv. Link
- Ang Lv, Ruobing Xie, Xingwu Sun, Zhanhui Kang, Rui Yan. Language Models "Grok" to Copy. arXiv. Link
- Ang Lv*, Yuhan Chen*, Kaiyi Zhang*, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan. Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models. arXiv. Link
- Ang Lv, Xu Tan, Peiling Lu, Wei Ye, Shikun Zhang, Jiang Bian, Rui Yan. GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework. arXiv. Link
Honors and Awards
- Outstanding Innovative Talents Cultivation Funded Programs of Renmin University of China, 2023 (中国人民大学拔尖创新人才)
- CCF-Tencent Rhino-Bird Elite Talent Program, 2024 (CCF-腾讯犀牛鸟精英人才计划)
Internships
- 2022.09 - 2023.03, Alibaba Damo Academy, Hangzhou.
- 2023.03 - 2023.09, Microsoft Research, Machine Learning Area, mentored by Xu Tan. We collaborated on the Muzic project, which has over 4k stars on GitHub.
- 2023.09 - 2024.05, Alibaba Damo Academy, Beijing.
- 2024.05 - now, Tencent, Beijing.