Chen Zhang 张晨

Incoming Project Researcher @ LLMC, NII

I obtained my Ph.D. in 2026 from the Wangxuan Institute of Computer Technology, Peking University, advised by Prof. Yansong Feng.

My research aims to make LLMs more inclusive and intelligent across diverse human languages.

Research Interests

NLP for Low-Resource Languages: Enhancing the transparency, inclusivity, efficiency, and cultural awareness of language technologies for underrepresented languages.
Multilinguality in LLMs: Investigating the emergence of multilingual capabilities in LLMs and understanding cross-lingual interactions and transfer mechanisms.
LLMs for Language Science: Leveraging the metalinguistic abilities of LLMs to analyze phenomena such as language contact and acquisition, for insights into how human languages function and evolve.

News

[MAY 2026] Obtained my Ph.D. from Peking University! 🎓
[APR 2026] Our paper on logit fusion for low-resource languages was accepted to ACL 2026.
[JAN 2026] The Mongolian script subset of our MiLiC-Eval has been officially integrated into FLORES+.
[NOV 2025] Helped organize the WiNLP 2025 workshop at EMNLP 2025.

Contact

Wangxuan Institute of Computer Technology, Peking University
No. 128 Zhongguancun North Street
Haidian District, Beijing, 100871

zhangch [at] pku [dot] edu [dot] cn

* denotes equal contribution.
Following the #BenderRule, languages are specified for each work.

2026

To Reason or to Fabricate: Reasoning Without Shortcuts via Hint-Anchored Pairwise Aggregation arXiv 2606.29481
Jiuheng Lin, Chen Zhang, Yansong Feng
[preprint]
English

Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion ACL 2026
Chen Zhang, Jiuheng Lin, Zhiyuan Liao, Yansong Feng
[paper] [code]
Uyghur, Tibetan, Mongolian, Kazakh, Odia, Telugu, Tamil, Bengali

An Empirical Study of Many-Shot In-Context Learning for Machine Translation of Low-Resource Languages arXiv 2604.02596
Yinhan Lu, Gaganpreet Jhajj, Chen Zhang, Anietie Andy, David Ifeoluwa Adelani
[preprint]
Anaang, Efik, Ibibio, Oro, Sudanese Arabic, Emakhuwa, Ladin, Mauritian Creole, Tamazight, Quechua

2025

Read it in Two Steps: Translating Extremely Low-Resource Languages with Code-Augmented Grammar Books ACL 2025
Chen Zhang*, Jiuheng Lin*, Xiao Liu, Zekai Zhang, Yansong Feng
[paper] [code]
Zhuang, Kalamang

Cross-Lingual Transfer of Cultural Knowledge: An Asymmetric Phenomenon ACL 2025
Chen Zhang, Zhiyuan Liao, Yansong Feng
[paper] [code]
English, Chinese, Korean, Tibetan, Mongolian

MiLiC-Eval: Benchmarking Multilingual LLMs for China's Minority Languages ACL 2025 (Findings)
Chen Zhang, Mingxu Tao, Zhiyuan Liao, Yansong Feng
[paper] [code] [huggingface]
Tibetan, Uyghur, Kazakh, Mongolian

Eliciting and Improving the Causal Reasoning Abilities of Large Language Models with Conditional Statements Computational Linguistics 2025
Xiao Liu, Da Yin, Chen Zhang, Yansong Feng, Dongyan Zhao
[paper]
English

2024

Unlocking the Potential of Model Merging for Low-Resource Languages EMNLP 2024 (Findings)
Mingxu Tao*, Chen Zhang*, Quzhe Huang*, Tianyao Ma, Songfang Huang, Dongyan Zhao, Yansong Feng
[paper] [huggingface]
Tibetan, Uyghur, Mongolian, Tamil, Telugu, Odia, Bengali

Teaching Large Language Models an Unseen Language on the Fly ACL 2024 (Findings)
Chen Zhang, Xiao Liu, Jiuheng Lin, Yansong Feng
[paper] [code] [website]
Zhuang, Kalamang, and 7 other mid-resource languages

MC²: Towards Transparent and Culturally-Aware NLP for Minority Languages in China ACL 2024
Chen Zhang*, Mingxu Tao*, Quzhe Huang*, Jiuheng Lin*, Zhibin Chen, Yansong Feng
[paper] [code] [website]
Tibetan, Uyghur, Kazakh, Mongolian

Harder Tasks Need More Experts: Dynamic Routing in MoE Models ACL 2024
Quzhe Huang*, Zhenwei An*, Nan Zhuang, Mingxu Tao, Chen Zhang, Yang Jin, Kun Xu, Kun Xu, Liwei Chen, Songfang Huang, Yansong Feng
[paper]
English

Can LLMs Learn a New Language on the Fly? A Case Study on Zhuang ICLR 2024 Tiny Paper
Chen Zhang, Mingxu Tao, Quzhe Huang, Zhibin Chen, Yansong Feng
[paper]
Zhuang

Can Perplexity Reflect Large Language Model's Ability in Long Text Understanding? ICLR 2024 Tiny Paper
Yutong Hu, Quzhe Huang, Mingxu Tao, Chen Zhang, Yansong Feng
[paper]
English

2023

Lawyer LLaMA: Enhancing LLMs with Legal Knowledge arXiv 2305.15062
Quzhe Huang*, Mingxu Tao*, Chen Zhang*, Zhenwei An*, Cong Jiang, Zhibin Chen, Zirui Wu, Yansong Feng
[preprint] [code]
Chinese

How Many Answers Should I Give? An Empirical Study of Multi-Answer Reading Comprehension ACL 2023 (Findings)
Chen Zhang, Jiuheng Lin, Xiao Liu, Yuxuan Lai, Yansong Feng, Dongyan Zhao
[paper] [code]
English

The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code ACL 2023 (Findings)
Xiao Liu, Da Yin, Chen Zhang, Yansong Feng, Dongyan Zhao
[paper] [code]
English

Relation-Aware Question Answering for Heterogeneous Knowledge Graphs EMNLP 2023 (Findings)
Haowei Du, Quzhe Huang, Chen Li, Chen Zhang, Yang Li, Dongyan Zhao
[paper]
English

Cross-Lingual Question Answering over Knowledge Base as Reading Comprehension EACL 2023 (Findings)
Chen Zhang, Yuxuan Lai, Yansong Feng, Xingyu Shen, Haowei Du, Dongyan Zhao
[paper] [code]
Chinese, Persian, German, Romanian, Italian, Russian, French, Dutch, Spanish, Hindi, Portuguese

UnifEE: Unified Evidence Extraction for Fact Verification EACL 2023
Nan Hu, Zirui Wu, Yuxuan Lai, Chen Zhang, Yansong Feng
[paper] [code]
English

2022 and before

Knowledge-Enhanced Iterative Instruction Generation and Reasoning for Knowledge Base Question Answering NLPCC 2022
Haowei Du, Quzhe Huang, Chen Zhang, Dongyan Zhao
[paper] [preprint]
English

Extract, Integrate, Compete: Towards Verification Style Reading Comprehension EMNLP 2021 (Findings)
Chen Zhang, Yuxuan Lai, Yansong Feng, Dongyan Zhao
[paper] [code]
Chinese

A review of deep learning in question answering over knowledge bases AI Open 2021, Volume 2
Chen Zhang, Yuxuan Lai, Yansong Feng, Dongyan Zhao
[paper]
English

Why Machine Reading Comprehension Models Learn Shortcuts? ACL-IJCNLP 2021 (Findings)
Yuxuan Lai, Chen Zhang, Yansong Feng, Quzhe Huang, Dongyan Zhao
[paper] [code]
English

Academic Service

Area Chair: ACL Rolling Review, LREC
Session Chair: ACL 2026
Reviewer: ACL Rolling Review (Great Reviewer x3), ACL, EMNLP, LREC, COLING, *ACL Demonstration Track, NLPCC (Best Reviewer Award 2025), NeurIPS
Workshop Organizer: WiNLP 2025 & 2026
Student Volunteer: EMNLP 2021 (remote), ACL 2024 & 2025

Open-Source Artifacts

ZhuangBench: A benchmark consisting of a small set of Zhuang–Chinese parallel sentences and a Zhuang dictionary, designed to evaluate whether LLMs can comprehend an unseen language on the fly. Now part of LongBench v2.
[github] [paper]
ZhuangRules: A collection of Zhuang grammar rules for evaluating whether LLMs can effectively leverage grammar books in low-resource language understanding.
[github] [paper]
MC²: A web-crawled corpus covering four minority languages in China: Tibetan, Uyghur, Kazakh, and Mongolian.
[github] [huggingface] [paper]
MiLiC-Eval: A multi-task evaluation benchmark for four minority languages in China: Tibetan, Uyghur, Kazakh, and Mongolian. Portions of the data have been contributed to FLORES+ to support alternative scripts.
[github] [huggingface] [paper]
Lawyer LLaMA: A Chinese legal-domain LLM built on Llama 2 and trained on synthetic legal-consultation data.
[github] [huggingface] [report]

Teaching

Teaching Assistant @ Peking University

Foundations of Natural Language Processing (Spring 2024, 2025, 2026)
Empirical Methods for Natural Language Processing (Spring 2022)
Data Structures and Algorithms (B) (Fall 2020, Spring 2021)

Honors & Awards

Outstanding Graduate (北京大学优秀毕业生), Peking University (2021, 2026)
President Scholarship (校长奖学金), Peking University (2025)
Award for Scientific Research (优秀科研奖), Peking University (2024)
Outstanding Graduate of Beijing (北京市优秀毕业生), Beijing Municipal Education Commission (2021)
Founder Scholarship (方正奖学金), Peking University (2018, 2019)

My name in Chinese characters is 张晨 (Zhāng Chén, /ʈʂɑŋ˥˥ ʈʂʰən˧˥/). My given name 晨 (Chén) means morning.
My mother tongue is the Jinsha Dialect (金沙话), a transitional dialect between Mandarin and Wu Chinese.
I have intermediate proficiency in Japanese and Spanish, and basic knowledge of Korean, German, and Uzbek.
I enjoy hiking, learning new languages, and exploring craft beers (especially IPAs).