Yekun Chai


My research centers on pre-training for foundation models — what to train on, at what scale, and how those early choices set the ceiling on reasoning, code, and agentic behavior later. I am particularly interested in the pre-training ↔ post-training interface, and in understanding it in a principled way rather than through trial and error.

I have contributed to ERNIE, ERNIE-Code, and StarCoder 2.

News

Aug 21, 2025 Four papers accepted to EMNLP 2025.
May 16, 2025 Curiosity-driven RLHF accepted to ACL 2025. [code]
Jan 23, 2025 MA-RLHF accepted to ICLR 2025. [paper] [code]

Selected Publications

  1. MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
    Yekun Chai*†, Haoran Sun*^, Huang Fang, Shuohuan Wang, and 2 more authors
    In The Thirteenth International Conference on Learning Representations, 2025
  2. Tool-Augmented Reward Modeling (ICLR Spotlight, top 5%)
    Lei Li*^, Yekun Chai*†, Shuohuan Wang, Yu Sun, and 3 more authors
    In The Twelfth International Conference on Learning Representations, 2024
  3. ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages
    Yekun Chai, Shuohuan Wang, Chao Pang, Yu Sun, and 2 more authors
    In Findings of the Association for Computational Linguistics: ACL 2023, 2023