Yekun Chai

Baidu NLP


Contact:
chaiyekun (at) gmail.com

I am a staff researcher at Baidu NLP, on the ERNIE team.

I work on pre-training, post-training, and reasoning research. My research centers on natural language processing and beyond, specifically (1) scaling transformers across languages, modalities, and tasks; and (2) efficient alignment, reasoning, and inference at scale.

I have contributed to Baidu’s large language model series, including ERNIE 5.0, 4.0, 3.5, and ERNIE-Code, and to its generative AI products, e.g., ERNIE-Bot (文心一言, 2023) and Baidu Comate (文心快码, 2022). Before that, I honed my skills in RL and NLP at the Institute of Automation, Chinese Academy of Sciences. I pursued my academic studies in NLP at the University of Edinburgh, under the supervision of Adam Lopez and Naomi Saphra.

news

Jan 23, 2025 One paper on MA-RLHF has been accepted to ICLR 2025. Dive into our research and code now!
Sep 21, 2024 Our papers on PixelGPT, GPTfluence, and TKEval have been accepted to EMNLP 2024 & Findings.
May 02, 2024 One paper on GiLOT, an XAI approach for LLMs, has been accepted to ICML 2024.
Feb 20, 2024 One paper on HumanEval-XL, a multilingual code generation benchmark, has been accepted to LREC-COLING 2024. We’ve released the code and data.
Jan 16, 2024 One paper on tool-augmented reward models has been accepted to ICLR 2024 (spotlight). Dive into our research and code now!

selected publications

  1. ICLR
    MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
    Yekun Chai*†, Haoran Sun*^, Huang Fang, Shuohuan Wang, and 2 more authors
    In The Thirteenth International Conference on Learning Representations, 2025
  2. EMNLP
    Autoregressive Pre-Training on Pixels and Texts
    Yekun Chai, Qingyi Liu^, Jingwu Xiao^, Shuohuan Wang, and 2 more authors
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024
  3. ICLR Spotlight
    Tool-Augmented Reward Modeling
    Lei Li*^, Yekun Chai*†, Shuohuan Wang, Yu Sun, and 3 more authors
    In The Twelfth International Conference on Learning Representations (top 5%), Nov 2024
  4. ACL-Findings
    ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages
    Yekun Chai, Shuohuan Wang, Chao Pang, Yu Sun, and 2 more authors
    In Findings of the Association for Computational Linguistics: ACL 2023, Jul 2023

academic services

Conference PC/Reviewer: ACL, EMNLP, ICLR, NeurIPS, NAACL, COLING, EACL, ICASSP, ARR, COLM
Journal Reviewer: TASLP