ArXiv TLDR

Baichuan 2: Open Large-scale Language Models

2309.10305

Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian + 50 more

cs.CL

TLDR

Baichuan 2 is a series of large-scale, open-source multilingual language models that match or outperform similar-sized open-source models on both general and domain-specific benchmarks.

Key contributions

  • Developed 7B and 13B parameter multilingual LLMs trained on 2.6 trillion tokens from scratch.
  • Outperforms or matches similar-sized open-source models on benchmarks like MMLU, CMMLU, GSM8K, and HumanEval.
  • Demonstrates strong capabilities in specialized domains such as medicine and law.
  • Plans to release all pre-training checkpoints to support research on training dynamics.

Why it matters

This paper provides the research community with powerful, open-source multilingual language models that rival closed-source alternatives. By broadening access to advanced LLM capabilities beyond English, it enables further innovation in both general and domain-specific NLP applications.

Original Abstract

Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch, on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to benefit the research community in better understanding the training dynamics of Baichuan 2.
