Liu Zhiyuan
Last reviewed: Jun 8, 2026
Sources: 13 citations
Review status: Source-backed
Revision: v1 · 1,694 words
Liu Zhiyuan (Chinese: 刘知远) is a Chinese computer scientist and entrepreneur known for his research in natural language processing, knowledge graphs, and representation learning, and for his work on efficient large language models. He is an associate professor in the Department of Computer Science and Technology at Tsinghua University, where he is a member of the Natural Language Processing group (THUNLP).[1][2] He is a leading figure in OpenBMB, an open-source community for large pre-trained models, and a co-founder and chief scientist of ModelBest (面壁智能), the Beijing startup behind the MiniCPM family of on-device language models.[5] In December 2024 he and his collaborators proposed the "Densing Law," an empirical account of how the capability density of large language models rises over time.[8][9]
As of June 2026, Liu's publications had been cited more than 89,000 times on Google Scholar, with an h-index of 132 and an i10-index of 459, placing him among the most-cited natural language processing researchers based in China.[1]
Liu received his bachelor's degree in computer science from Tsinghua University in 2006 and remained there for doctoral study, completing his PhD in 2011 in the Department of Computer Science and Technology.[3] His doctoral advisor was Sun Maosong (孙茂松), a senior Tsinghua professor and a central figure in Chinese computational linguistics, with whom Liu has continued to publish throughout his career.[3] During his doctorate he worked within the Tsinghua National Laboratory for Information Science and Technology, where his early work centered on keyword extraction, social tagging, and statistical models of text, topics that anchored his later interest in representation learning.[13] His dissertation was recognized with Excellent Doctoral Dissertation awards from both Tsinghua University and the Chinese Association for Artificial Intelligence.[3] After finishing his PhD he joined the Tsinghua faculty, later rising to associate professor in the same department. He has been named to the MIT Technology Review Innovators Under 35 China (MIT TR-35 China) list.[3]
Liu's research spans natural language processing, knowledge graphs, representation learning, pre-trained language models, and social computing. He is a principal member of THUNLP, the Tsinghua group long led by Sun Maosong, and has published well over a hundred papers at venues such as ACL, EMNLP, NeurIPS, and AAAI.[1][3]
Several of his contributions are widely cited. In knowledge graph representation, his 2015 paper introducing TransR ("Learning Entity and Relation Embeddings for Knowledge Graph Completion") became a standard reference for embedding entities and relations in separate vector spaces.[1] In 2019 his group released a model called ERNIE ("Enhanced Language Representation with Informative Entities"), which injected structured knowledge from knowledge graphs into a pre-trained language model; it is distinct from the similarly named ERNIE system developed by Baidu.[1] He co-authored a frequently cited 2020 survey of graph neural networks, and in 2023 he and colleagues published a study of "delta tuning," a unified view of parameter-efficient fine-tuning methods, in Nature Machine Intelligence.[1] More recently his group has worked on language-model agents, including ChatDev, a framework in which communicative agents collaborate to write software, as well as the ToolLLM and AgentVerse projects.[1]
The table below lists several of his most-cited works and their approximate Google Scholar citation counts as of June 2026.[1]
| Paper (year) | Topic | Citations (approx.) |
|---|---|---|
| Graph Neural Networks: A Review of Methods and Applications (2020) | Graph neural network survey | 10,000+ |
| Learning Entity and Relation Embeddings for Knowledge Graph Completion (2015) | Knowledge graph embedding (TransR) | 5,400+ |
| ERNIE: Enhanced Language Representation with Informative Entities (2019) | Knowledge-augmented pre-training | 2,200+ |
| Parameter-efficient Fine-tuning of Large-scale Pre-trained Language Models (2023) | Delta tuning | 2,100+ |
| ChatDev: Communicative Agents for Software Development (2024) | LLM agents | 1,900+ |
Alongside research papers, Liu has emphasized open-source software and educational materials. His group has released widely used toolkits including OpenKE for knowledge embedding, OpenNRE for neural relation extraction, THULAC for Chinese lexical analysis, and OpenHowNet for sememe-based linguistic knowledge.[2] With Yankai Lin and Sun Maosong he wrote the textbook "Representation Learning for Natural Language Processing," published by Springer in 2020 with an expanded second edition in 2023, which surveys representation methods from word embeddings through pre-trained language models.[3]
Liu is one of the founders and organizers of OpenBMB, short for "Open Lab for Big Model Base," an open-source community launched in 2022 and supported by THUNLP together with ModelBest.[6] OpenBMB's stated goal is to build a model base and toolkit for large-scale pre-trained models and to lower the barriers to training, tuning, and running models with more than ten billion parameters.[6] Its core toolkits include BMTrain for efficient pre-training and fine-tuning, BMInf for low-resource inference, BMCook for model compression, and the OpenPrompt and OpenDelta libraries for prompt-based and parameter-efficient tuning.[6][7] OpenBMB has reported that BMTrain can cut training cost substantially compared with frameworks such as DeepSpeed, and that BMInf can run models with more than ten billion parameters on a single consumer-grade GPU.[6][7]
The community also released a series of Chinese pre-trained language models under the CPM name, beginning with CPM in 2020 and continuing with the eleven-billion-parameter CPM-2 in 2021, work that fed directly into the design philosophy behind later on-device models.[6] OpenBMB now hosts the MiniCPM models and related tools, making it the open-source counterpart to Liu's commercial work at ModelBest.[10]
In August 2022, Liu co-founded ModelBest (面壁智能), a startup spun out of the Tsinghua NLP group to commercialize large-model research.[5] He serves as the company's co-founder and chief scientist, while Li Dahai (李大海), a former chief technology officer at the question-and-answer platform Zhihu, is co-founder and chief executive.[5] ModelBest has positioned itself differently from many Chinese AI firms: rather than chasing ever-larger cloud models, it concentrates on efficiency and on running capable models directly on edge devices such as phones, personal computers, cars, and smart-home hardware.[5][11]
The company's flagship product line is MiniCPM, a family of compact yet capable models. MiniCPM-2B was released on February 1, 2024, with about 2.4 billion parameters, and was reported to perform comparably to much larger models such as Mistral-7B and Llama2-13B on public benchmarks while being small enough to run on a phone.[10] Liu has framed the approach in plain terms, arguing that "the capabilities of a 13B model can obviously be achieved with a 2B model" that runs quickly on the device side.[4] Later releases extended the family to multimodal and speech tasks, including the MiniCPM-V vision-language models designed for offline use on mobile devices, MiniCPM3-4B in September 2024 (reported to surpass GPT-3.5-Turbo on aggregate benchmarks), and, by late 2025, the MiniCPM 4.x text models, the MiniCPM-V 4.5 multimodal model, and the VoxCPM speech generator.[10][12] ModelBest has said cumulative downloads of the MiniCPM series passed twenty-four million across GitHub and Hugging Face, and that the models have been deployed with carmakers and device makers including Geely, Changan, Volkswagen, and Huawei.[12]
ModelBest has raised multiple rounds of financing from Chinese investors. Backers have included Hillhouse Capital and other strategic and state-linked funds, and in December 2025 the company closed a further round of several hundred million yuan earmarked for research on efficient on-device models and for commercial deployment.[12] Liu has also become a public voice for China's open-source AI strategy, telling MIT Technology Review in February 2026 that "compute and energy are real constraints for any deployment" and that Chinese firms had "seen real gains from the open-source playbook."[11]
In December 2024, Liu and colleagues at Tsinghua and ModelBest posted a paper titled "Densing Law of LLMs," led by Chaojun Xiao and listing Liu and Sun Maosong among its authors.[8] The paper introduces "capability density" (also rendered as "capacity density") as a measure of how much capability a model delivers per parameter. It is defined as the ratio of a model's effective parameter size to its actual parameter size, where the effective parameter size is the minimum number of parameters a reference model would need to match the target model's performance.[8][9]
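The definition can be written compactly as a ratio. The notation below is an assumption chosen here for illustration and may differ from the paper's: for a model M, N(M) is the actual parameter count, N̂(M) is the effective parameter size, and ρ(M) is the capability density.

```latex
\rho(M) = \frac{\hat{N}(M)}{N(M)}
```

Under this reading, a density above 1 means the model matches a reference model larger than itself, and a density below 1 means a smaller reference model would do as well.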
Analyzing a series of widely used open models, the authors report an empirical regularity they call the densing law: the maximum capability density of large language models grows exponentially, doubling roughly every few months. The peer-reviewed version, published in Nature Machine Intelligence in November 2025, states a doubling period of approximately 3.5 months; the original preprint and much of the popular coverage described the trend as doubling about every three months, or roughly every hundred days.[8][9] The practical implication is that the number of parameters needed to reach a given level of capability falls exponentially over time, so that models small enough to run on a phone steadily catch up to what only large cloud models could do a year or two earlier.[9]
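The arithmetic behind this implication can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: it assumes the 3.5-month doubling period continues to hold, and the function names and the 13-billion-parameter starting point are hypothetical values chosen for the example.

```python
def density_growth(months: float, doubling_months: float = 3.5) -> float:
    """Factor by which maximum capability density grows after `months`.

    Assumes clean exponential growth with a fixed doubling period
    (3.5 months is the figure cited for the peer-reviewed paper).
    """
    return 2.0 ** (months / doubling_months)


def params_for_same_capability(
    params_now: float, months: float, doubling_months: float = 3.5
) -> float:
    """Parameters needed later to match what `params_now` delivers today.

    If density doubles, the parameter count for a fixed capability halves.
    """
    return params_now / density_growth(months, doubling_months)


if __name__ == "__main__":
    start_params_b = 13.0  # hypothetical starting size, in billions
    for months in (0.0, 3.5, 7.0, 12.0, 24.0):
        needed = params_for_same_capability(start_params_b, months)
        print(f"after {months:4.1f} months: about {needed:.2f}B parameters")
```

Under these assumptions, a capability that needs 13 billion parameters today would need 3.25 billion after 7 months, which is two doubling periods. The law describes the frontier of maximum density across models, so the calculation is an extrapolation of that trend and not a prediction for any particular model.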
The law dovetails with ModelBest's on-device thesis, providing a quantitative argument that efficient small models are a moving frontier rather than a niche compromise. Liu has used the idea to argue that rising capability density, combined with continued hardware progress, will make increasingly powerful AI feasible on everyday devices.[5][11]