Jiang Daxin
Last reviewed
Jun 8, 2026
Sources
10 citations
Review status
Source-backed
Revision
v1 · 1,446 words
Jiang Daxin (Chinese: 姜大昕) is a Chinese computer scientist and technology entrepreneur who is the founder and chief executive officer of StepFun (Chinese: 阶跃星辰), a Shanghai-based artificial general intelligence company that develops the Step family of multimodal foundation models. Before starting the company in 2023, he spent about 16 years at Microsoft, where he rose to corporate vice president and chief scientist of the Software Technology Center Asia (STCA) and led natural language processing and search work behind products such as the Bing search engine and the Cortana voice assistant. [1][2][4]
StepFun, founded in April 2023, is counted among China's so-called "AI tigers," a cohort of well-funded large model startups also known as the "six little dragons." Under Jiang the company has bet heavily on multimodal AI and on scaling laws, releasing more than two dozen Step models across text, image, audio, and video and attracting backing from Tencent and Shanghai state-owned investors that, by early 2026, had pushed its cumulative funding past 1.5 billion dollars. [1][6]
Jiang earned a PhD in computer science from the University at Buffalo, part of the State University of New York, around 2005. [1][4][5] His doctoral and early research interests spanned machine learning, data mining, natural language processing, and bioinformatics. [4][5] After completing his doctorate he began his academic career in Singapore as an assistant professor in the School of Computer Science and Engineering at Nanyang Technological University. [4]
In 2007 Jiang joined Microsoft Research Asia (MSRA) in Beijing as a researcher, beginning a Microsoft career that would last roughly 16 years. [1][5] His early work at the lab continued across machine learning, data mining, and language technologies, and in 2008 he received the ACM SIGKDD Best Application Paper Award. [4][5]
Around 2011 Jiang moved from the pure research organization into the Software Technology Center Asia, Microsoft's applied engineering arm in China that built search and online services for the Asia-Pacific region. [5] Over the following decade he oversaw the natural language and search technologies behind Bing, the Cortana voice assistant, Azure cognitive services, and language understanding features in Microsoft 365. [1][5] In 2017 he was named a Microsoft partner, serving as a deputy managing director and chief scientist of STCA, and he was promoted to corporate vice president in early 2023. [1][5] Over his career he has published close to 200 papers in international conferences and journals, and he was elected an IEEE Fellow in 2024. [4]
Jiang has said that the public release of ChatGPT by OpenAI in November 2022 convinced him to strike out on his own. "I thought, I can do it myself, maybe even better," he later recalled. [1] He left Microsoft and, on April 6, 2023, registered the company in the Xuhui district of Shanghai under the legal name Shanghai Jieyue Xingchen Intelligent Technology. [1][7] The Chinese name 阶跃星辰 pairs the engineering term for a step change with the word for stars, reflecting the founders' technical roots; the company uses StepFun in English. He was joined by two former Microsoft colleagues, Jiao Binxing and Zhu Yibo, who led the company's search and systems efforts respectively. [1]
StepFun is grouped with China's "AI tigers" (also called the "six little dragons" of large models), a set of well-financed foundation model startups that also includes Zhipu AI, MiniMax, Moonshot AI, Baichuan, and 01.AI. [6] Jiang framed the venture around artificial general intelligence from the outset and has been an outspoken believer in scaling laws, the principle that a model's performance improves predictably as compute, data, and model size grow. [2] He describes a staged path to AGI: "We believe that AI must go from unimodal to multimodal, to embodied intelligence, and finally to AGI." [1] That thesis made multimodality, the ability to handle text, image, audio, and video jointly, the company's central bet, and it has guided StepFun toward a unified multimodal system rather than a single consumer chatbot. [1][2]
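As an illustration of what "improves predictably" can mean, one widely cited parametric form of a scaling law comes from the compute-optimal training analysis by Hoffmann et al. (2022). It is shown here only as a general example; the sources above do not attribute this formula to Jiang or StepFun. In it, $L$ is the pretraining loss, $N$ the number of model parameters, $D$ the number of training tokens, and $E$, $A$, $B$, $\alpha$, $\beta$ are constants fitted to experiments.

```latex
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

Under this form the loss falls smoothly toward the floor $E$ as either model size or data grows, which is the sense in which returns to scale are described as predictable.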
StepFun unveiled its first Step series models in March 2024, led by Step-1, a hundred-billion-parameter large language model, alongside the Step-1V multimodal model and a preview of the much larger Step-2. [10] At the World Artificial Intelligence Conference in July 2024 it released Step-2, a mixture-of-experts model that it described as the first trillion-parameter language model built by a Chinese company, together with the Step-1.5V multimodal model and the Step-1X image generator. [3][8] By November 2024 Step-2 ranked first among Chinese models and fifth globally on the LiveBench benchmark. [8]
The company then expanded across modalities and into reasoning. In February 2025 it open-sourced the Step-Video-T2V text-to-video model and the Step-Audio speech model, and in April 2025 it added Step-R1-V-Mini, a multimodal reasoning model. [3] Its flagship Step-3, shown at the 2025 World Artificial Intelligence Conference in July, is a 321-billion-parameter mixture-of-experts model with about 38 billion active parameters that introduced efficiency techniques the company called Multi-Matrix Factorization Attention and Attention-FFN Disaggregation, both intended to cut inference cost and to run well on domestic chips. [3] By early 2026 StepFun said it had released 29 models, a growing number of them published as open weights; the Apache 2.0-licensed Step-3.7-Flash followed in May 2026. [3][6]
| Model | Released | Notes |
|---|---|---|
| Step-1 / Step-1V | March 2024 | First LLM (about 100B parameters) and multimodal model |
| Step-2 | July 2024 | Trillion-parameter MoE; described as first by a Chinese company |
| Step-1.5V / Step-1X | July 2024 | Multimodal model and image generator |
| Step-Video-T2V / Step-Audio | February 2025 | Open-sourced text-to-video and speech models |
| Step-R1-V-Mini | April 2025 | Multimodal reasoning model |
| Step-3 | July 2025 | 321B-parameter MoE (38B active); MFA and AFD techniques |
| Step-3.5-Flash | February 2026 | 196B parameters, 11B active |
| Step-3.7-Flash | May 2026 | Apache 2.0 open weights |
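The total and active parameter counts listed for Step-3 and Step-3.5-Flash can be turned into an active share, which is a rough indicator of how sparse each model is at inference time. The short sketch below does only that arithmetic; the function name `active_fraction` and the `MODELS` dictionary are hypothetical names chosen for illustration, and the figures are the approximate ones reported in the table.

```python
# Approximate parameter counts in billions, copied from the model table.
# (total parameters, active parameters per token)
MODELS = {
    "Step-3": (321, 38),
    "Step-3.5-Flash": (196, 11),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of a model's parameters that is active per token."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


if __name__ == "__main__":
    for name, (total, active) in MODELS.items():
        share = active_fraction(total, active)
        print(f"{name}: {share:.1%} of parameters active")
```

By these figures roughly 12 percent of Step-3's parameters and under 6 percent of Step-3.5-Flash's are active for any given token.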
StepFun raised capital quickly with backing from Tencent and prominent venture and state-backed investors. Across early rounds in 2023 and 2024 it took in hundreds of millions of dollars from Tencent, Qiming Venture Partners, and 5Y Capital, reaching a reported valuation of about 2 billion dollars. [7][9] A Series B round announced in December 2024, which drew Shanghai state-owned capital alongside Tencent, Qiming, and 5Y Capital, confirmed its unicorn status. [7] In January 2026 the company disclosed a Series B+ round of more than 5 billion yuan (about 717 million dollars), led by Shanghai State-owned Capital Investment, with Tencent increasing its stake and new investors including China Life Insurance, Shanghai Pudong Venture Capital, and Huaqin Technology. [6] That single raise was larger than the Hong Kong listing proceeds of fellow AI tigers Zhipu AI and MiniMax earlier in the same month, and reports in 2026 said StepFun was itself weighing a Hong Kong Stock Exchange listing. [3][6]
Rather than chase a single consumer app, StepFun has pushed its models onto devices and into partner products. It open-sourced its Step-Video and Step-Audio models in February 2025 in cooperation with the automaker Geely, agreed to a data-sharing arrangement with the robotics firm AgiBot in March 2025, and deployed on-device agents with smartphone makers including Honor, Oppo, and ZTE, reaching millions of devices. [3][6] StepFun is constrained by United States restrictions on advanced Nvidia chips, limits that Jiang has publicly called "manageable," and in July 2025 it helped launch a Model-Chip Ecosystem Innovation Alliance with domestic chipmakers Huawei, Biren, Moore Threads, and Enflame to co-design models and silicon. [2][3] As of 2026 Jiang remains chief executive of StepFun, one of the more research-driven of China's large model companies and a leading proponent of the view that scaling multimodal systems is the most direct route to artificial general intelligence. [1][6]