Liang Wenfeng (Chinese: 梁文锋; born 1985) is a Chinese entrepreneur, computer scientist, and billionaire who founded DeepSeek, one of the world's most influential artificial intelligence companies. He is also the co-founder of High-Flyer (Huanfang Quantitative), a Chinese quantitative hedge fund that grew to manage over 100 billion yuan (approximately $14 billion) in assets by 2021. In January 2025, DeepSeek's release of its R1 reasoning model sent shockwaves through the global technology sector, erasing nearly $600 billion from Nvidia's market capitalization in a single day and prompting comparisons to the Sputnik moment of 1957.
Liang is known for his low-profile personality, his conviction that artificial general intelligence (AGI) requires long-term foundational research rather than short-term commercial pursuits, and his commitment to open-source AI development. As of 2025, Forbes estimated his net worth at approximately $11.5 billion.
Liang Wenfeng was born in 1985 in the village of Mililing (米历岭村), Tanba town (覃巴镇), in the city of Wuchuan (吴川市), which falls under the administrative area of Zhanjiang in Guangdong province, China. His parents were both primary school teachers at a local village school. Growing up in a modest rural household, Liang developed a strong interest in mathematics from an early age. Classmates and teachers from his middle school years at Wuchuan No. 1 Middle School recall him as a top student who was also fond of reading comic books. He reportedly taught himself high school and university-level mathematics courses while still in secondary school, displaying the kind of intellectual curiosity that would later define his career.
Liang earned the highest score on the college entrance examination (gaokao) in the Zhanjiang region of Guangdong. At the age of 17, he was admitted to Zhejiang University, one of China's most prestigious universities, located in Hangzhou. He earned a Bachelor of Engineering degree in Electronic Information Engineering in 2007. Choosing to remain at Zhejiang University for graduate studies, he completed a Master of Engineering degree in Information and Communication Engineering in 2010. His master's thesis, titled "Study on Object Tracking Algorithm Based on Low-Cost PTZ Camera," focused on developing target-tracking algorithms for affordable pan-tilt-zoom surveillance camera systems. The work applied computer vision techniques to a practical engineering challenge, reflecting his early orientation toward using computational methods to solve real-world problems. His time at Zhejiang University also proved important for building the personal relationships that would later form the core of his business ventures.
During his time at Zhejiang University, Liang and several classmates became interested in algorithmic trading. The 2008 global financial crisis, which they witnessed as graduate students, helped shape their ideas about using machine learning and statistical models to predict market behavior. The turmoil in global markets presented both a cautionary tale and an opportunity: traditional investment strategies had failed many firms, while quantitative approaches showed promise for those with the technical skills to build them. After completing his master's degree in 2010, Liang began his career by applying artificial intelligence techniques to financial markets, exploring how quantitative models could generate consistent investment returns independent of human bias.
In February 2016, Liang Wenfeng and two of his classmates from Zhejiang University formally incorporated Ningbo High-Flyer Quantitative Investment Management Partnership (Limited Partnership), doing business as High-Flyer. The company, also known by its Chinese name Huanfang (幻方), is headquartered in Hangzhou, Zhejiang. High-Flyer specialized in AI-driven quantitative trading strategies, using neural networks and machine learning algorithms to analyze market data and make investment decisions.
High-Flyer's approach was distinctive in the Chinese hedge fund landscape. Rather than relying on traditional fundamental or technical analysis, the firm built its entire investment strategy around AI models that could process vast amounts of market data and identify patterns invisible to human analysts. This approach proved highly successful, and the fund grew rapidly throughout the late 2010s.
| Year | Milestone |
|---|---|
| 2016 | High-Flyer incorporated in Ningbo; begins AI-driven quantitative trading |
| 2019 | Assets under management surpass 10 billion yuan; High-Flyer AI research division established |
| 2020 | Fire-Flyer I supercomputer deployed with 1,100 Nvidia A100 GPUs (cost: 200 million yuan) |
| 2021 | AUM peaks at approximately 100 billion yuan (~$14 billion); Fire-Flyer II deployed with 10,000 Nvidia A100 GPUs (cost: 1 billion yuan) |
| 2023 | High-Flyer pivots strategic focus toward fundamental AI research; DeepSeek spun out as a separate entity |
| 2025 | High-Flyer funds deliver average returns of approximately 57% for the year |
By 2019, High-Flyer had accumulated over 10 billion yuan in assets under management and established a dedicated AI research division called High-Flyer AI, which focused on AI algorithms and their foundational applications. That same year, the firm began investing heavily in computing infrastructure, recognizing that large-scale GPU clusters would be essential for both quantitative trading and broader AI research. The investment in GPUs reflected Liang's growing conviction that artificial intelligence would require enormous computing resources and that early movers who secured hardware would have lasting advantages.
One of Liang's most consequential decisions was his early and aggressive investment in GPU computing hardware. In 2020, High-Flyer deployed its first supercomputer, Fire-Flyer I, which contained 1,100 Nvidia A100 GPUs and cost approximately 200 million yuan (around $28 million). This cluster was used to power AI-driven research for the firm's quantitative trading operations.
In 2021, Fire-Flyer I was retired and replaced by the significantly larger Fire-Flyer II, which housed approximately 10,000 Nvidia A100 GPUs across roughly 1,250 GPU compute nodes and nearly 200 storage servers. The cluster cost approximately 1 billion yuan (around $140 million) and featured 200 Gbps high-bandwidth connectivity. A research paper published in 2024 at the SC24 supercomputing conference described the Fire-Flyer II system's design, noting that it achieved performance approximating Nvidia's own DGX-A100 reference architecture while reducing costs by half and energy consumption by 40%. The paper detailed a custom software-hardware co-design approach that allowed High-Flyer to extract maximum performance from PCIe-connected A100 GPUs rather than the more expensive SXM form factor typically used in data centers.
The timing of these GPU purchases proved critical. In October 2022, the United States government imposed export controls that restricted the sale of advanced AI chips, including the Nvidia A100 and H100, to China. Because High-Flyer had completed its procurement and deployment of 10,000 A100 GPUs well before these restrictions took effect, the firm possessed one of the largest private GPU clusters in China. This computing infrastructure would later become the foundation for DeepSeek's AI research.
In May 2023, Liang Wenfeng announced that High-Flyer would pursue the development of artificial general intelligence and formally launched DeepSeek (officially registered as Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd.) as a separate entity. The company is headquartered in Hangzhou, Zhejiang, and Liang serves as its CEO.
DeepSeek's founding reflected a significant strategic shift. While High-Flyer had used AI primarily as a tool for generating investment returns, DeepSeek was established with the explicit mission of conducting fundamental AI research aimed at achieving AGI. Liang funded DeepSeek through High-Flyer's profits, noting that venture capital firms were reluctant to invest because the company was unlikely to generate a quick financial exit. This self-funding model gave DeepSeek unusual independence, freeing it from the short-term performance pressures that typically accompany venture capital investment.
The company recruited a team of researchers, many of them recent graduates from top Chinese universities. The V3 technical report, published in December 2024, credited approximately 150 researchers and engineers, along with a 31-person data automation team, for a total of roughly 181 people. Reports from various sources have described the overall team as approximately 200 researchers, though the company's exact headcount has not been publicly disclosed. This relatively small team stood in sharp contrast to the thousands of employees at Western AI labs such as OpenAI, Google DeepMind, and Anthropic.
Under Liang's leadership, DeepSeek released a series of increasingly capable AI models at a rapid pace, maintaining an approximately seven-month cadence between major releases.
| Model | Release Date | Key Details |
|---|---|---|
| DeepSeek Coder | November 2, 2023 | First model release; code-focused; MIT License |
| DeepSeek-LLM (67B) | November 29, 2023 | General-purpose large language model; 67 billion parameters |
| DeepSeek-V2 | May 2024 | Mixture of Experts architecture; reduced training costs |
| DeepSeek-Coder-V2 | July 2024 | 236 billion parameters; 128K context window |
| DeepSeek-V3 | December 26, 2024 | 671B parameters (37B active per token); trained on 14.8 trillion tokens; 2,048 H800 GPUs; ~$5.6 million training cost |
| DeepSeek-R1 | January 20, 2025 | Reasoning model rivaling OpenAI o1; MIT License; open weights |
| DeepSeek-V3-0324 | March 2025 | Improved post-training with reinforcement learning techniques from R1 |
| DeepSeek-R1-0528 | May 2025 | Upgraded reasoning capabilities with more compute and advanced post-training |
DeepSeek-V3, released on December 26, 2024, was a landmark model that attracted worldwide attention. It is a large Mixture of Experts (MoE) model containing 671 billion total parameters, of which 37 billion are activated for any given input token. The model was pre-trained on 14.8 trillion tokens of diverse, high-quality data.
What made DeepSeek-V3 remarkable was its training efficiency. According to the technical report, the model was trained on a cluster of 2,048 Nvidia H800 GPUs (a variant of the H100 designed for the Chinese market to comply with U.S. export restrictions). The pre-training stage was completed in less than two months, consuming 2,664,000 GPU hours. Including context length extension (119,000 GPU hours) and post-training (5,000 GPU hours), the total training cost came to approximately 2,788,000 GPU hours. At an estimated rental price of $2 per GPU hour, this translates to approximately $5.576 million.
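The headline cost figure follows directly from the GPU-hour breakdown in the technical report. A quick arithmetic check, using the report's own assumed rental rate of $2 per GPU hour:

```python
# Reproduce the DeepSeek-V3 training-cost estimate from the
# GPU-hour figures reported in the technical report.
pre_training  = 2_664_000   # GPU hours, pre-training stage
context_ext   = 119_000     # GPU hours, context length extension
post_training = 5_000       # GPU hours, post-training

total_hours = pre_training + context_ext + post_training
rate_usd = 2.0              # assumed H800 rental price per GPU hour

cost = total_hours * rate_usd
print(total_hours)          # 2788000
print(cost)                 # 5576000.0  -> ~$5.576 million
```

The exercise also makes clear what the figure measures: it is a rental-price equivalent for the compute consumed by one training run, not a statement of DeepSeek's actual expenditures.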
This figure stood in stark contrast to the training costs reported for comparable Western models. OpenAI's GPT-4 was estimated to have cost over $100 million to train, and Meta's Llama 3 reportedly cost around $60 million. DeepSeek-V3 achieved competitive benchmark performance at a fraction of these costs through several technical innovations, including mixed-precision training using FP8 instead of BF16 and efficient MoE routing strategies.
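The sparse-activation idea behind the MoE design can be illustrated with a toy top-k router. This is a generic sketch, not DeepSeek's actual routing code (DeepSeek-V3 uses its own auxiliary-loss-free load-balancing scheme, and the expert count and k here are illustrative): a gating network scores every expert, but only the k highest-scoring experts run for each token, so most of the model's parameters stay idle on any given forward pass.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of gate logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights.

    Only these k experts' parameters are used for the token, which is
    how a model with 671B total parameters can activate only ~37B
    per token.
    """
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

# Toy example: 8 experts, route one token to the top 2.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]
routes = top_k_route(logits, k=2)
# experts 1 and 4 are selected; their combine weights sum to 1
```

The selected experts' outputs would then be combined using the returned weights; every expert not in the top k contributes nothing, which is where the compute savings come from.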
It is worth noting that the $5.6 million figure covers only the final training run of DeepSeek-V3 and does not include costs associated with earlier research, architectural experiments, ablation studies, or the acquisition of GPU hardware.
On January 20, 2025, DeepSeek released DeepSeek-R1, a reasoning model built on the V3 base. R1 was the first open-source model to match the performance of OpenAI's o1 across a range of core reasoning tasks. On the American Invitational Mathematics Examination (AIME) benchmark, R1 achieved approximately 79.8% pass@1 accuracy. On the MATH-500 dataset, it scored approximately 97.3% pass@1. In coding benchmarks, the model achieved a 2,029 Elo rating on Codeforces-style programming challenges.
R1 was released under the permissive MIT License, meaning that anyone could download, modify, and deploy the model freely. This decision reflected Liang's philosophical commitment to open-source development. The DeepSeek chatbot app, powered by R1, quickly rose to the top of Apple's App Store charts in the United States and more than 140 other markets. According to data from Sensor Tower, the app was downloaded 16 million times in its first 18 days, surpassing the 9 million downloads that OpenAI's ChatGPT app recorded in the same timeframe after its launch.
In late February 2025, DeepSeek held its inaugural Open Source Week, releasing five infrastructure tools over five consecutive days. These included FlashMLA (an efficient Multi-head Latent Attention decoding kernel optimized for Hopper GPUs), DeepEP (a communication library for Mixture of Experts models), DeepGEMM (an FP8 matrix multiplication kernel), DualPipe (a bidirectional pipeline parallelism optimizer), and the Fire-Flyer File System (a distributed storage system achieving 6.6 TiB/s throughput). The releases demonstrated that DeepSeek's innovations extended beyond model architecture into the full AI infrastructure stack, from GPU kernels to distributed file systems.
The combined impact of DeepSeek-V3 (December 2024) and DeepSeek-R1 (January 2025) triggered one of the most dramatic market disruptions in the history of the technology sector.
On Monday, January 27, 2025, one week after R1's release, a massive sell-off swept through U.S. technology stocks. Nvidia shares plummeted approximately 17%, erasing roughly $589 billion in market capitalization in a single trading day. This represented the largest single-day loss for any company in U.S. stock market history. Other semiconductor and technology companies were hit as well: Broadcom fell over 17%, Micron dropped nearly 12%, and Advanced Micro Devices declined more than 6%.
The sell-off was driven by a fundamental reassessment of the economics underlying the AI industry. If a Chinese startup could train a frontier-class model for approximately $5.6 million using export-restricted chips, investors questioned whether the massive capital expenditures planned by U.S. hyperscalers and chip companies were truly necessary. The event challenged the prevailing narrative that achieving cutting-edge AI performance required billions of dollars in compute spending.
Venture capitalist Marc Andreessen declared on social media that "Deepseek R1 is AI's Sputnik moment," drawing a parallel to the Soviet Union's 1957 satellite launch that shocked the United States and reshaped the Cold War space race. He also called the breakthrough a "profound gift to the world." Nvidia itself responded with a measured statement calling R1 "an excellent AI advancement."
Notably, the affected companies eventually recovered. By October 2025, Nvidia became the first company to reach a $5 trillion market valuation. Broadcom's shares rose 49% over the course of 2025, and ASML's stock increased 36%. The recovery suggested that investors ultimately concluded that AI demand would remain strong, even if the economics of model training were changing.
DeepSeek's achievements are closely tied to the broader context of U.S.-China technology competition and semiconductor export controls.
On October 7, 2022, the U.S. Department of Commerce's Bureau of Industry and Security imposed sweeping export controls on advanced semiconductors destined for China. These restrictions targeted chips exceeding certain performance thresholds in processing power and interconnect speed, effectively barring the export of Nvidia's A100 and H100 GPUs to Chinese entities.
In response, Nvidia developed the H800, a modified version of the H100 with reduced interconnect bandwidth, specifically designed to comply with the export rules while still being sellable in China. DeepSeek used 2,048 H800 GPUs to train its V3 model. However, the U.S. government further tightened restrictions in October 2023, banning the H800 as well; Nvidia subsequently introduced the further cut-down H20 as its most capable chip still permitted for the Chinese market.
The semiconductor consulting firm SemiAnalysis, citing anonymous industry sources, estimated that DeepSeek (through High-Flyer) possessed approximately 50,000 Hopper-generation GPUs, including at least 10,000 H100s, 10,000 H800s, and 30,000 H20s, in addition to roughly 10,000 older Ampere-generation A100s. While the A100s were purchased before the October 2022 export ban, the reported presence of H100 chips (which were never legally exportable to China) has raised questions. Nvidia has called smuggling allegations "far-fetched," and no definitive public evidence has confirmed how or whether DeepSeek obtained H100 chips.
Regardless of the exact hardware composition, DeepSeek's ability to achieve frontier-level performance under hardware constraints demonstrated that efficient algorithms and software optimization could partially compensate for limited access to the most advanced chips. This finding had significant implications for U.S. export control policy, suggesting that chip restrictions alone might not be sufficient to maintain a decisive American lead in AI capabilities.
Liang Wenfeng's approach to AI development differs sharply from the venture-capital-driven model prevalent in Silicon Valley. He has repeatedly stated that DeepSeek's goal is not short-term profitability but rather the long-term pursuit of AGI. In a 2024 interview with 36Kr's Anyong sub-brand (one of only two interviews he has granted), Liang explained his rationale:
"If you must find a commercial reason, it might not exist because it's not worth it. From a business perspective, basic research has a very low return on investment."
He argued that for decades, Chinese technology companies had been accustomed to leveraging innovations developed elsewhere and monetizing them through applications. In his view, this approach was unsustainable. He stated that China must transition "from being a beneficiary to a contributor, rather than continuing to ride on coattails." He noted that Chinese companies "barely participated in core tech innovation" over 30 years of the IT revolution and that DeepSeek aimed to change that pattern.
Liang has also been candid about the gap between Chinese and American AI capabilities. He acknowledged that China's best models "require twice the compute power" to match top global models, attributing this to structural gaps in model architecture and training efficiency rather than simply hardware access. He has argued that the most enduringly profitable companies in the United States are technology giants built on long-term research and development, not short-term profit optimization.
A defining feature of Liang's leadership is his commitment to releasing DeepSeek's models as open source. He has argued that in disruptive technologies, closed-source approaches serve only to delay progress temporarily. By releasing models under the MIT License, DeepSeek allows researchers, developers, and companies worldwide to build on its work. This philosophy also serves a strategic purpose: open-source releases generate goodwill, attract talent, and establish DeepSeek as a global leader in AI research.
Liang has stated that "both AI and API services should be affordable and accessible to everyone," and that DeepSeek's pricing philosophy aims "neither to sell at a loss nor to seek excessive profits." The company's API pricing has consistently undercut Western competitors, further reinforcing its position as a cost-effective alternative.
DeepSeek's internal culture reflects Liang's research-oriented mindset. He has described the organization as "entirely bottom-up," noting that the company generally does not predefine roles and instead allows the division of labor to emerge organically. He has emphasized that "innovation requires minimal intervention and management. It needs space to experiment and the freedom to make mistakes. True innovation often emerges spontaneously; it cannot be forced or planned."
Many of DeepSeek's team members reportedly have unconventional academic backgrounds, and Liang has noted that "their desire to do research often comes before making money." This culture has drawn comparisons to academic research laboratories rather than typical technology startups. The flat organizational structure and research-first ethos have been credited as key factors in DeepSeek's ability to produce innovative work with a fraction of the headcount found at larger AI labs.
Despite leading one of the most consequential AI companies in the world, Liang Wenfeng has maintained a notably low public profile. Between 2023 and early 2025, he granted interviews only to Anyong (a sub-brand of China's 36Kr technology media outlet). He has rarely appeared at industry conferences or given public speeches. This reticence stands in contrast to the high visibility of Western AI leaders such as Sam Altman of OpenAI or Elon Musk of xAI. Colleagues describe him as someone who prefers to let the technical work speak for itself rather than seeking media attention.
On January 20, 2025, the same day that DeepSeek released its R1 model, Liang Wenfeng attended a symposium in Beijing hosted by Chinese Premier Li Qiang. The meeting brought together a select group of experts from the fields of technology, education, science, culture, health, and sports to provide opinions and suggestions on a draft government work report that the Premier would later deliver to the National People's Congress in March. Premier Li called on the economy's "new growth drivers" created through scientific and technological innovation to help "secure and improve people's livelihoods."
Liang's invitation to this high-level government meeting underscored DeepSeek's rising prominence within China's national technology strategy. It positioned the company at the heart of the Chinese government's vision for an economic recovery driven by high-tech innovation, particularly in the context of intensifying U.S.-China competition over AI supremacy. The timing of the invitation, coinciding with R1's release, was widely interpreted as a signal of Beijing's endorsement of DeepSeek's approach to AI development.
Following the global attention generated by DeepSeek's R1 release, Liang Wenfeng's hometown of Wuchuan in Guangdong province experienced an unexpected surge in tourism. Local media reported that the government of his home village renovated roads and public facilities in response to the influx of visitors. When Liang returned to his hometown for the Spring Festival (Chinese New Year) in early 2025, he received what media described as a "hero's welcome."
Liang has been called "the pride of his hometown" by local and national media outlets. His story, rising from a rural village to global AI prominence, resonated widely across China as an example of how talent and determination can overcome modest origins. The narrative also aligned with the Chinese government's broader emphasis on self-reliance in advanced technology as a path to national strength.
Estimates of Liang Wenfeng's net worth vary significantly depending on the source and valuation methodology. Forbes estimated his net worth at approximately $11.5 billion as of 2025, based on his ownership stakes in both DeepSeek and High-Flyer. Forbes reported that Liang holds approximately 84% of DeepSeek and at least 76% of High-Flyer. The Hurun Global Rich List placed his net worth at approximately $4.6 billion for 2025. Bloomberg reported that DeepSeek's rising profile could make Liang one of the world's richest people, depending on future valuations of the company.
In 2025, Liang debuted on China's 100 richest list for the first time.
Liang Wenfeng is known for being exceptionally private about his personal life. Public information about his family, hobbies, or interests outside of AI research and quantitative finance is extremely limited. What is known about his character comes primarily from former colleagues and the two interviews he granted to 36Kr's Anyong: he is described as deeply curious, intellectually driven, and more interested in solving technical problems than in accumulating wealth or public recognition.