PixVerse

Updated Jul 16, 2026

PixVerse is an AI video generation product made by the Chinese startup AISphere (Chinese: 爱诗科技, romanized Aishi Technology or Aishi Tech). It lets users create short video clips from text prompts, still images, and existing footage, and is aimed at a broad consumer audience rather than professional editors. The company describes the product as a "Canva for video generation," meaning a tool that makes video creation accessible to people without editing skills.^[1]^[2] PixVerse launched for global users in January 2024 and grew quickly on the back of viral effect templates and fast generation. AISphere also runs a domestic Chinese version branded PaiWo AI (拍我AI). In March 2026 the company raised a Series C of about 300 million US dollars, which lifted its valuation past one billion dollars and was described as the largest single funding round recorded in Asia's AI video sector at the time.^[3]^[4]

AISphere and Wang Changhu

AISphere was founded in April 2023 by Wang Changhu (王长虎), who serves as chief executive. Wang holds a degree from the University of Science and Technology of China and was a senior researcher at Microsoft Research Asia before joining ByteDance, where he led the visual technology team and worked in the company's AI lab.^[1]^[5] His background in computer vision shaped AISphere's focus on building its own video generation models rather than reselling third-party systems. Jaden Xie is named as a co-founder of the company.^[2] The company is based in Beijing and in 2026 opened an international office in Singapore to support overseas expansion.^[4]^[6]

Wang has publicly argued that video generation has commercial potential on the order of large language models, a view he restated at the 2025 Beijing Zhiyuan Conference.^[7] AISphere positions itself as a consumer product company that also develops foundation models, with PixVerse serving the global market and PaiWo AI serving China.

Product and versions

PixVerse generates short clips, typically a few seconds per shot, from text prompts (a capability often called text to video) or from an uploaded image (image to video). The underlying models are built on a Diffusion Transformer (DiT) architecture developed in house, with multimodal feature fusion used to improve how prompts map to motion and scene structure.^[8]^[7]

The product has iterated rapidly. The original web version reached global users in January 2024, and AISphere has shipped numbered model upgrades roughly every few months since.

| Version or release | Date | Notable additions |
| --- | --- | --- |
| Global launch | January 2024 | Public web release, text to video and image to video^[5]^[1] |
| V2 | July 25, 2024 | DiT architecture, single clip up to 8 seconds, multi-clip up to 40 seconds, consistency across segments^[8] |
| V3 | October 2024 | Quality upgrade; effect templates including the viral "Venom" transformation^[5]^[2] |
| Mobile app | December 2024 | iOS and Android apps; 10-second clips^[9] |
| V4 | February 2025 | Faster generation (high-quality clips in about 5 seconds), added sound effects and voice^[10] |
| V4.5 | May 15, 2025 | Cinematic camera controls and a fusion system^[9] |
| V5 | August 27, 2025 | Agent creation assistant; ranked first globally for image to video on Artificial Analysis^[7]^[11] |
| V5.5 | December 1-2, 2025 | One-click multi-shot storyboarding with native audio (effects, dialogue, music)^[12]^[11] |
| V5.6 | January 2026 | Improved visual stability, motion, and audio-visual alignment^[13] |
| R1 | January 14, 2026 | Real-time interactive video generation up to 1080p^[14] |

Reported version dates vary slightly between English and Chinese coverage, and some intermediate builds (for example a V3.5 beta in late 2024) were rolled out without a separate model release. The table lists the milestones that are confirmed across more than one source.

In January 2026 AISphere announced PixVerse R1, which it calls a real-time video generation model. R1 produces video up to 1080p that responds to user input as it plays, so a viewer can steer the direction of a clip while it is being generated rather than only editing a finished file. The company built R1 around an autoregressive framework for long sequences and an inference engine that cuts sampling to a few steps, and it released the model with API access for enterprise partners.^[14]

Features

PixVerse supports text to video and image to video as its core modes. Around these, the product has added a library of effect templates and editing tools that handle most of the work so users only supply a prompt or a photo. Generation is fast by the standards of the category: V4 and later can return short clips in roughly five seconds, and the system can produce a 1080p clip in about a minute.^[7]

Later versions added native audio, so a clip can include sound effects, dialogue, voice, and background music generated together with the picture rather than added afterward. V5.5 introduced one-click multi-shot storyboarding, where the model lays out several connected shots and the matching audio from a single prompt, with camera moves and transitions designed automatically.^[12] The V5 update added an "Agent" creation assistant that reads an uploaded image, infers what it shows, and assembles a 5 to 30 second clip without manual prompt engineering.^[7] Other editing features include character or scene replacement, remixing of existing clips, and frame-based modification.^[12]

China version (PaiWo AI)

PaiWo AI (拍我AI) is the domestic Chinese version of PixVerse, run by AISphere for users in mainland China. It launched on June 6, 2025, with both a web app and mobile apps, after the company decided the English name PixVerse was awkward for Chinese users and rebranded the local product.^[15] PaiWo AI shares the same in-house Diffusion Transformer (DiT) technology as PixVerse and tracks the same model versions, including the V5 and V5.5 releases. It supports text to video, image to video, effect templates, first-and-last-frame transitions, cinematic camera control, and multi-character narrative generation.^[15]^[12] Splitting the product into separate global and domestic apps follows a common pattern among Chinese AI video firms, which must meet domestic content and regulatory requirements for the China market.

Funding and traction

AISphere has raised money in a rapid series of rounds. Early backing came from Ant Group, which led an A2 round of more than 100 million yuan (about 13.8 million US dollars) reported in April 2024.^[16] By December 2024 the company disclosed that A2 through A4 rounds together totaled nearly 300 million yuan (reported as roughly 41 to 43 million US dollars, depending on the outlet), with investors including Ant Group, the Beijing Artificial Intelligence Industry Investment Fund, CAS Investment, and Lighthouse Capital.^[17]^[5] A later A5 round accompanied PixVerse passing 15 million monthly active users and the plan to launch domestically.^[18]

In September 2025 AISphere raised a Series B of more than 60 million US dollars led by Alibaba, with participation from Antler and the Beijing Artificial Intelligence Industry Investment Fund. It was the largest single round raised by a Chinese AI video company up to that point.^[1] A B+ round of 100 million yuan followed in October 2025, backed by Fosun RZ Capital, Tongchuang Weiye, and Shunxi Fund.^[6]

The Series C was announced on March 12, 2026, at about 300 million US dollars and lifted the company past a one billion dollar valuation, making it a unicorn. CDH Investments (鼎晖) led the round through several of its funds, with a syndicate of close to 20 institutions across Asia and beyond, including China Ruyi, 37 Interactive Entertainment, Yizhuang state capital, Guotai Junan Innovation Investment, Fosun RZ Capital, UOB Venture Management, OCBC's Lion X fund, 3W Fund, Antler, EnvisionX Capital, and iGlobe Partners. The round brought cumulative funding above 400 million US dollars and was described as the largest single AI video financing recorded in Asia.^[3]^[4]^[19]

On traction, AISphere reported that PixVerse surpassed 100 million users across about 175 countries, with 16 million monthly active users and annual recurring revenue above 40 million US dollars, after starting commercialization in November 2024.^[6]^[3] The company says its models generated over 2 billion videos by early 2026.^[2] Much of the early growth came from viral effect templates: a "Venom" transformation that turned a user's photo into the Marvel symbiote spread widely on TikTok from November 2024 and, by AISphere's account, drew billions of cumulative views.^[2]^[1] In September 2025 PixVerse appeared on Andreessen Horowitz's ranking of the top 50 consumer AI mobile apps, at number 25.^[6]

Competition

PixVerse operates in a crowded AI video market. Its closest Chinese rivals include Kling from Kuaishou, Hailuo from MiniMax, Vidu from Shengshu Technology, and Jimeng from ByteDance, the company where Wang Changhu previously worked. Internationally it competes with OpenAI's Sora and Google's Veo, among others.^[2] AISphere's differentiation rests on consumer accessibility, fast generation, a deep template library, and competitive model rankings: on the third-party Artificial Analysis leaderboard, PixVerse V5 reached first place for image to video in August 2025, and later versions remained near the top of both image-to-video and text-to-video charts.^[7]^[3] One complication for the company is that Alibaba, which led its Series B, also develops its own video models, so one of its largest backers competes with it in the same market.^[20]

References

1. Alibaba leads US$60 million investment in AI video generation start-up AIsphere - South China Morning Post
2. PixVerse Joins the Ranks of Global AI Unicorns with Asia's Largest Funding Round in AI Video Generation - PixVerse
3. 爱诗科技完成3亿美元C轮融资，鼎晖领投 - QbitAI (量子位)
4. AI video generation platform PixVerse raises Series C funding, opens global office in Singapore - TNGlobal
5. Chinese AI Video Company AIsphere Secures Nearly $43 Million in A+ Financing Round - TMTPost
6. 爱诗科技完成B+轮1亿元融资，ARR突破4000万美金 - 36Kr
7. 全球图生视频榜单第一，爱诗科技PixVerse V5如何改变一亿用户的视频创作 - Tencent News (腾讯新闻)
8. Aisi Technology's AIsphere Launches Video Generation Product PixVerse V2 - AIBase
9. PixVerse Latest Launches - Product Hunt
10. PixVerse Launches V4.0 Update: Synchronization of Sound Effects and Repainting Features - AIBase
11. PixVerse V5.5 AI Video Generator - Runware
12. PixVerse（拍我AI）V5.5发布：国内首款分镜+音频一键生成AI视频大模型 - QbitAI (量子位)
13. What Is PixVerse V5.6? AI Video Generation with End Frame Control - MindStudio
14. PixVerse Releases World's First Real-Time World Model for Interactive Video - TMTPost
15. 拍我AI：PixVerse国内版，爱诗科技推出的AI视频生成平台 - Hello123
16. Chinese text-to-video startup AIsphere receives $13.8 million funding from Ant Group - TechNode
17. PixVerse Owner AISphere Bags Almost USD41 Million in Latest Fundraiser - Yicai Global
18. 独家丨爱诗科技完成A5轮融资，PixVerse月活突破1500万并将在国内上线 - Zhihu
19. Alibaba-backed PixVerse joins unicorn rank after Series C funding - DealStreetAsia
20. Alibaba is Bankrolling China's AI Video Race, Then Racing Against It - Recode China AI
