Sora
Last reviewed
May 9, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v6 · 6,581 words
Sora was a generative AI video model developed by OpenAI that created videos from text prompts, images, and existing video clips. First previewed as a research demonstration on February 15, 2024, it launched publicly on December 9, 2024, as part of OpenAI's "12 Days of Shipmas" event. A major successor, Sora 2, followed on September 30, 2025, with native audio generation and a dedicated social iOS app. The model used a diffusion transformer architecture that operated on "spacetime patches" of video latent codes, enabling it to generate videos of varying resolutions, durations, and aspect ratios. OpenAI announced on March 24, 2026, that it was discontinuing the Sora app; the web and mobile experiences shut down on April 26, 2026, roughly seven months after the app's launch, with the developer API scheduled to end on September 24, 2026 [27][28].
Sora was developed by OpenAI's World Simulation Team, headed by Aditya Ramesh, with Tim Brooks and Bill Peebles as co-leads. Brooks departed for Google DeepMind in October 2024, and Peebles left OpenAI in April 2026 shortly after the shutdown announcement [29][30]. The model's hyperreal output prompted reactions ranging from Tyler Perry pausing an $800 million Atlanta studio expansion to a wave of public concern about deepfakes, copyright, and the future of professional video work.
On February 15, 2024, OpenAI published a technical report titled "Video generation models as world simulators" alongside several sample clips. These included an SUV driving down a mountain road, an animated "short fluffy monster" standing next to a candle, two people walking through snowy Tokyo, and synthetic historical footage of the California gold rush [1]. At this stage, Sora was not available to the general public. OpenAI granted access to a small group of red teamers and visual artists in over 60 countries to test the system, identify safety weaknesses, and provide creative feedback [2].
OpenAI later described the February 2024 preview as the "GPT-1 moment" for video, the first time that behaviors like object permanence seemed to emerge naturally from scaling up pre-training compute for video generation [3]. OpenAI's largest model at the time could generate up to one minute of high-fidelity video at 1080p resolution. Sora's announcement coincided with Google's Gemini 1.5 reveal earlier the same day, leading some commentators to describe February 15 as one of the most consequential single days in generative AI history.
In the weeks following the announcement, OpenAI CTO Mira Murati gave a Wall Street Journal interview in which she struggled to answer basic questions about Sora's training data. Asked by reporter Joanna Stern whether the model had been trained on YouTube, Instagram, or Facebook video, Murati said she was "not sure," before falling back on the phrase "publicly available and licensed data" [31]. The interview drew widespread criticism and intensified scrutiny of how OpenAI sourced training video, an issue that would resurface throughout Sora's life.
OpenAI released Sora to the public on December 9, 2024, during its "12 Days of OpenAI" (informally called "12 Days of Shipmas") event, a 12-day live-stream series that ran from December 5 onward, unveiling a new product or feature each weekday [4]. The version launched publicly was called Sora Turbo, a significantly faster and more capable iteration of the model shown in February.
Sora Turbo brought several improvements over the research preview:
At launch, Sora was available to ChatGPT Plus and Pro subscribers in most regions where ChatGPT operated. The United Kingdom, Switzerland, and the European Economic Area were excluded from the initial rollout because of regulatory uncertainty around the EU AI Act [6].
Sora Turbo was hosted on a dedicated web property at sora.com rather than inside the ChatGPT interface. The Plus tier capped output at 720p and 5 seconds per clip, while the Pro tier supported 1080p and clips up to 20 seconds. Both tiers could combine clips using the storyboard tool to produce longer composed sequences.
OpenAI announced Sora 2 on September 30, 2025, alongside a dedicated iOS app and plans for an Android version (which arrived about five weeks later, on November 5, 2025) [3][32]. CEO Sam Altman called it a "ChatGPT for creativity" moment in his launch post on X. Sora 2 represented a large step forward in several areas:
The Sora app also functioned as a TikTok-style social platform, with a vertical feed of 10-second clips, like and comment buttons, remix tools, and user profiles. It was rolled out invite-only at launch, with each existing user receiving codes to share with friends [34]. Within five days the app had crossed one million downloads, faster than ChatGPT had reached the same milestone, and reached the top of Apple's US App Store free-app chart. By the end of October it had been downloaded approximately 2.7 million times on iOS [34].
On October 14, 2025, OpenAI rolled out Sora 2 Pro, an experimental higher-quality variant available to ChatGPT Pro subscribers. Sora 2 Pro generated 1080p output at up to 25 seconds in length, with sharper detail, cleaner text rendering, and more reliable audio synchronization than the standard Sora 2 [35].
As of March 13, 2026, Sora 1 was no longer available in the United States; the app opened in Sora 2 by default for US users [8]. Starting January 10, 2026, OpenAI removed free-tier access to video and image generation in Sora, restricting it to Plus and Pro subscribers only [9].
The standalone Sora app experienced a sharp drop in engagement after its initial launch excitement. App installs fell 32% in December 2025 and another 45% in January 2026, dropping to 1.2 million installs that month, while consumer spending fell 32% over the same period. On the US App Store, Sora fell out of the Top 100 free apps. Third-party measurement firms reported 30-day retention of approximately 1%, an industry-low figure that analysts attributed to the novelty wearing off and to OpenAI tightening intellectual-property restrictions [10][36].
In response, OpenAI signaled plans to integrate Sora's video generation capabilities directly into ChatGPT. The Information reported this plan on March 11, 2026, noting that the move aimed to reach a broader user base and push toward OpenAI's goal of 1 billion weekly active users [11]. Under this integration, a user could ask ChatGPT to write a script and then immediately generate a video trailer based on the output, all within the same conversation.
Several incremental features were added throughout late 2025 and early 2026:
On December 11, 2025, The Walt Disney Company announced a $1 billion equity investment in OpenAI, accompanied by a three-year licensing agreement that made Disney the first major content licensing partner on the Sora platform [23]. Under the deal, Sora users would gain access to more than 200 animated, masked, and creature characters from Disney, Pixar, Marvel, and Star Wars, including costumes, props, vehicles, and iconic environments. Available characters were to include Mickey Mouse, Minnie Mouse, Lilo, Stitch, Ariel, Belle, Cinderella, Black Panther, Darth Vader, and Yoda [24]. The agreement explicitly excluded any talent likenesses or voices.
Beyond the Sora licensing, Disney also became a major enterprise customer of OpenAI, using its APIs to build internal tools and experiences for Disney+, and deploying ChatGPT for its employees. A selection of fan-inspired Sora short-form videos became available to stream on Disney+. The character licensing on Sora and ChatGPT Images was expected to go live in early 2026 [23][24].
The partnership effectively dissolved when Sora itself was wound down. According to TechCrunch, Disney was informed of OpenAI's decision to shutter the Sora app less than an hour before the public announcement on March 24, 2026 [37].
On March 24, 2026, OpenAI announced via X that it would discontinue the Sora app. The web, iOS, and Android experiences shut down on April 26, 2026, while the developer API was scheduled to remain available until September 24, 2026 [27][28]. OpenAI urged users to download any saved generations before the data deletion deadlines.
The company did not give a single official reason in its announcement, but reporting from the Wall Street Journal, TechCrunch, CNN, and others converged on a familiar set of factors:
On April 17, 2026, less than a month after the shutdown announcement and roughly nine days before the app went dark, three senior OpenAI executives announced their departures on the same day. Sora head Bill Peebles, chief product officer Kevin Weil, and enterprise CTO Srinivas Narayanan all confirmed their exits via X [29][39]. Peebles wrote that he was "proud of all the sleepless nights before and after the launch this team endured in order to deploy the technology in a responsible way and help steer societal norms." Weil had previously led the OpenAI for Science research initiative, which OpenAI also folded into other teams during the same retrenchment [39].
Sora was a diffusion model built on a transformer backbone, a design known as a diffusion transformer or DiT. The core pipeline consisted of three stages: a video compressor (encoder), the transformer-based denoiser, and a video decompressor (decoder) [12]. The architecture extended the DiT design from William Peebles and Saining Xie's ICCV 2023 paper "Scalable Diffusion Models with Transformers," which had previously been used for image generation, into the spatiotemporal domain [40].
Sora used a spatiotemporal autoencoder trained from scratch to compress raw video into a lower-dimensional latent space. This compression reduced both spatial resolution and temporal length, meaning a one-minute video became a much shorter sequence of latent frames. The compression step was what enabled Sora to handle long-duration video generation without an unmanageable number of tokens [1].
The compressed video was then decomposed into "spacetime patches," three-dimensional chunks that spanned portions of both the spatial frame and the temporal sequence. These patches served as the equivalent of tokens in a large language model: the transformer processed them as a sequence. Because patches could be extracted from videos of any resolution, duration, or aspect ratio, the same architecture handled a wide variety of input and output formats without requiring fixed dimensions [1][12].
OpenAI's technical report drew an explicit analogy: just as text tokens represent word fragments that can be assembled into any sentence, spacetime patches represent "visual phrases" that can be assembled into any video [1].
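The patch idea can be illustrated with a small sketch. It is a toy under stated assumptions, not OpenAI's implementation: the latent is a plain nested list indexed `[t][y][x]`, and the patch size of 2 x 2 x 2, the helper names `patchify` and `make_latent`, and the choice to drop incomplete edge patches are all hypothetical. It shows only that one routine can turn latents of different durations and aspect ratios into token sequences of different lengths.

```python
from itertools import product

def patchify(latent, pt, ph, pw):
    """Split a latent video (nested lists indexed [t][y][x]) into flattened spacetime patches.

    Regions at the edges that do not fill a whole patch are dropped for simplicity.
    """
    T, H, W = len(latent), len(latent[0]), len(latent[0][0])
    patches = []
    for t0, y0, x0 in product(range(0, T - pt + 1, pt),
                              range(0, H - ph + 1, ph),
                              range(0, W - pw + 1, pw)):
        # Each patch spans pt latent frames and a ph x pw spatial window.
        patch = [latent[t][y][x]
                 for t in range(t0, t0 + pt)
                 for y in range(y0, y0 + ph)
                 for x in range(x0, x0 + pw)]
        patches.append(patch)
    return patches

def make_latent(T, H, W):
    """Build a toy latent whose value at (t, y, x) encodes its own position."""
    return [[[t * H * W + y * W + x for x in range(W)]
             for y in range(H)]
            for t in range(T)]

# Two toy "videos" with different durations and aspect ratios, same patch size.
wide = patchify(make_latent(4, 4, 8), pt=2, ph=2, pw=2)  # 2 * 2 * 4 patches
tall = patchify(make_latent(6, 8, 4), pt=2, ph=2, pw=2)  # 3 * 4 * 2 patches

print(len(wide), len(wide[0]))
print(len(tall), len(tall[0]))
```

Both calls yield patches of the same size (8 values each), so a sequence model with a fixed per-token input width could in principle consume either sequence; only the sequence length changes.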
Video generation began with a latent representation filled with random noise. Over many denoising steps, the transformer predicted and removed noise to reveal the final video. The model was trained to predict the original "clean" patches from their noisy versions, conditioned on a text prompt that had been processed by a text encoder (similar to those used in DALL-E 3). The result was decoded back into pixel space by the video decompressor [12]. The denoising procedure built on the work of Jonathan Ho, Ajay Jain, and Pieter Abbeel on "Denoising Diffusion Probabilistic Models" (NeurIPS 2020), a lineage cited explicitly in OpenAI's technical report [1].
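The loop described above can be sketched in a few lines. This is a deliberately simplified illustration, not Sora's sampler, which OpenAI has not published. The function `toy_denoiser` is a hypothetical stand-in for the trained transformer (it returns a fixed answer for one prompt), and the update rule is a plain linear interpolation toward the predicted clean latents rather than the DDPM update. The `add_noise` and `mse` helpers show the general shape of a "predict the clean patches from a noisy version" training example.

```python
import math
import random

def add_noise(clean, alpha, rng):
    """Mix clean latents with Gaussian noise for one training example.

    alpha in (0, 1]: 1.0 keeps the clean signal, values near 0 are almost pure noise.
    """
    return [math.sqrt(alpha) * c + math.sqrt(1.0 - alpha) * rng.gauss(0.0, 1.0)
            for c in clean]

def mse(pred, target):
    """Mean squared error between predicted and true clean latents."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def toy_denoiser(noisy, step, prompt):
    """HYPOTHETICAL stand-in for the trained network: a fixed answer per prompt."""
    known = {"a calm sea": [0.0, 0.5, 1.0, 0.5]}
    return known[prompt]

def sample(prompt, num_steps, length, seed=0):
    """Start from pure noise and move toward the predicted clean latents step by step."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(length)]
    for step in range(num_steps, 0, -1):
        x0_hat = toy_denoiser(x, step, prompt)
        w = 1.0 / step  # at the final step (step == 1) the sample equals the prediction
        x = [(1.0 - w) * xi + w * x0i for xi, x0i in zip(x, x0_hat)]
    return x

latent = sample("a calm sea", num_steps=10, length=4)
print(latent)
```

In a real system the denoiser's prediction would change from step to step as the input becomes cleaner, and the resulting latents would then be passed to the decoder to produce pixels.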
Like DALL-E 3, Sora used an internal video captioning model to generate detailed text descriptions for each training clip. These long, dense captions, much richer than the alt text or human-written descriptions found on the open web, helped the model learn fine-grained correspondences between language and visuals. At inference time, GPT-4 expanded short user prompts into the longer captioning style the model had been trained against [1].
The technical report demonstrated that Sora's output quality improved smoothly with additional training compute, mirroring the scaling laws observed in large language models. At small compute, output was blurry and physically incoherent; at large compute, the model produced minute-long clips with consistent characters and convincing camera motion. OpenAI argued that this trajectory suggested video models could become "general-purpose simulators of the physical world" given continued scaling [1].
Sora 2 retained the diffusion-transformer foundation but added native audio generation, longer maximum durations, and significantly improved physics modeling. The Sora 2 system card describes a unified audio-video model that conditions both modalities on the same latent representation, allowing dialogue lip sync and ambient sound effects to emerge from a single forward pass [41]. OpenAI did not publish parameter counts for either Sora or Sora 2.
According to OpenAI's system card, Sora was trained on a combination of three data sources [2]:
| Data Source | Description |
|---|---|
| Publicly available data | Collected from industry-standard machine learning datasets and web crawls |
| Proprietary partnership data | Licensed content from partners such as Shutterstock and Pond5 |
| Human feedback data | Input from AI trainers, red teamers, and employees |
Before training, all datasets went through a filtering process that removed explicit, violent, or otherwise sensitive material, extending the filtering methods developed for DALL-E 2 and DALL-E 3 [2]. OpenAI did not disclose specific video sources or the total dataset size, leaving open questions about whether YouTube content was used; CTO Mira Murati's evasive answers in the WSJ interview in March 2024 fueled speculation that some scraped video material may have come from major video platforms [31].
Sora's capabilities expanded across its versions. The table below compares the original Sora research preview, Sora Turbo, Sora 2, and Sora 2 Pro.
| Feature | Sora (Feb 2024) | Sora Turbo (Dec 2024) | Sora 2 (Sep 2025) | Sora 2 Pro (Oct 2025) |
|---|---|---|---|---|
| Maximum resolution | 1080p | 1080p (Pro tier) | 720p (web/Plus) | 1080p |
| Maximum duration | 60 seconds | 20 seconds | 15 seconds | 25 seconds |
| Aspect ratios | Widescreen, vertical, square | Widescreen, vertical, square | Widescreen, vertical, square | Widescreen, vertical, square |
| Audio generation | No | No | Yes (dialogue, SFX, ambient) | Yes (dialogue, SFX, ambient) |
| Cameo feature | No | No | Yes | Yes |
| Input types | Text, image, video | Text, image, video | Text, image | Text, image |
| Storyboard tool | No | Yes | Yes | Yes |
| Social feed | No | Yes (Explore) | Yes (TikTok-style) | Yes |
| Video extensions | No | No | Yes | Yes |
| Licensed character library | No | No | Yes (announced Disney) | Yes (announced Disney) |
| Watermarks | Visible | Visible (removable for Pro) | Visible moving | Visible moving (removable) |
Beyond raw specifications, Sora demonstrated several emergent properties that OpenAI documented in its technical report [1]:
Despite its capabilities, Sora had known shortcomings that OpenAI publicly acknowledged in its technical report and system card.
The original Sora model frequently failed to simulate complex physical dynamics correctly. A cookie might show no bite mark after a character takes a bite; a glass might not shatter when dropped; smoke could move in physically impossible patterns [1]. While Sora 2 improved physics accuracy, errors still occurred in scenes involving multiple interacting forces. Users found that combining several physical actions in a single prompt (such as pouring water while stirring a spoon) increased the likelihood of artifacts [13].
Sora sometimes confused spatial directions, mixing up left and right or failing to follow precise positional descriptions in the prompt. This limitation affected tasks requiring exact object placement or character orientation within a scene [1].
Early versions of Sora produced particularly poor results for gymnastics, generating "strange shape-shifting humans that vault through the air and sometimes land on three legs or an extra head" [14]. While Sora 2 specifically highlighted improved gymnastics rendering as a benchmark, complex human motion remained an area where errors could appear, especially in close-up sequences with rapid limb movement.
Like most diffusion models, Sora struggled to render coherent on-screen text such as signs, captions, or product labels. Sora 2 Pro improved this somewhat by raising spatial resolution, but readable embedded text remained inconsistent.
Maintaining narrative and visual consistency across longer video durations was challenging. While Sora 2 improved multi-shot controllability, subtle inconsistencies in character appearance, clothing, or background details could still emerge over extended sequences. Most professional users limited individual clips to 5 to 10 seconds and stitched several together using the storyboard tool.
OpenAI conceded that Sora often broke down on cause-and-effect chains. Effects sometimes occurred before their causes, and characters could respond to events that had not yet been depicted, suggesting the model treated time as a dimension to fill rather than as a strict ordering [1].
Sora-generated videos included a visible, moving digital watermark to signal AI-generated content. However, within a week of Sora 2's release, third-party programs appeared that could remove the watermark, undermining this safety measure [25][42].
Sora was bundled with OpenAI's ChatGPT subscription tiers rather than sold as a standalone product. The pricing structure evolved over time. Through early 2026 the tiers were as follows [9][15]:
| Feature | ChatGPT Plus ($20/month) | ChatGPT Pro ($200/month) |
|---|---|---|
| Monthly credits | 1,000 | 10,000 |
| Priority videos (approx.) | ~50 | ~500 |
| Maximum resolution | 720p | 1080p |
| Maximum video length | 5 seconds (Sora Turbo) / 15 seconds (Sora 2) | 20 seconds (Sora Turbo) / 25 seconds (Sora 2 Pro) |
| Watermarks | Yes | Removable |
| Relaxed mode (unlimited) | No | Yes |
| Sora 2 Pro access | No | Yes |
Pro subscribers also had access to an unlimited "relaxed" generation mode, where videos were queued at lower priority and processed during off-peak hours at no credit cost [15].
Free-tier users lost access to Sora's generation features on January 10, 2026 [9]. With the April 26, 2026 shutdown, all consumer access ended.
For developers, OpenAI offered API access to Sora 2 with pricing based on model tier and output resolution [16]:
| Model | Resolution | Price per second |
|---|---|---|
| Sora 2 | 720p | $0.10 |
| Sora 2 Pro | 720p | $0.30 |
| Sora 2 Pro | 1024x1792 (vertical) or 1792x1024 | $0.50 |
A 10-second standard video at 720p cost roughly $1.00, while a 10-second Pro HD clip ran approximately $5.00. Developers needed at minimum a $10 API credit top-up (Tier 2) to unlock Sora model access. Rate limits scaled with tier: Plus subscribers got 5 requests per minute, Pro users got 50 requests per minute, and Enterprise accounts could negotiate 200 or more requests per minute with dedicated support [16].
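The per-second pricing above reduces to simple multiplication, sketched here as a small estimator. The prices come from the table. The dictionary keys (`"sora-2"`, `"hd"`, and so on) and the function name `estimate_cost` are illustrative labels chosen for this example, not confirmed API parameter values. Prices are stored in cents to avoid floating-point rounding.

```python
# Per-second prices in US cents, copied from the pricing table above.
# The keys are illustrative labels, not confirmed API parameter values.
PRICE_CENTS_PER_SECOND = {
    ("sora-2", "720p"): 10,
    ("sora-2-pro", "720p"): 30,
    ("sora-2-pro", "hd"): 50,  # the higher-resolution Sora 2 Pro tier
}

def estimate_cost(model, resolution, seconds):
    """Return the estimated price in US dollars for one generated clip."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    cents = PRICE_CENTS_PER_SECOND[(model, resolution)] * seconds
    return cents / 100

print(estimate_cost("sora-2", "720p", 10))     # standard 10-second clip
print(estimate_cost("sora-2-pro", "hd", 10))   # Pro high-resolution 10-second clip
```

The two calls reproduce the $1.00 and $5.00 figures quoted in the text.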
The Sora API supported several capabilities, including reusable character references, video extensions, generations up to 20 seconds, 1080p output for the sora-2-pro model, and Batch API support. OpenAI also added a POST /v1/videos/edits endpoint for editing existing videos [22].
With the April 26, 2026 app shutdown, only the API remained available. OpenAI announced that the API itself would be discontinued on September 24, 2026, after which Sora would no longer be accessible through any OpenAI surface [27][28].
OpenAI's safety approach for Sora built on methods developed for DALL-E and ChatGPT [2].
Before the December 2024 launch, OpenAI engaged external red teamers in nine countries to probe the system for vulnerabilities and safety gaps. The company also worked with hundreds of visual artists, designers, and filmmakers from over 60 countries after the February 2024 announcement [2]. Red teamers ran more than 15,000 generations between September and December 2024, the results of which fed directly into the moderation classifiers shipped at launch [41].
OpenAI built an internal search tool that used technical attributes of generated videos to help verify whether a piece of content came from Sora. This assisted in tracking misuse and responding to reports of harmful content [5].
Despite these safeguards, security researchers found weaknesses. Reality Defender, a company specializing in identifying deepfakes, reported that it was able to bypass Sora's anti-impersonation safeguards within 24 hours of the Sora 2 launch. A Washington Post journalist demonstrated that the face-sharing feature could be exploited: simply granting the app permission to share one's face with chosen contacts allowed those contacts to create videos of the person being arrested or engaging in fabricated scenarios, all without further approval [26].
In November 2024, shortly before the public launch, a group of artists who had been granted early access to Sora leaked access to the model on Hugging Face. They published a manifesto accusing OpenAI of "art washing," claiming that the company used them as "PR puppets" to lend artistic credibility to a product that they believed threatened their livelihoods, all without compensation [18]. OpenAI revoked the leaked access within roughly three hours and stated that "hundreds of artists" had shaped Sora's development through voluntary participation.
Sora's training data drew sustained legal scrutiny. A coalition of Japanese entertainment companies, including Studio Ghibli, Bandai Namco, and Square Enix, accused OpenAI of using copyrighted animation and design styles without permission. Japan's Content Overseas Distribution Association argued that OpenAI's "opt-out" system for rights holders improperly reversed the burden of consent, urging the company to stop using Japanese works until a legal framework was in place [17]. The Motion Picture Association in the United States issued a similar complaint on October 6, 2025, criticizing the opt-out approach for treating copyrighted material as default-permitted [27].
On the user-generated content side, the Sora app initially faced problems with users creating videos featuring copyrighted characters like SpongeBob and Pikachu. OpenAI shifted from an opt-out to an opt-in model for intellectual property and increased content restrictions, though this change contributed to declining user engagement [10].
The Mira Murati WSJ interview from March 2024, in which the OpenAI CTO appeared unable or unwilling to confirm whether YouTube videos had been used in training, became a frequently cited piece of evidence in legal filings and journalism about the model's data provenance [31].
The announcement of Sora prompted a strong response from parts of the entertainment industry. Filmmaker Tyler Perry announced he would pause a planned $800 million expansion of his Atlanta studio, citing concerns about the potential impact of AI video generation tools like Sora on traditional filmmaking. Perry said pilots that traditionally cost $15 million to $35 million could in the future be produced "at a fraction of the cost," warning that "a lot of jobs are going to be lost" [43].
Major talent agencies also took protective action: Creative Artists Agency and United Talent Agency opted their clients out of Sora 2. United Talent Agency described the app as "exploitation, not innovation," while Creative Artists Agency warned that it "exposes our clients and their intellectual property to significant risk" [26].
The Sora 2 launch in September 2025 immediately triggered deepfake concerns. Unauthorized AI-generated clips using actor Bryan Cranston's voice and likeness appeared on the platform, including a viral clip of Cranston taking a selfie with Michael Jackson. Under pressure from Cranston and the SAG-AFTRA actors' union, OpenAI updated its policy on October 20, 2025, to require opt-in consent before any person's likeness could be used and to give rights-holders "more granular control" over generations involving their clients [17]. Families of Robin Williams, George Carlin, Martin Luther King Jr., Kobe Bryant, and Paul Walker also complained to OpenAI about the misuse of their loved ones' likenesses on the platform [26][44].
Public Citizen, a US consumer advocacy group, called on OpenAI to suspend Sora 2 in November 2025, warning that its realistic video output could be weaponized for political deepfakes or non-consensual imagery [19]. A separate controversy emerged around a "dead celebrity loophole": since posthumous likeness rights vary widely across jurisdictions, families of deceased public figures had limited legal recourse. OpenAI blocked videos of Martin Luther King Jr. on the platform after users created what the company called "disrespectful depictions" [17].
Broader concerns about disinformation emerged rapidly after the Sora 2 launch. The Sora app saw AI-generated videos depicting ballot fraud, immigration arrests, protests, and fabricated crime scenes appear on its social feed within days of release [26]. UC Berkeley's School of Information warned that society was "unprepared for the next wave of increasingly realistic, personalized deepfakes" [26]. The most-viewed Sora 2 clip in the first week of launch was reportedly a parody video depicting CEO Sam Altman shoplifting GPU cards from a Target store, which Altman himself later acknowledged on X [45].
The South Park episode "Sora Not Sorry," the third episode of season 28, aired on November 12, 2025, and satirized AI deepfakes and copyright issues by showing schoolchildren weaponizing Sora 2 against each other. Online creators colloquially nicknamed the Sora app's video stream "SlopTok," reflecting concerns that the platform was promoting low-effort, novelty-driven content rather than substantive creative work [45].
Between the February 2024 preview and the April 2026 shutdown, several films, advertisements, and music videos were produced using the model.
| Project | Type | Date | Details |
|---|---|---|---|
| Air Head (Shy Kids) | Short film | February 2024 | One of the original early-access showcase pieces from Toronto-based collective Shy Kids; widely shown at film festivals |
| Worldweight (August Kamp) | Music video | March 2024 | Two-minute 19-second video to a mellow electronic song; one of the first music releases tied to Sora |
| Toys R Us "The Origin of Toys R Us" | Brand film | June 2024 | 66-second commercial premiered at the Cannes Lions Festival; produced with creative agency Native Foreign and depicted a young Charles Lazarus and the Geoffrey the Giraffe mascot [46] |
| Abstract / The Golden Record (Paul Trillo) | Music video | May 2024 | LA director Paul Trillo's first official commissioned music video using Sora, stitched from 55 separate clips [47] |
| Balenciaga "Escape from Neusman" | Fashion film | Fall 2024 | A series of 30-second commercials for the brand's Fall 2024 collection using Sora outputs |
| Various filmmaker showcases | Short films | March 2024 | OpenAI commissioned six artists, including Walter Woodman, Don Allen Stevenson III, Nik Kleverov, and others, to produce short films highlighted in MIT Technology Review [48] |
OpenAI also pitched Sora to Hollywood production companies starting in early 2025, holding meetings with major studios and creative agencies. Those efforts cooled after the talent-agency opt-outs in October 2025 and the Sora 2 deepfake controversy.
Sora operated in a rapidly expanding market for AI video generation. The major competitors as of mid-2026 included:
| Model | Developer | Key features |
|---|---|---|
| Veo 3 / 3.1 | Google (DeepMind) | Native 4K output, character consistency, vertical video, native audio; Ingredients to Video for object consistency, Frames to Video for transitions, Insert/Remove Object with automatic lighting; available through Gemini Advanced |
| Movie Gen | Meta | 30B-parameter model, 16-second videos at 1080p, personalized video from a single photo, synchronized audio up to 45 seconds; announced October 2024 |
| Runway Gen-4.5 | Runway | High visual quality, cinematic focus, realistic physics, widely used in professional post-production; partnerships with Lionsgate; no native audio as of mid-2026; $95/month Unlimited tier |
| Pika 2.1 Turbo | Pika Labs | Fast generation (30 to 90 seconds), creative effects and style transfer tools, social-oriented; $28/month Pro tier |
| Kling 3.0 | Kuaishou | Native 4K, multi-shot sequences with subject consistency, simultaneous audio-visual generation, strong at complex movements |
| Seedance 2.0 | ByteDance | Unified audio-video architecture for natural reverb and proximity effects; competitive on quality benchmarks |
| Hailuo 02 / 2.3 | MiniMax | 1080p at 24 to 30 FPS, ranked second globally on Artificial Analysis benchmark as of early 2026 |
| Hunyuan Video | Tencent | Open-weights release in late 2024, 13B parameters, supports text-to-video and image-to-video |
| Wan 2.1 / 2.6 | Alibaba | Open-source Chinese model, released early 2025, integrated into the ModelScope ecosystem |
On the public Artificial Analysis text-to-video leaderboard, Sora 2 Pro consistently ranked behind ByteDance's Seedance 2.0, Runway's Gen-4.5, and Kuaishou's Kling 3.0 by mid-2026, reflecting both quality stagnation in the wake of Sora's stalled development and the rapid release cadence of competitors. Google's Veo 3.1 was widely cited as the strongest all-rounder, particularly for prompt adherence, native audio, and 4K output. Sora 2 retained advantages in human emotion rendering and physics simulation in the period before its discontinuation.
Sora's broader cultural impact outpaced its commercial trajectory. Within days of the February 2024 demonstration, the model became a centerpiece of the AI policy conversation in Washington, the United Kingdom, and Brussels, with regulators citing it in early discussions about provenance standards and synthetic-media disclosure. The MIT Technology Review described the demos as "impressive" while cautioning that they were almost certainly cherry-picked, and a 2025 Science Advances study found that generative video tools "lower barriers to entry in creative work," enabling broader participation in video production [25].
In Hollywood, the response was sharply divided. Some independent filmmakers and effects houses adopted Sora for previsualization, mood boards, and storyboarding; others, led by Tyler Perry's high-profile pause, treated it as an existential threat. The Brookings Institution warned that more than 100,000 US entertainment jobs could be at risk by 2026, citing surveys showing that 75 percent of film companies that had adopted generative AI reported reduced workloads in affected categories [25].
Despite the eventual shutdown, Sora left durable technical and commercial legacies. The diffusion-transformer-on-spacetime-patches recipe became the dominant architectural pattern for video generation, adopted in various forms by Veo, Runway, Kling, and the Wan and Hunyuan open-source families. The cameo flow Sora pioneered became a template for likeness-controlled generation across the industry, with rivals adopting similar opt-in mechanisms following the Cranston and SAG-AFTRA episode.
The name "Sora" comes from the Japanese word meaning "sky," which OpenAI chose to evoke the model's limitless creative potential [1].