Runway Act-Two
Last reviewed
May 16, 2026
Sources
28 citations
Review status
Source-backed
Revision
v1 · 3,985 words
Runway Act-Two is a generative motion capture and character animation model developed by Runway, publicly introduced on July 14, 2025. The model accepts a driving performance video and a separate reference image (or short clip) of a character, and returns an animation in which the character reproduces the head, face, body, and hand motion of the original performer. Act-Two extends the Act-One model that Runway released in October 2024, which had been limited to facial expression transfer. The new model added full body and hand tracking and improved generation quality, while keeping the original workflow: no rigging, no markers, no studio capture setup.
Act-Two launched first for Runway Enterprise customers and Creative Partners, and rolled out to all paid Runway plans within a few days of the announcement. The model was added to the Runway API on July 21, 2025, at a rate of 5 credits per second of generated video, with a minimum charge of 15 credits per request. Industry coverage at launch focused on the model's potential to compete with traditional optical motion capture in pre-visualization, indie animation, and certain post-production tasks, with several Hollywood publications describing it as a meaningful step toward democratized performance capture.
Act-Two belongs to a small category of generative models that drive a character animation from a recorded human performance rather than from a text prompt. The model takes two inputs. The first is a driving performance video showing a person acting, speaking, or moving. The second is a character reference, typically an image, but optionally a short video, of the target character that should be animated. The model then synthesizes a new video in which the target character moves and emotes according to the driving performance.
Unlike traditional motion capture, Act-Two does not require markers, suits, depth sensors, or a calibrated camera rig. The driving video can be recorded on a smartphone or webcam. The model handles the entire transfer pipeline internally: it analyzes head pose, facial micro-expressions, body posture, hand gestures, and (in cases where a video reference is supplied) environmental motion, and applies them to the reference character with what Runway describes as preservation of identity and style.
The model is positioned as a production aid rather than a consumer toy. The pricing, the API rollout, and the early access pattern (Enterprise first, then paid plans, then API) reflect a strategy aimed at studios, advertising agencies, game cinematic teams, and independent filmmakers who would otherwise pay for live action mocap or hire animators.
Runway is a New York based generative AI company founded in 2018 by Cristobal Valenzuela, Alejandro Matamala, and Anastasis Germanidis. The company built its early reputation on browser based creative tools and on its co-authorship of the latent diffusion paper that underpinned Stable Diffusion. From 2022 onward, Runway focused on text to video and image to video generation, releasing Gen-1 in February 2023, Gen-2 in March 2023, Gen-3 Alpha in June 2024, and Runway Gen-4 in March 2025.
Act-One was the company's first motion driven animation feature. It was released on October 25, 2024 and built on top of the Gen-3 Alpha model. Act-One generated character performances from a single driving video and a character image, but the transfer was limited to facial movement: eye lines, micro-expressions, lip motion, and head turns. Body posture and hand movement were inferred but not directly captured from the input, which restricted Act-One to dialogue heavy shots, head and shoulders compositions, and similarly framed work. Runway's announcement post described Act-One as a way to capture performance without rigging or motion capture equipment, and emphasized that the same input video could drive characters in radically different visual styles.
The Act-One release attracted attention from filmmakers and creative directors because the workflow was an order of magnitude simpler than studio mocap. VentureBeat called the feature a game changer, and several industry observers noted that the consistency of facial expression transfer made it usable for short narrative work rather than only for novelty clips. The model also drew skepticism, particularly around the limits imposed by face only tracking and the absence of explicit body and hand control.
Act-Two was developed over the following nine months and was released on July 14, 2025. Runway framed Act-Two as a next generation motion capture model and emphasized two main improvements over Act-One: full body and hand tracking, and higher overall generation quality. The announcement also flagged automatic environmental motion when the character reference is an image rather than a video, meaning the surrounding scene gains natural background motion (foliage, lighting variation, camera drift) during generation rather than remaining static.
Act-Two performs end to end performance transfer in a single pass. The model expects a driving performance video and a character reference, and outputs a generated clip at 24 frames per second. Runway's help documentation describes a typical workflow using well under 30 seconds of driving footage, with a 3 second minimum per request that establishes a 15 credit floor. Output is 1080p at a 16:9 aspect ratio, with durations of up to 10 seconds per generation depending on the input length.
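These figures can be restated as a short pre-check for pipeline scripts that validate driving footage before upload. The sketch below only encodes the documented numbers; the constant and function names are illustrative rather than part of any Runway SDK, and the assumption that output length simply tracks the input up to the 10 second cap is an inference from the paragraph above.

```python
# Illustrative constants restating the documented Act-Two figures.
# None of these names come from a Runway SDK.
MIN_DRIVING_SECONDS = 3.0        # minimum driving clip length per request
RECOMMENDED_MAX_SECONDS = 30.0   # help docs suggest staying well under ~30 s
MAX_OUTPUT_SECONDS = 10.0        # up to 10 s of 1080p, 16:9, 24 fps output per generation
OUTPUT_FPS = 24

def check_driving_clip(duration_s: float) -> float:
    """Validate a driving clip length and return the expected output length."""
    if duration_s < MIN_DRIVING_SECONDS:
        raise ValueError(f"driving clip must be at least {MIN_DRIVING_SECONDS} s")
    if duration_s > RECOMMENDED_MAX_SECONDS:
        print("note: help docs recommend splitting clips over ~30 s into shorter passes")
    # Assumption: output duration follows the input up to the 10 s per-generation cap.
    return min(duration_s, MAX_OUTPUT_SECONDS)

print(check_driving_clip(7.5))   # 7.5
print(check_driving_clip(12.0))  # 10.0
```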
The model handles several distinct tracking surfaces at once.
| Capability | Description |
|---|---|
| Head tracking | Pitch, yaw, and roll of the head, plus position relative to the body and the camera. |
| Facial expression transfer | Micro-expressions, eye direction, eyelid behavior, brow movement, and lip motion synchronized to speech in the driving clip. |
| Body pose | Full body posture, shoulder and torso orientation, and lower body posture where visible in the input. |
| Hand tracking | Gesture, finger position, and grip approximations, supporting expressive hand work that Act-One could not produce. |
| Environmental motion | When the character reference is a still image, Act-Two adds natural background motion to the generated clip rather than leaving the surroundings frozen. |
| Multi-character dialogue | A separate workflow generates multiple characters speaking in turn within a single scene, with the help documentation recommending that dialogue clips stay under roughly 30 seconds. |
Runway has stated that the model preserves the identity and visual style of the character reference. A reference drawn in a stylized illustration register produces output that retains the original line work, color, and proportions, while a photorealistic reference produces a photorealistic clip. The driving performance contributes only motion and timing, not appearance.
The model is designed for short clips. Longer continuous shots require chunking, in which the user breaks a performance into multiple shorter passes and stitches the results together. Runway's help center recommends keeping individual dialogue passes under about 30 seconds and provides workflow notes on managing continuity between consecutive shots.
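The chunking workflow can be sketched as a simple splitting step, shown below with ffmpeg cutting a long driving performance into shorter passes that are each submitted separately. The 10 second default chunk length is an assumption tied to the per-generation output limit, the file names and helper are hypothetical, and stitching the generated results back together (and smoothing the seams) is left to the editor.

```python
import math
import subprocess

def split_driving_video(path: str, total_seconds: float, chunk_seconds: float = 10.0):
    """Split a long driving performance into shorter passes with ffmpeg.

    chunk_seconds defaults to 10 s here as an assumption based on the
    per-generation output limit; Runway's help center separately recommends
    keeping dialogue passes under roughly 30 s.
    """
    chunk_paths = []
    n_chunks = math.ceil(total_seconds / chunk_seconds)
    for i in range(n_chunks):
        start = i * chunk_seconds
        out_path = f"driving_chunk_{i:02d}.mp4"
        # Stream-copy a slice of the source clip; re-encoding is not needed
        # just to cut driving footage into passes (cuts snap to keyframes).
        subprocess.run(
            ["ffmpeg", "-y", "-ss", str(start), "-t", str(chunk_seconds),
             "-i", path, "-c", "copy", out_path],
            check=True,
        )
        chunk_paths.append(out_path)
    return chunk_paths

# Each chunk is then submitted as its own Act-Two request, and the generated
# clips are stitched together in an editor, managing continuity at the seams.
```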
Act-Two's improvements over Act-One are concentrated in three areas: tracking coverage, motion fidelity, and consistency under demanding scene conditions.
| Capability | Act-One (October 2024) | Act-Two (July 2025) |
|---|---|---|
| Underlying base | Gen-3 Alpha and Turbo | Independent model on Runway platform |
| Face tracking | Yes | Yes, with improved micro-expression fidelity |
| Body tracking | Limited or implied | Full body, explicit |
| Hand tracking | None | Yes |
| Environmental motion | Static background by default | Automatic when reference is an image |
| Maximum clip length | Bounded by underlying Gen-3 model | Up to 10 seconds at 1080p per generation |
| Output resolution | 720p typical | 1080p at 16:9 |
| API access | Not directly exposed | Available from July 21, 2025 |
| Pricing | Bundled with Gen-3 usage | 5 credits per second, 15 credit minimum |
Act-One remains usable for face only workflows, particularly dialogue close ups where body work is not needed and the older facial model produces acceptable results at lower cost. For most new production work involving body motion or hand gestures, Act-Two has become the default.
Act-Two is available through the Runway web app and through the Runway developer API. The web app exposes Act-Two as a workflow on paid plans, including Standard, Pro, Unlimited, and Enterprise. The API was opened on July 21, 2025 and uses the same credit based billing as the rest of the Runway API surface.
Act-Two consumes 5 credits per second of generated video, with a 3 second minimum per request that translates to a 15 credit floor. Credits in the developer portal cost $0.01 each, so a single 3 second generation costs $0.15 and a 10 second generation costs $0.50. By the Runway API pricing reference, this places Act-Two at the same per second rate as Gen-4 Turbo and Gen-3 Alpha Turbo, and below Gen-4.5 at 12 credits per second, Aleph at 15 credits per second, and Veo 3 access through Runway at 40 credits per second.
| Model | Credits per second | Approximate API cost for 10 seconds |
|---|---|---|
| Act-Two | 5 | $0.50 |
| Gen-3 Alpha Turbo | 5 | $0.50 |
| Gen-4 Turbo | 5 | $0.50 |
| Gen-4.5 | 12 | $1.20 |
| Aleph | 15 | $1.50 |
| Veo 3 via Runway | 40 | $4.00 |
The subscription plans give users a monthly credit allowance that can be spent on Act-Two alongside other Runway models. Standard at $12 per user per month includes 625 credits, Pro at $28 per user per month includes 2,250 credits, and Unlimited at $76 per user per month includes 2,250 credits plus an unlimited explore mode with lower priority queueing. Enterprise pricing is custom and includes the early access window that Act-Two and other new models passed through at launch.
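The pricing above reduces to simple arithmetic. The sketch below works through the per-request API cost and how far each plan's monthly allowance stretches if spent entirely on Act-Two; all figures come from the pricing described above, and the helper names are illustrative rather than part of any Runway tooling.

```python
CREDITS_PER_SECOND = 5       # Act-Two API rate
MIN_CREDITS = 15             # floor implied by the 3 second minimum
USD_PER_CREDIT = 0.01        # developer portal credit price

def act_two_cost(seconds: float) -> float:
    """Approximate USD cost of one Act-Two API request."""
    credits = max(MIN_CREDITS, seconds * CREDITS_PER_SECOND)
    return credits * USD_PER_CREDIT

print(act_two_cost(3))    # 0.15
print(act_two_cost(10))   # 0.50

# Monthly plan allowances converted to seconds of Act-Two output:
for plan, credits in {"Standard": 625, "Pro": 2250, "Unlimited": 2250}.items():
    print(plan, credits / CREDITS_PER_SECOND, "seconds")
# Standard 125.0, Pro 450.0, Unlimited 450.0 (plus the unlimited explore mode)
```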
The API has the same input and output structure as other Runway endpoints. Developers submit a driving video URL and a character reference URL (image or video), along with optional parameters that control duration, resolution, and aspect ratio. The job completes asynchronously and returns a generated MP4 file when ready. The asynchronous pattern is similar to that used by Gen-4 and Aleph and supports queue management for batch workloads.
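A minimal sketch of that submit-then-poll flow appears below. The base URL, endpoint paths, field names, and status values are placeholders invented for illustration and are not Runway's actual API surface (the official API reference defines the real ones); only the overall pattern, submitting a driving video URL and a character reference URL and polling asynchronously for an MP4 result, comes from the description above.

```python
import time
import requests

API_BASE = "https://api.example.invalid"   # placeholder, not the real Runway base URL
API_KEY = "YOUR_API_KEY"

def submit_act_two_job(driving_video_url: str, character_ref_url: str) -> str:
    """Submit an Act-Two style job and return a task id (payload fields are placeholders)."""
    resp = requests.post(
        f"{API_BASE}/act_two_tasks",                   # hypothetical endpoint path
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "driving_video": driving_video_url,        # performance to transfer
            "character_reference": character_ref_url,  # image or video of the target character
            # optional parameters per the description above (names are placeholders)
            "duration": 10,
            "resolution": "1080p",
            "aspect_ratio": "16:9",
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["id"]

def wait_for_result(task_id: str, poll_seconds: float = 5.0) -> str:
    """Poll the task until it finishes and return the URL of the generated MP4."""
    while True:
        resp = requests.get(
            f"{API_BASE}/act_two_tasks/{task_id}",     # hypothetical endpoint path
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=30,
        )
        resp.raise_for_status()
        task = resp.json()
        if task["status"] == "SUCCEEDED":              # status values are placeholders
            return task["output_url"]
        if task["status"] == "FAILED":
            raise RuntimeError(task.get("error", "generation failed"))
        time.sleep(poll_seconds)
```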
Act-Two arrived during a period of accelerating studio engagement with Runway. By July 2025 the company had signed custom model partnerships with Lionsgate (September 2024) and AMC Networks (June 2025), and had ongoing exploratory work with several other Hollywood and streaming companies. Lionsgate executives told The Hollywood Reporter that filmmakers across the studio's slate were already using Runway tooling for pre-visualization and post-production tasks.
The practical applications for Act-Two in production cluster around a handful of categories. Pre-visualization is the most obvious. Directors, production designers, and storyboard artists use Act-Two to mock up performance ideas before committing a scene to live action shoots or to traditional animation pipelines. A director can record a smartphone clip of themselves or a reference performer acting out a scene, attach a concept image of the character, and produce a draft of the shot in minutes rather than days. The result is precise enough to communicate intent to an animation team or a producer without spending the production budget that a finished shot would require.
Independent animation projects represent another visible use case. The Runway AI Film Festival, an annual showcase that Runway has run since 2023, received about 6,000 submissions in its 2025 edition. Several of the finalist films at the 2025 festival, held at Alice Tully Hall at Lincoln Center on June 5, 2025, used Runway tooling for performance driven sequences. After Act-Two's release later in July, No Film School and other industry publications described the model as a serious option for independent narrative work that needs character animation but cannot afford a full mocap stage.
Game cinematic and VFX teams have shown early interest as well. The full hand and body tracking in Act-Two opens up workflows for in engine cinematics where a director can drive an existing rigged character with a smartphone performance and use the result as a reference for keyframe animators, or in some pipelines as direct input. The output is video rather than skeletal animation data, which means Act-Two does not replace traditional mocap for engine integration. It does, however, replace the reference video that animators or mocap performers would otherwise need to capture in a shoot.
Advertising production is a fourth application. Brand teams use Act-Two to generate stylized character spots without scheduling actor shoots or commissioning full CGI work. The Coca-Cola Christmas campaigns of 2024 and 2025 relied on Runway tooling alongside other generative video models, with character driven sequences benefiting from the consistency between performance and target character that Act-Two and Gen-4 References provide together.
Industry coverage from VP-Land, No Film School, and Restart Reality treated the launch as a significant step in the broader story of generative AI in film and television. The Lionsgate angle was repeatedly raised: with Lionsgate's library now feeding a custom Runway model and Act-Two providing a generative motion capture tool, the studio had a usable end to end stack for low cost previz and concept work.
Act-Two competes on different axes against two different categories: traditional optical motion capture and other AI driven character animation systems.
Traditional motion capture, as practiced at studios such as Industrial Light and Magic, Weta FX, and dedicated mocap houses, uses a calibrated array of cameras, retro-reflective markers or markerless depth sensors, and post-processed skeletal solving to produce high precision motion data. The output is a stream of joint angles and position data that can drive a rigged character model in a 3D engine or animation package. Traditional mocap is precise, repeatable, and produces motion data that can be edited frame by frame. It is also expensive, requires a dedicated stage, demands trained performers and operators, and has a long iteration loop.
Act-Two makes the opposite trade. It accepts video from any consumer camera, returns generated video rather than skeletal data, and produces results in minutes. The output cannot be directly edited as motion data because it is pixel based, which means that downstream changes typically require regenerating the clip rather than tweaking individual joints. The motion fidelity is high enough for many uses, but it does not match the precision of traditional mocap when VFX teams review the output frame by frame.
| Attribute | Traditional optical mocap | Runway Act-Two |
|---|---|---|
| Capture hardware | Multi-camera stage, markers or depth sensors | Any consumer camera or smartphone |
| Output | Skeletal motion data | Rendered video |
| Iteration time | Hours to days | Minutes |
| Cost per minute | Hundreds to thousands of dollars | About $3 per minute ($0.50 per 10 seconds) via API |
| Editability after capture | Frame by frame on skeleton | Re-generate the clip |
| Precision | Sub-millimeter joint tracking | Approximate, perceptual |
| Typical use | Feature film VFX, AAA games | Previz, indie animation, concept |
Within the generative video category, Act-Two competes most directly with Hedra Character, Synthesia, and a handful of smaller services such as LivePortrait. The competitive picture is shaped by what each service is built to do rather than by raw benchmarks. Hedra's Character-3 model, released in early 2025, focuses on audio driven talking head animation from a single image, with strong lip sync accuracy across multiple languages. Synthesia is an enterprise platform for AI avatar video generation, optimized for corporate training, internal communication, and similar fixed format content, with extensive language support and a library of pre-built avatars.
| System | Primary input | Primary output | Best fit |
|---|---|---|---|
| Runway Act-Two | Driving performance video plus character reference | Generated video with full body, face, and hand transfer | Cinematic performance capture, previz, indie animation |
| Hedra Character | Image, text, and audio | Animated talking head with synchronized lip motion | Audio driven avatar dialogue, social content |
| Synthesia | Script plus avatar selection | Avatar driven corporate video | Training, internal communication, multi-language enterprise video |
| Live action mocap | Marker or markerless capture on a stage | Skeletal motion data | High precision VFX, AAA games |
For a film team that needs a character to deliver a performance with specific gestures and timing, Act-Two is the only system in this list that captures both face and body from a single driving performance and applies them to an arbitrary character. For an enterprise team that needs a generic spokesperson delivering a script in twenty languages, Synthesia remains the better fit. For an audio driven music video where the dialogue is the focus, Hedra is often the preferred tool. The three systems sit in adjacent rather than overlapping product categories most of the time, although the boundaries have begun to blur as each platform adds features the others have shipped.
Coverage of Act-Two at launch was largely positive. AlternativeTo described the release as Runway adding advanced motion capture and full body tracking to its lineup. No Film School framed it in terms of its potential to disrupt the mocap industry, observing that motion capture had historically been a substantial expense for film, advertising, and games. VP-Land called the release a meaningful update to Runway's character animation stack and singled out the body and hand tracking as the headline change relative to Act-One.
Film industry trade press emphasized the studio context. The Hollywood Reporter coverage of the Lionsgate partnership had already framed Runway as a Hollywood aligned vendor rather than a disruptor, and the Act-Two coverage extended that framing. Variety and Deadline both noted that the release fit a pattern: a steady cadence of feature drops between major model releases (Gen-4 in March 2025, Act-Two and Aleph in July 2025, Gen-4 Image Turbo in August 2025) designed to keep professional users on the platform while engaging studio decision makers.
The enthusiast and creator response on X and YouTube was more mixed. Many creators described Act-Two as the first AI tool that gave them usable body and hand work without a mocap stage. Others noted that the model struggled with specific edge cases: fast camera moves in the driving video, occluded hands, or driving performances where the framing did not match the target character's body proportions. Creators posted side by side comparisons showing clearer micro-expression transfer and better body posture in Act-Two, alongside cases where it introduced unwanted body motion on characters that should have remained largely static.
Criticism around AI in film production extended to Act-Two even though the model itself does not directly displace performers. The 2023 Hollywood strikes had centered in part on AI's role in production, and SAG-AFTRA agreements established disclosure requirements for digital replicas of named performers and bargaining requirements for synthetic performers. Critics raised concerns that tools like Act-Two could shift work away from mocap performers, motion editors, and traditional animators. Runway has consistently positioned its tools as augmenting rather than replacing human work.
On safety and consent, Runway's stated policy is to use automated systems and internal human review to detect content that violates its usage policy, including non-consensual likeness use. Runway also embeds invisible watermarks in generated content. Critics have argued that watermarking and moderation are necessary but not sufficient steps for a tool with the performance transfer capabilities of Act-Two.
Act-Two has several limitations that were apparent at launch and that Runway's help documentation and independent reviewers have catalogued.
Clip length is bounded. Each generation tops out at about 10 seconds at 1080p, and the help center recommends keeping dialogue passes under roughly 30 seconds in total. Longer continuous performances require chunking and stitching, which introduces continuity work at the seams between consecutive clips.
The driving performance must be reasonably well framed and lit. Strong occlusion of hands, extreme camera angles, or heavy motion blur in the driving video degrades the output. Multi character driving footage is supported through a dedicated dialogue workflow, but solo character work remains the default and is the most reliable.
The model produces video output rather than skeletal animation data. For pipelines that need editable motion on a 3D rig, Act-Two cannot directly substitute for traditional mocap. It can serve as a reference for animators or as a draft for keyframe work, but the rendered nature of the output is a structural limit.
Fast moving hands, complex finger interactions with objects, and tight close ups on hands remain weak points. The model has improved on Act-One in these categories, but artifacts persist, especially in the same broad categories (hand articulation, finger object contact) that affect every major commercial video model in 2025.
Generation is not exactly repeatable. Identical inputs produce different outputs across runs, which is inherent to diffusion based generation. For applications that require a specific performance reproduced frame for frame, this matters; for previz, exploration, and most narrative work, it does not.
Runway has not published a detailed technical paper for Act-Two. Architecture, training data, and parameter count remain undisclosed. The company's public statements describe its models as transformer based diffusion systems trained on curated internal datasets. This opacity is consistent with Runway's stance on other recent models and reflects ongoing copyright litigation against several generative AI companies, including Runway, over training data sourcing.
Act-Two sits alongside Runway Gen-4 and Gen-4 Turbo (image to video), Aleph (in context video editing), and Gen-4 Image (still image generation with reference conditioning). Each model targets a different input type: text prompts, images, existing video, or a recorded performance. Many production pipelines combine multiple Runway models in a single shot.