Veo 3.1

6 min read

Updated Jul 23, 2026

Veo 3.1 is a video generation model released by Google DeepMind on October 15, 2025, as an incremental update to Veo 3. It keeps the headline capability that defined its predecessor, native generation of synchronized audio alongside video, and layers on richer sound, better prompt adherence, higher image-to-video fidelity, and a set of creative controls aimed at filmmakers and developers. The update was distributed simultaneously across Google's consumer and developer surfaces: the Flow filmmaking tool, the Gemini app, the Gemini API, and Vertex AI ^[1]^[2]^[3]. As with every model in the family, Veo 3.1 outputs carry a SynthID watermark ^[3].

Background (Veo 3)

Google's video work runs through three earlier releases. The original Veo arrived at Google I/O in May 2024, followed by Veo 2 in December 2024, which pushed resolution and motion quality. Veo 3, announced at I/O in May 2025, was the step change: it generated audio natively, including dialogue, sound effects, and ambient noise tied to the visuals. Google DeepMind chief executive Demis Hassabis framed the moment by saying the field was "emerging from the silent era of video generation" ^[4]^[5].

Veo 3 became the engine behind Flow, Google's web-based tool for stitching AI clips into longer sequences. By the time Veo 3.1 shipped, Google said Flow users had generated more than 275 million videos since the tool launched in May 2025 ^[1]. Veo 3.1 was positioned not as a clean-sheet successor but as a refinement of that same model, sharing its core audio architecture while improving fidelity and adding production controls.

New features

The most consistent theme across Google's announcements is that audio was extended to capabilities that previously produced silent output. At launch Google said it was, "for the first time," bringing audio to "Ingredients to Video," "Frames to Video," and "Extend" ^[2]. Alongside that, the company described "stronger prompt adherence and improved audiovisual quality when turning images into videos" ^[2].

The feature set breaks down as follows:

Feature	What it does
Ingredients to Video	Combines multiple reference images to control characters, objects, and style, synthesizing them so identity and background stay consistent across shots ^[2]^[3]
Frames to Video (first and last frame)	Generates a transition between a supplied start image and end image, now with audio ^[2]
Extend (scene extension)	Continues a clip into a longer sequence, generating each new segment from the final second of the previous one, allowing videos of a minute or more ^[2]
Insert	Adds an object or element into an existing scene, with the model handling shadows and lighting so it blends in ^[1]^[2]
Remove	Erases unwanted objects and reconstructs the background behind them; announced as coming after launch ^[1]^[2]

Google also emphasized finer narrative direction, the ability to specify what should happen at particular moments inside a clip rather than only at the start ^[1]. On the developer side, the Gemini API version of the update foregrounded three improvements: enhanced Ingredients to Video that "intelligently synthesizes your inputs to preserve character identity and background details," native 9:16 vertical generation built for mobile rather than cropped from landscape, and sharper 1080p plus 4K output ^[3].

The object-removal tool was the main feature not available on day one. Google described it as forthcoming in Flow at the time of the announcement ^[1]^[2].

Availability

Veo 3.1 launched across four surfaces at once. For consumers and creators, it appeared in Flow and in the Gemini app; for developers and enterprises, in the Gemini API (where it carries the model identifier veo-3.1-generate-preview) and in Vertex AI ^[1]^[2]^[3]. The Gemini app rollout went to Google AI Pro and Ultra subscribers first as part of the October "Gemini Drop," with broader access following ^[6].

The technical envelope is well documented in the Gemini API specification. Veo 3.1 produces clips of 4, 6, or 8 seconds; 8-second output is required for 1080p, 4K, or when reference images are used. It defaults to 720p with 1080p and 4K available, and supports 16:9 (default) and 9:16 aspect ratios. Audio is "always on," generated natively with the video ^[3].

Surface	Audience	Notes
Flow	Filmmakers, creators	Full feature set; object removal arrived later
Gemini app	Consumers	Rolled out to AI Pro and Ultra subscribers first ^[6]
Gemini API	Developers	Model IDs `veo-3.1-generate-preview` and `veo-3.1-fast-generate-preview` ^[3]
Vertex AI	Enterprises	Scene extension noted as coming after launch ^[2]

Google stated that these capabilities, "along with our SynthID digital watermark, are available today in the Gemini API and Vertex AI for enterprises" ^[3]. SynthID embeds an imperceptible signal into the generated pixels that Google's detection tooling can later read, the same provenance mechanism used across earlier Veo models ^[3].

Veo 3.1 Fast

Google launched a faster, lower-cost companion model, Veo 3.1 Fast, carrying the Gemini API identifier veo-3.1-fast-generate-preview ^[3]. In the developer announcement the company said Veo 3.1 Fast would be "available soon, offering a faster and more cost-effective option for video creation." Google describes the Fast tier as letting "developers create videos with sound while maintaining high quality and optimizing for speed and business use cases," and points to uses such as rapid A/B testing of creative concepts and apps that need to quickly produce social media content ^[3]. The Fast model retains native audio rather than dropping it for speed.

Several third-party API resellers and pricing trackers have published per-second and per-video figures for the Fast tier, but Google's framing at launch was qualitative: faster and cheaper than the standard model, aimed at high-volume and iterative workloads.

Reception

Coverage treated Veo 3.1 as an iterative but meaningful update rather than a generational leap. TechCrunch summarized it plainly, writing that "Veo 3.1 builds on May's Veo 3 release and generates more realistic clips and adheres to prompts better," and highlighted the editing additions and the extension of audio across Flow's features ^[1]. The release landed during a period of intense competition in generative video, following OpenAI's Sora work and a broader race among labs to pair convincing motion with synchronized sound.

The practical emphasis in commentary fell on continuity and control: Ingredients to Video for keeping a character or product consistent across shots, scene extension for getting past the short native clip length, and the move to bring audio to image-driven workflows. For developers, the addition of native vertical video and higher-resolution output in the Gemini API was read as a signal that Google was targeting production use rather than demos ^[3]. Google has continued to iterate on the model since release, including later updates to its vertical-video and reference-image features.

References

^TechCrunch, "Google releases Veo 3.1, adds it to Flow video editor." techcrunch.com/...3-1-adds-it-to-flow-video-editor
^Google, "Bringing new Veo 3.1 updates into Flow to edit AI video." blog.google/...veo-updates-flow
^Google, "Enhanced Veo 3.1 capabilities are now available in the Gemini API." blog.google/...veo-3-1-gemini-api ; Google AI for Developers, "Generate videos with Veo." ai.google.dev/...video
^TechCrunch, "Google's Veo 3 can generate videos and soundtracks to go along with them." techcrunch.com/...oundtracks-to-go-along-with-them
^Wikipedia, "Veo (text-to-video model)." en.wikipedia.org/...Veo_(text-to-video_model)
^TechBuzz, "Google Launches Veo 3.1 and Major Gemini App Updates." techbuzz.ai/...eo-3-1-and-major-gemini-app-updates

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

1 revision by 1 contributors · v2 · 1,221 words · full history

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

Suggest edit

What links here

AI Video Generation Best AI Video Generators Invideo AI Synthesia 3.0 Veo Veo 2 Veo 3

Background (Veo 3)

New features

Availability

Veo 3.1 Fast

Reception

References

Improve this article

Related Articles

Veo 3

Gemini Omni

Veo

Veo 2

NVIDIA Picasso

Pika (video generation)

What links here

Related Articles

Veo 3

Gemini Omni

Veo

Veo 2

NVIDIA Picasso

Pika (video generation)

What links here