DolphinGemma

AI for Science Google Speech & Audio AI

6 min read

Updated Jun 3, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 3, 2026

Fact-checked

In review queue

Sources

5 citations

Revision

v1 · 1,199 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

DolphinGemma is an audio language model developed by Google to help scientists analyze the vocalizations of wild dolphins. Built on the same research that underpins Google's Gemma family of open models, it was announced on April 14, 2025, which is observed as National Dolphin Day in the United States.^[1]^[2] The project is a collaboration between Google, the Wild Dolphin Project (WDP), and researchers at the Georgia Institute of Technology, and it draws on decades of recordings of Atlantic spotted dolphins. Rather than claiming to translate dolphin speech, the model is designed to find structure in dolphin sounds and predict which sound is likely to follow a given sequence, much as a text model predicts the next word.^[1]

Background

Dolphins produce a rich repertoire of sounds, and researchers have long suspected that this repertoire carries social information. Atlantic spotted dolphins (Stenella frontalis) use signature whistles that appear to function like names, burst-pulse "squawks" during conflict, and click "buzzes" during courtship or while chasing sharks.^[1] Cataloging these sounds and correlating them with behavior is slow, manual work, and the sheer volume of acoustic data is difficult for humans to sift through. The premise behind DolphinGemma is that a model trained on the structure of human language can be repurposed to surface patterns in a non-human acoustic system.^[2]

The model is part of Google's broader Gemma effort, the lightweight open models that share technology with the Gemini family. Work on the project has been associated with Google DeepMind, and one of its named contributors is Dr. Thad Starner, who holds appointments at both Google DeepMind and Georgia Tech.^[3] On the field-research side, the work is led by Dr. Denise Herzing, founder and research director of the Wild Dolphin Project.^[3]

The Wild Dolphin Project partnership

The Wild Dolphin Project supplies the data that makes the model possible. Founded in 1985, the WDP runs what it describes as the world's longest-running underwater dolphin research project, focused on a specific community of Atlantic spotted dolphins in the Bahamas.^[1] Over roughly four decades, researchers have built a labeled archive of underwater audio and video, organized by individual dolphin, by behavioral context, and across generations of the same community.^[1]^[2]

That archive is what gives DolphinGemma a usable training signal. Because the sounds are paired with observed behavior, the WDP corpus lets the model learn associations between particular vocal patterns and the situations in which they occur, instead of treating the audio as an undifferentiated stream.^[1] Georgia Tech contributes the engineering and the field hardware, while Google supplies the model architecture and the underlying audio technology.^[2]

How the model works

DolphinGemma is, in Google's framing, strictly audio-in and audio-out. It does not deal in words or images. It takes in sequences of natural dolphin sounds, identifies recurring patterns and structure, and predicts the most likely subsequent sound in the sequence.^[1] The same next-token prediction that lets a language model continue a sentence is applied here to a stream of dolphin vocalizations.

Two pieces of Google audio technology make this work. The model uses the SoundStream tokenizer to turn raw dolphin sounds into a compact, machine-readable representation, and those tokens are then processed by a model architecture suited to long, complex sequences.^[1]^[4] The result is a relatively small model of roughly 400 million parameters.^[1]^[2] That compact size is deliberate: it lets the model run on consumer hardware in the field rather than on a server.

Attribute	Detail
Developer	Google, with the Wild Dolphin Project and Georgia Tech
Announced	April 14, 2025 (National Dolphin Day)
Base	Gemma open models
Parameters	~400 million
Modality	Audio-in, audio-out
Audio tokenizer	SoundStream
Training data	Wild Dolphin Project archive (Atlantic spotted dolphins, since 1985)
Field hardware	Google Pixel phones

On-device deployment

The roughly 400-million-parameter size means DolphinGemma can run directly on the Google Pixel phones the WDP already carries into the water, removing the need for bulky, power-hungry custom equipment.^[1]^[2] Earlier field work used a Pixel 6 to run the model for real-time listening and pattern recognition. For the 2025 research season, the team planned to move to a setup built around the Pixel 9, housed in a waterproof rig, which adds speaker and microphone functions and enough processing headroom to run deep-learning models and template-matching algorithms at the same time.^[1]^[4]

Running on the phone in the field matters for a second reason: it feeds into an interactive system called CHAT, short for Cetacean Hearing Augmentation Telemetry, an underwater computer interface developed by the WDP with Georgia Tech.^[1]^[5] In CHAT experiments, researchers play novel synthetic whistles that are assigned to objects the dolphins like, such as sargassum seaweed or a scarf, then watch to see whether the dolphins mimic those whistles to request the items. DolphinGemma's predictive abilities are meant to help the system recognize a dolphin's mimic quickly and relay it to a researcher through bone-conducting headphones, supporting a basic two-way exchange rather than a translation of natural dolphin language.^[1]^[5]

Goals and limitations

The stated goal of DolphinGemma is to accelerate the search for structure in dolphin communication and, in the longer term, to support research into a shared, simplified vocabulary through CHAT.^[1] Both Google and outside coverage have been careful to qualify what this means. The model finds and predicts patterns; it does not decode or translate a dolphin "language," and the project does not claim that dolphins have language in the human sense.^[1] Even with machine assistance, interpreting the meaning behind clustered sound types remains the central unsolved problem, and the CHAT vocabulary built up over years of work is still very small.^[3]

There are hard physical limits as well. Humans underwater struggle to localize where a sound comes from, and some dolphin sounds fall outside the range of human hearing, which is part of why automated analysis is attractive in the first place.^[3] DolphinGemma was trained on Atlantic spotted dolphins specifically, so applying it to other cetaceans such as bottlenose or spinner dolphins is expected to require fine-tuning on those species' sounds.^[1]^[2] Popular headlines describing the project as "talking to dolphins" therefore run ahead of what the model actually does.

Availability

Google said it intends to release DolphinGemma as an open model, with availability planned for the summer of 2025.^[1]^[2] The open release is framed as a way to let researchers studying other dolphin populations and other cetacean species adapt the model to their own acoustic data, building on the open-model approach of the broader Gemma 3 lineup.^[1]^[2] As described at announcement, the model was still in active development, with the open weights and the next field season both positioned as upcoming milestones rather than completed work.^[4]

References

DolphinGemma: How Google AI is helping decode dolphin communication - Google blog ↩
DolphinGemma - Google DeepMind ↩
New Google AI Allows Humans To Have Conversations With Dolphins - 21st Century Tech Blog ↩
Forget Live Translate: Google is now using Pixel phones to talk to dolphins - Android Authority ↩
Google develops AI model to help researchers decode dolphin communication - SiliconANGLE ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

Suggest edit

What links here

SoundStream

Background

The Wild Dolphin Project partnership

How the model works

On-device deployment

Goals and limitations

Availability

References

Improve this article

Related Articles

AI co-scientist (Google)

SoundStream

AudioLM

Academic Research

GeoAI

Science ChatGPT Plugins

What links here

Related Articles

AI co-scientist (Google)

SoundStream

AudioLM

Academic Research

GeoAI

Science ChatGPT Plugins