MedGemma

Google Healthcare AI Open Source AI

7 min read

Updated Jun 27, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 27, 2026

Fact-checked

In review queue

Sources

9 citations

Revision

v2 · 1,446 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

MedGemma is a collection of open medical models from Google, built on the Gemma 3 architecture and tuned for understanding medical text and medical images. The MedGemma Technical Report describes it as "a new collection of medical vision-language foundation models based on Gemma 3 4B and 27B," first released on 20 May 2025 during Google I/O and distributed as part of Google's Health AI Developer Foundations (HAI-DEF) program. ^[1]^[2]^[9] Rather than a finished clinical tool, MedGemma is a developer foundation model: a starting point that developers fine-tune and validate for specific healthcare and life-sciences applications, not a cleared medical device. On the four-option MedQA benchmark, Google reports the text-only 27B model scores 87.7% and the 4B model 64.4%. ^[1]^[5]^[9]

What is MedGemma?

MedGemma is Google's family of open-weight generative models specialized for medicine. Google Research describes the collection as "generative models based on Gemma 3 that are designed to accelerate healthcare and lifesciences AI development," and the Technical Report states that MedGemma "demonstrates advanced medical understanding and reasoning on images and text, significantly exceeding the performance of similar-sized generative models" while retaining the general capabilities of the base Gemma models. ^[1]^[9]

Google's medical language work began with research systems such as Med-PaLM, which were large but closed. MedGemma takes a different path by reusing the open-weight Gemma family so that the model weights can be downloaded, inspected, run locally, and adapted on private infrastructure. This matters in AI in healthcare, where data governance and the ability to keep patient information on-premises are recurring concerns. ^[1]^[3]

MedGemma is produced by Google DeepMind and Google Research, and it inherits the multimodal design of Gemma 3, which couples a text decoder with a vision encoder. That lineage connects it to Google's broader strategy of releasing domain-specialized derivatives of Gemma, alongside general-purpose vision-language work such as PaliGemma and the proprietary Gemini line. The medical variants are trained on de-identified medical data so that the same general recipe transfers to clinical text and imaging. ^[1]^[2]^[9]

What are the MedGemma model variants?

The initial May 2025 release included a 4B multimodal model and a 27B text-only model. On 9 July 2025 Google added a 27B multimodal model and the separate MedSigLIP image encoder, alongside the publication of the MedGemma Technical Report. A subsequent update, MedGemma 1.5, was released on 13 January 2026 and focused on the 4B multimodal model with expanded imaging support. Sizes below follow Google's own naming, which rounds the larger checkpoints to "27B." ^[1]^[2]^[4]^[9]

Model	Modality	Base	First released	Notes
MedGemma 4B	Multimodal (image + text in, text out)	Gemma 3 4B	20 May 2025	Compute-efficient; pairs the text decoder with a medical SigLIP image encoder
MedGemma 27B (text-only)	Text in, text out	Gemma 3 27B	20 May 2025	Optimized for medical text and inference-time reasoning
MedGemma 27B (multimodal)	Image + text in, text out	Gemma 3 27B	9 July 2025	Adds image and longitudinal electronic-health-record interpretation
MedGemma 1.5 (4B)	Multimodal	Gemma 3 4B	13 January 2026	Adds CT, MRI, whole-slide pathology, longitudinal imaging, anatomical localization

How well does MedGemma perform?

A reported benchmark figure for the text-only 27B model is 87.7% on the MedQA (four-option) medical question-answering set, which Google's Technical Report records with test-time scaling and describes as within a few points of leading open reasoning models at roughly a tenth of the inference cost. That is a 5.4-point gain over the base Gemma 3 27B model, which scores 82.3% on the same benchmark. The 4B model scored 64.4% on MedQA, ranking it among the strongest open models under 8B parameters. These numbers are research evaluations attributed to Google rather than evidence of clinical readiness. ^[1]^[5]^[9]

What can MedGemma do?

The multimodal MedGemma models accept an image together with a text prompt and return free-form text. Typical research uses include generating draft radiology reports, answering visual questions about an image, classifying findings, and summarizing or reasoning over clinical notes. The image encoder underpinning the multimodal variants was pre-trained on de-identified data spanning several modalities: chest X-rays (radiology), dermatology photographs, ophthalmology fundus images, and histopathology (pathology) slides. In MedGemma, images are normalized to 896 by 896 resolution and encoded into 256 tokens. ^[1]^[3]

In one evaluation Google reported, 81% of chest X-ray reports generated by MedGemma 4B were judged by a US board-certified radiologist as accurate enough to lead to similar patient management as the original report. Google frames such results as promising research signals, not as validated clinical performance. ^[1]

The January 2026 MedGemma 1.5 4B update extended the model to higher-dimensional imaging. It added the ability to interpret 3D computed tomography (CT) and magnetic resonance imaging (MRI) volumes, whole-slide histopathology images, time-series of chest X-rays for longitudinal review, anatomical localization, and structured extraction from lab reports. Google reported gains over the earlier 4B model on internal tasks, including CT disease classification (61% versus 58%) and MRI findings (65% versus 51%). The same release introduced MedASR, a separate open speech-to-text model fine-tuned for medical dictation. ^[4]^[6]

What are Health AI Developer Foundations and MedSigLIP?

MedGemma sits inside Health AI Developer Foundations, a Google collection of open models, tooling, and recipes for building medical AI that was introduced in November 2024. HAI-DEF predates MedGemma and already included embedding-oriented foundation models such as CXR Foundation, trained on more than 800,000 chest X-rays, and Path Foundation for pathology, along with related models for dermatology and health acoustics. MedGemma added generative, instruction-following capability to that toolkit. ^[2]^[7]

Alongside the July 2025 expansion, Google released MedSigLIP, a lightweight vision-language encoder adapted from SigLIP-400M. It contains roughly a 400M-parameter vision encoder paired with a text encoder, operates at 448 by 448 image resolution, and was trained on de-identified medical image and text pairs (chest X-rays, dermatology, ophthalmology, histopathology, and CT and MRI slices) mixed with natural images to retain general visual understanding. The same encoder powers the vision capabilities of the MedGemma multimodal models. Google recommends MedSigLIP for tasks with structured outputs, such as data-efficient classification, zero-shot classification, and semantic image retrieval, where text generation is not required, while suggesting MedGemma for tasks that need generated text. ^[3]^[8]

Is MedGemma a clinical product?

No. Google is explicit that MedGemma is a development starting point rather than a clinical product. The model card states that MedGemma is intended to enable more efficient development of downstream healthcare applications and that it is not meant to be used "without appropriate validation, adaptation and/or making meaningful modification by developers for their specific use case." Outputs "are not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or any other direct clinical practice applications" and "should be considered preliminary and require independent verification, clinical correlation, and further investigation." ^[3]

The documentation also notes evaluation gaps. The multimodal capabilities were assessed primarily on single-image tasks and had not been validated for multi-image comprehension, and the models were not optimized for multi-turn conversational use. Google characterizes MedGemma as not yet clinical-grade and likely to require further fine-tuning before deployment, and the models are not cleared or approved as medical devices. Anyone building with them is responsible for the regulatory, safety, and validation work appropriate to their jurisdiction and use case. ^[2]^[3]

Is MedGemma open source, and where can you get it?

The MedGemma weights are open and distributed on Hugging Face and through Google Cloud's Vertex AI Model Garden, under the Health AI Developer Foundations terms of use, which users must accept before downloading. Because the weights are open, developers can run the models locally or on their own cloud infrastructure and fine-tune them on proprietary data. Google has supported the ecosystem with tutorial notebooks, reference code on GitHub, and community programs, including a MedGemma Impact Challenge that drew hundreds of participating teams and a Kaggle challenge tied to the 1.5 release. ^[1]^[2]^[6]

References

MedGemma: Our most capable open models for health AI development - Google Research blog ↩
MedGemma | Health AI Developer Foundations - Google for Developers ↩
google/medgemma-4b-it - Hugging Face model card ↩
Next generation medical image interpretation with MedGemma 1.5 and medical speech to text with MedASR - Google Research blog ↩
google/medgemma-27b-it - Hugging Face model card ↩
Google AI Releases MedGemma-1.5: The Latest Update to their Open Medical AI Models for Developers - MarkTechPost ↩
Health AI Developer Foundations - Google Research (arXiv) ↩
google/medsiglip-448 - Hugging Face model card ↩
MedGemma Technical Report - Google Research and Google DeepMind (arXiv:2507.05201) ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

1 revision by 1 contributors · full history

Suggest edit

What links here

Gemma 3 Google Research Med-PaLM 2 PubMedQA TxGemma

What is MedGemma?

What are the MedGemma model variants?

How well does MedGemma perform?

What can MedGemma do?

What are Health AI Developer Foundations and MedSigLIP?

Is MedGemma a clinical product?

Is MedGemma open source, and where can you get it?

References

Improve this article

Related Articles

Med-PaLM

Med-PaLM 2

ESM3

EvolutionaryScale

TensorFlow

Agent2Agent Protocol

What links here

Related Articles

Med-PaLM

Med-PaLM 2

ESM3

EvolutionaryScale

TensorFlow

Agent2Agent Protocol

What links here