Model hubs
Last reviewed
May 30, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v2 · 3,078 words
A model hub is an online platform or repository where people discover, share, version, and download machine learning models. At its simplest, a hub is a place to find a trained model, read documentation about how it behaves, and pull down the weights so you can run it yourself. Most of the well-known hubs do more than that. They host the model files, track changes over time, render documentation pages called model cards, run small demo apps, and expose an API so you can fetch a model from inside your own code with a line or two.
The idea borrows heavily from software development. If GitHub is where source code lives, a model hub is where trained weights live, and the comparison gets made so often that Hugging Face is routinely described as "the GitHub of machine learning."[1] The analogy is not perfect, since model files are large binary blobs rather than diffable text, but the social mechanics are similar: public repositories, forks, pull requests, stars, and a culture of sharing built on top of open-source tooling.
This article covers what model hubs do, the major ones in use as of 2026, the features they tend to share, and the governance and security problems that come with hosting model weights at scale.
A few capabilities show up across almost every hub, in one form or another.
Discovery is the starting point. Hubs let you search and filter models by task (text generation, image classification, speech recognition, and so on), by framework, by license, by size, and increasingly by benchmark scores. Popularity signals such as download counts and likes help surface what other people actually use.
Hosting the weights is the core job. Trained models are large, sometimes hundreds of gigabytes, so hubs need real storage infrastructure and fast delivery. Hugging Face, for example, stores repositories in a Git-based system layered on a chunk-deduplicating backend it calls Xet, which splits large files into reusable pieces to speed up uploads and downloads.[2]
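The idea of splitting files into reusable pieces can be shown with a minimal sketch. It assumes fixed-size chunks and an in-memory dictionary as the store, both chosen for readability; it is not a description of how Xet itself is implemented, and the names `store_file` and `load_file` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; tiny on purpose so the example is easy to follow


def store_file(data: bytes, chunk_store: dict[str, bytes]) -> list[str]:
    """Split data into chunks, store unseen chunks, return the list of chunk hashes."""
    recipe = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        # Only content that is not already stored costs space or bandwidth.
        chunk_store.setdefault(digest, chunk)
        recipe.append(digest)
    return recipe


def load_file(recipe: list[str], chunk_store: dict[str, bytes]) -> bytes:
    """Rebuild a file by concatenating its chunks in order."""
    return b"".join(chunk_store[digest] for digest in recipe)


store: dict[str, bytes] = {}
v1 = b"AAAABBBBCCCCDDDD"
v2 = b"AAAABBBBXXXXDDDD"  # the same file with one chunk changed
recipe_v1 = store_file(v1, store)
recipe_v2 = store_file(v2, store)
```

Storing the second version adds only the one chunk that changed, which is why an updated checkpoint that shares most of its bytes with the previous one can upload quickly.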
Versioning matters because models change. Authors retrain, fix bugs, release quantized variants, or update the documentation. Git-based hubs give you commit history, branches, and diffs so a specific revision can be pinned and reproduced.
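Pinning can be illustrated with a small sketch that builds a download URL for one file at one revision. It assumes the `/resolve/<revision>/<filename>` URL layout used by the Hugging Face Hub; the repository name and commit hash are hypothetical placeholders, and `resolve_url` is an illustrative helper rather than part of any client library.

```python
from urllib.parse import quote

HUB_BASE = "https://huggingface.co"


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the download URL for one file at one revision of a model repository."""
    # The revision is fully percent-encoded so branch names containing "/" survive.
    return f"{HUB_BASE}/{repo_id}/resolve/{quote(revision, safe='')}/{quote(filename)}"


# Hypothetical repository and commit hash, for illustration only.
floating = resolve_url("example-org/example-model", "config.json")
pinned = resolve_url(
    "example-org/example-model",
    "config.json",
    revision="0123456789abcdef0123456789abcdef01234567",
)
```

The `floating` URL follows whatever the `main` branch points at today, while the `pinned` URL keeps returning the same bytes even after the author pushes new commits.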
Documentation travels with the model in the form of a model card. The concept comes from a 2018 paper, "Model Cards for Model Reporting," by Margaret Mitchell, Timnit Gebru, and colleagues at Google, presented at the 2019 ACM Conference on Fairness, Accountability, and Transparency.[3] A model card describes what the model is for, how it was trained and evaluated, its known limitations and biases, and any ethical considerations. On Hugging Face the card is just the repository's README, a Markdown file with a structured metadata block at the top that the site reads to power search and filtering.[4]
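To make the README-plus-metadata layout concrete, here is a minimal sketch that separates the metadata block from the prose. It assumes the block is delimited by `---` lines and contains flat `key: value` pairs; real cards use full YAML, including lists and nested fields, which this simplified parser deliberately skips. The field values shown are illustrative, and `split_model_card` is a hypothetical helper.

```python
def split_model_card(readme: str) -> tuple[dict[str, str], str]:
    """Return (metadata, body) for a README that starts with a '---' block."""
    lines = readme.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, readme  # no metadata block at all
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, readme  # opening delimiter was never closed
    metadata = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        # Keep only top-level "key: value" lines; skip list items and nesting.
        if sep and key.strip() and not line.startswith((" ", "-")):
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


EXAMPLE_CARD = """---
license: apache-2.0
pipeline_tag: text-classification
library_name: transformers
---
# Example model

Intended use, training data, evaluation, and limitations go here.
"""

metadata, body = split_model_card(EXAMPLE_CARD)
```

The `metadata` dictionary is the machine-readable part a site can index for search and filtering, while `body` is the prose a human reads.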
Many hubs also offer inference and demos. Some let you try a model in the browser through an embedded widget or a small hosted app (Hugging Face calls these Spaces). Others, like Replicate, are built primarily around running the model for you behind an API so you never touch the weights at all. Programmatic access through client libraries and REST endpoints is standard, which is what makes a hub usable from a training script rather than just a website.
The Hugging Face Hub is the dominant general-purpose model hub. The company was founded in 2016 by Clement Delangue, Julien Chaumond, and Thomas Wolf, originally as a consumer chatbot. It pivoted toward open machine learning tooling after its open-source library (now called Transformers) took off, helped along by a widely used PyTorch reimplementation of Google's BERT model in 2018.[5]
By its own count the Hub hosts more than two million models, alongside roughly 1.5 million datasets and 1.5 million demo apps (Spaces), all publicly browsable.[2] Repositories are Git-based, with model cards, commit history, branches, inference widgets, and over a dozen library integrations. The Hub supports gated models, where an author can require users to agree to a license or share contact details before downloading the weights. Meta's Llama releases used this mechanism, initially routing access requests through a Meta website.[6] Hugging Face also runs malware scanning on uploaded files, which matters given the security issues discussed below.
Kaggle, the data science competition platform owned by Google, introduced Kaggle Models in 2023 as a catalog of pre-trained models that plug directly into Kaggle Notebooks.[7] It hosts models from Google Research, DeepMind, and other organizations, and later opened up for community uploads. Kaggle Models is the destination that TensorFlow Hub folded into (see below), so it now carries those assets too. It sits alongside Kaggle's large collection of public datasets, which is part of the appeal: you can grab a model and the data to fine-tune it in the same place.
TensorFlow Hub was Google's repository of reusable TensorFlow modules, served from tfhub.dev. It has been wound down. Starting in November 2023, tfhub.dev links began redirecting to Kaggle Models, and unmigrated assets were deleted in March 2024.[8] The tensorflow_hub Python library still works for loading models that were originally published there, but Google now points developers at the Kaggle equivalents. In practice TensorFlow Hub no longer exists as a separate destination; it lives on inside Kaggle.
PyTorch Hub takes a lighter approach. Rather than hosting weights itself, it loads models straight from GitHub repositories. A repository adds a file called hubconf.py that defines entry points, and a user calls torch.hub.load("owner/repo", "model_name") to pull the code and (optionally) pre-trained weights, which are usually attached to a GitHub release or fetched from a URL.[9] It is less a storefront than a convention for making research models loadable in one line, and the pytorch.org/hub page curates a list of participating projects.
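A `hubconf.py` along the lines described above might look like the following sketch. The entry point name `tiny_mlp`, its architecture, and the weights URL are all hypothetical; only the `dependencies` list, the convention that entry points are plain functions returning a model, and `torch.hub.load_state_dict_from_url` come from PyTorch itself.

```python
# hubconf.py: placed at the root of a GitHub repository.

# Packages that must already be installed before any entry point is called.
dependencies = ["torch"]


def tiny_mlp(pretrained: bool = False, hidden: int = 16):
    """Entry point: build and return a small example network.

    The name `tiny_mlp` and the weights URL below are hypothetical.
    """
    import torch  # imported lazily so the file can be read without torch installed

    model = torch.nn.Sequential(
        torch.nn.Linear(4, hidden),
        torch.nn.ReLU(),
        torch.nn.Linear(hidden, 2),
    )
    if pretrained:
        # Placeholder URL; real projects usually point at a GitHub release asset.
        url = "https://example.com/tiny_mlp.pth"
        state = torch.hub.load_state_dict_from_url(url, progress=False)
        model.load_state_dict(state)
    return model
```

With this file in place, a user would call something like `torch.hub.load("owner/repo", "tiny_mlp", pretrained=False)`, and any extra keyword arguments are passed through to the entry point.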
The ONNX Model Zoo is a collection of pre-trained models in the ONNX format, the open interchange standard that lets a model move between frameworks and runtimes. It was a GitHub repository (onnx/models) covering vision, language, and other tasks. As of 2025 it is deprecated and kept for historical reference only: large file downloads through Git LFS were discontinued on July 1, 2025, and the models were moved to Hugging Face under the onnxmodelzoo account.[10] Its decline is a small illustration of how much model sharing has consolidated onto Hugging Face.
NGC is Nvidia's catalog of GPU-optimized software. It is broader than a pure model hub: it hosts containers for deep learning frameworks and HPC, Helm charts, SDKs, and more than a hundred pre-trained models tuned for Nvidia GPUs across tasks like natural language processing, vision, speech, and recommendation.[11] The models are meant to be used directly for inference or fine-tuned with transfer learning, and the catalog is aimed at enterprise deployment, with security scan reports attached to container images. It is the hub you reach for when the priority is running well on Nvidia hardware rather than browsing the open research frontier.
Replicate hosts open and custom models and runs them for you behind a cloud API, so you pay for compute by usage instead of managing GPUs. Models are packaged with Cog, an open-source tool Replicate released in 2019 that wraps a model and its dependencies into a container and generates an API server automatically.[12] In November 2025, Cloudflare announced it was acquiring Replicate, citing a catalog of more than 50,000 containerized models that it planned to integrate into its own network.[13] The pitch is deployment without infrastructure work, which is a different value proposition from a hub that just stores files.
Ollama is a tool for running large language models locally, built on top of llama.cpp, and it ships with its own curated library of models. You run a command like ollama run llama3 and the tool downloads a packaged model and serves it through a local REST API. The models are distributed in the GGUF format, a compact file layout designed for efficient local inference that supports quantization (compressing weights to 4, 5, or 8 bits) to shrink models enough to run on consumer hardware.[14] Ollama can also pull GGUF files directly from Hugging Face, so the two ecosystems overlap. This is a hub oriented entirely around local, private use rather than cloud hosting.
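Calling the local REST API can be sketched with the standard library alone. This assumes Ollama is listening on its default address, `http://localhost:11434`, and that the model has already been pulled; the helper names `build_generate_request` and `generate` are illustrative, not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Prepare a non-streaming POST to the local /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request; needs a running Ollama server with the model available."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


# Building the request needs no server; calling generate(...) does.
request = build_generate_request("llama3", "Why is the sky blue?")
```

Because everything stays on `localhost`, neither the prompt nor the response leaves the machine, which is the point of a hub oriented around local use.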
ModelScope is an open Model-as-a-Service platform launched by Alibaba's DAMO Academy in November 2022, and it is frequently described as China's counterpart to Hugging Face.[15] It hosts multimodal models spanning text, image, speech, and video, including Alibaba's own Qwen family, plus datasets with Git version control. As of October 2025 the community reported gathering over 120,000 open-source models.[16] ModelScope is the main hub for a large slice of the Chinese open model ecosystem.
Civitai is a community hub focused on image generation models, built around Stable Diffusion and similar diffusion models. Founded by Justin Maier, it hosts checkpoints, LoRAs (small add-on modules that teach a base model a specific style or subject and can be trained on consumer GPUs), and embeddings, and it includes an on-site image generator and a LoRA training tool.[17] Civitai has been at the center of ongoing debates about AI-generated explicit content and deepfakes. In 2024 and 2025 it tightened its rules, banning content based on the likeness of real people and adjusting its NSFW policies under pressure from payment processors including Mastercard and Visa.[18] It is a useful reminder that hubs are not neutral pipes; what they choose to host is a policy decision with real consequences.
The big cloud providers run their own curated model catalogs, sometimes called model gardens. These are less about open community sharing and more about giving customers a vetted set of models they can deploy inside the provider's infrastructure.
Google's Vertex AI Model Garden is a curated catalog of enterprise-ready models. It includes Google's own foundation models such as Gemini and Imagen, third-party proprietary models from partners, and open-source options, with thousands of additional models available from Hugging Face for tuning and deployment on Vertex AI.[19]
Amazon Web Services (AWS) splits this across two services. Amazon SageMaker JumpStart offers a wide selection of foundation models, including models from Hugging Face, Meta, and others, which you deploy onto your own SageMaker compute and can fine-tune. Amazon Bedrock instead serves models from providers like Anthropic, Meta, Cohere, Mistral AI, and Amazon through a serverless API, billed by usage, with the infrastructure hidden.[20] The practical difference is control versus convenience: JumpStart gives you the weights on managed compute, while Bedrock gives you an endpoint and abstracts everything behind it.
Microsoft's model catalog lives in Azure AI Foundry, which replaced Azure AI Studio in late 2024 (and has since been rebranded again under the Microsoft Foundry name). It provides access to thousands of models from Microsoft, OpenAI, Anthropic, Meta, Mistral AI, Hugging Face, Nvidia, and others, deployable either through a serverless API or on managed compute.[21]
| Hub | Operator | What it hosts | Access model |
|---|---|---|---|
| Hugging Face Hub | Hugging Face | Models, datasets, and demo apps across all tasks | Open and public; gated and private repos supported; free with paid tiers |
| Kaggle Models | Google (Kaggle) | Pre-trained models, integrated with notebooks and datasets | Free; community and first-party uploads |
| TensorFlow Hub | Google | Reusable TensorFlow modules (legacy) | Deprecated; merged into Kaggle Models |
| PyTorch Hub | PyTorch / Meta | Research models loaded from GitHub repos | Open; weights pulled from GitHub via torch.hub |
| ONNX Model Zoo | ONNX project | Pre-trained models in ONNX format (archived) | Deprecated; moved to Hugging Face |
| NGC catalog | Nvidia | GPU-optimized containers, SDKs, and pre-trained models | Free account; enterprise-oriented |
| Replicate | Replicate (acquired by Cloudflare) | Open and custom models run via API | Hosted inference; pay per use |
| Ollama library | Ollama | Local LLMs in GGUF format | Free; runs locally on your own machine |
| ModelScope | Alibaba DAMO Academy | Multimodal models and datasets | Open and public; free |
| Civitai | Civitai | Image generation checkpoints, LoRAs, embeddings | Free with creator monetization; content moderated |
| Vertex AI Model Garden | Google Cloud | First-party, partner, and open models | Cloud service; deploy within Vertex AI |
| SageMaker JumpStart / Bedrock | Amazon (AWS) | Foundation models for deployment or serverless API | Cloud service; managed compute or per-use API |
| Azure AI Foundry catalog | Microsoft | Curated models from many providers | Cloud service; serverless or managed compute |
Beyond hosting files, hubs converge on a recognizable set of features.
Model cards and metadata give each model a documentation page, usually a README with structured fields for task, language, license, and evaluation results. This metadata is what powers filtered search and lets tooling understand a model without a human reading the prose.
Popularity and usage signals such as download counts and likes help surface widely used models, similar to stars on GitHub. They are a rough proxy for trust, though a noisy one.
Gated and licensed access lets authors put conditions in front of the weights. A user may have to accept a license, agree to acceptable-use terms, or share contact information before downloading. This is how many "open weight" models that are not freely licensed get distributed, including some Llama releases.[6]
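From a script's point of view, a gate usually shows up as an authentication requirement on the download. The sketch below assumes the hub accepts a user access token in an `Authorization: Bearer` header, as the Hugging Face Hub does; the repository, file, and token are placeholders, and `gated_file_request` is a hypothetical helper.

```python
import urllib.request


def gated_file_request(url: str, token: str) -> urllib.request.Request:
    """Attach a user access token so the hub can check the gate for this account."""
    if not token:
        raise ValueError("a gated repository needs an access token")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Hypothetical repository and placeholder token, for illustration only.
req = gated_file_request(
    "https://huggingface.co/example-org/gated-model/resolve/main/config.json",
    token="hf_xxx",
)
```

The token identifies the account, so the request succeeds only if that account has already accepted the author's terms on the model page.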
Demos and inference appear in several forms: in-browser widgets, hosted apps like Hugging Face Spaces, or full inference APIs as on Replicate and the cloud catalogs. These let people evaluate a model without setting up an environment.
Programmatic access through client libraries and REST APIs is what turns a website into infrastructure. A training script can download a model, and a content moderation pipeline can manage who has access, all without anyone clicking through a browser.[6]
Hosting model weights for anyone to download raises a set of problems that the hubs are still working through.
Licensing and provenance are murky. A model on a hub may be released under a permissive license, a restrictive "open weight" license that limits commercial use, or terms that are genuinely unclear. Models are also frequently derived from other models, fine-tuned or merged, and the chain of what was trained on what is not always documented. Tracing the provenance of weights and the data behind them is an unsolved governance question.
Security of model weights is a real and active threat. Many models are distributed as Python pickle files, and the pickle format can execute arbitrary code the moment a file is deserialized. Researchers have repeatedly found malicious models on Hugging Face that abuse this to run code on a victim's machine when the model is loaded, sometimes using broken or unusual packaging to slip past automated scanners.[22] One study found that a large share of repositories still contain pickle files, including some among the most downloaded models.[23]
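The mechanism is easy to demonstrate with a harmless payload. In this sketch, the class `FakeWeights` is hypothetical and the payload only sets an environment variable; a real attack would use the same `__reduce__` hook to run something far less benign.

```python
import os
import pickle


class FakeWeights:
    """Looks like a model object, but tells pickle to call a function on load."""

    def __reduce__(self):
        # Instead of data, the pickle stream records "call exec(<this code>)".
        # The payload here is harmless: it only sets an environment variable.
        payload = "import os; os.environ['PICKLE_DEMO'] = 'code ran during load'"
        return (exec, (payload, {}))


blob = pickle.dumps(FakeWeights())   # saving runs nothing
os.environ.pop("PICKLE_DEMO", None)  # make sure the marker is absent
pickle.loads(blob)                   # loading executes the payload
marker = os.environ.get("PICKLE_DEMO")
```

Nothing in the calling code invokes the payload explicitly: deserializing the bytes is enough, which is why loading an untrusted pickle is equivalent to running an untrusted program.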
The main mitigation is a safer file format. Safetensors, developed by Hugging Face, stores only the numeric tensors and a small JSON header, with no capacity to carry executable code, so loading a safetensors file cannot run arbitrary code the way a pickle can. It also loads faster through memory mapping.[24] It has become the default for many models, although a converted model often keeps its old pickle file around for compatibility, which keeps the risk alive. Hubs also run malware scanning on uploads as a backstop, but scanning is not foolproof, as the evasion research shows.[22]
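Why this format cannot carry code becomes clearer from its layout. The sketch below assumes the layout given in the safetensors documentation: an 8-byte little-endian header length, a JSON header mapping tensor names to `dtype`, `shape`, and `data_offsets`, then the raw tensor bytes. It handles only one-dimensional float32 tensors and is meant as an illustration, not a replacement for the safetensors library.

```python
import json
import struct


def write_safetensors(tensors: dict[str, list[float]]) -> bytes:
    """Serialize 1-D float32 tensors: header length, JSON header, raw bytes."""
    header, buffer = {}, b""
    for name, values in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": [len(values)],
            "data_offsets": [len(buffer), len(buffer) + len(raw)],
        }
        buffer += raw
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + buffer


def read_safetensors(blob: bytes) -> dict[str, list[float]]:
    """Parse the layout back. Only JSON and numbers are read; nothing is called."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + header_len])
    data = blob[8 + header_len:]
    out = {}
    for name, info in header.items():
        if name == "__metadata__":
            continue  # optional string metadata, not a tensor
        begin, end = info["data_offsets"]
        count = (end - begin) // 4  # 4 bytes per float32
        out[name] = list(struct.unpack(f"<{count}f", data[begin:end]))
    return out


original = {"layer.weight": [0.5, -1.0, 2.0], "layer.bias": [0.25]}
blob = write_safetensors(original)
restored = read_safetensors(blob)
```

The reader never resolves a name to a callable, unlike a pickle loader, and the fixed offsets are what make memory-mapped loading straightforward.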
Reproducibility is a softer but persistent concern. A model's behavior depends on the exact weights, the code that runs it, the framework version, and sometimes hardware. Pinning a specific repository revision helps, which is one reason Git-based versioning matters, but fully reproducing a result from a downloaded model is harder than it looks.
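One inexpensive safeguard for the "exact weights" part is to record a checksum of the file that produced a result and check it again before reuse. The sketch below uses a small temporary file as a stand-in for real weights; the helper names are hypothetical, and a checksum says nothing about the code, framework version, or hardware.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Hash a file in blocks so large weight files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_weights(path: Path, expected_sha256: str) -> bool:
    """True only if the file on disk is byte-for-byte the one that was recorded."""
    return sha256_of(path) == expected_sha256.lower()


# Stand-in "weights" file so the sketch runs without downloading anything.
with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp) / "model.bin"
    weights.write_bytes(b"\x00\x01\x02\x03")
    recorded = sha256_of(weights)  # store this next to the experiment's results
    unchanged = verify_weights(weights, recorded)
    weights.write_bytes(b"\x00\x01\x02\x04")  # simulate a silently updated file
    changed = not verify_weights(weights, recorded)
```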
Content moderation falls on hubs once they grow large enough to attract misuse. Civitai's repeated policy changes around explicit content and deepfakes are the clearest example, but any hub that lets the public upload eventually has to decide what it will and will not host.[18] These are editorial choices dressed up as infrastructure, and they shape what the open model ecosystem actually contains.