# Static

> Source: https://aiwiki.ai/wiki/static
> Updated: 2026-06-27
> Categories: Machine Learning
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

*See also: [Machine learning terms](/wiki/machine_learning_terms)*

## What does static mean in machine learning?

In [machine learning](/wiki/machine_learning), **static** means **offline** (also called **batch**): the model is trained once on a fixed dataset, or its predictions are computed in advance and cached, rather than being updated or generated continuously as new data and requests arrive. Google's Machine Learning Glossary uses static as a synonym for offline in three closely related places: a static model is one that is "trained offline and does not change" [1]; static training is the act of training that model once on a fixed dataset; and static inference (also called offline inference or batch inference) is "the process of a model generating a batch of predictions and then caching (saving) those predictions" [1]. The opposite term in every case is [dynamic](/wiki/dynamic), which covers continuously retrained models, online training, and on-demand inference.

The label is useful because production ML systems must answer two separate questions about the time dimension: when does the model learn, and when does it produce predictions. Static answers "in advance" for both. Dynamic answers "as data arrives" or "as requests arrive." Most real systems mix the two, but the static side of the line gets a lot of work done at lower operational cost, which is why it remains the default starting point for many teams.

## What is static (offline) training?

Static training, also called offline training, means the model is fit once on a fixed snapshot of data and then deployed. As Google's Machine Learning Crash Course puts it, "Static training (also called offline training) means that you train a model only once. You then serve that same trained model for a while." [2] There is no online update loop. If the team wants the model to reflect newer data, someone schedules a fresh [training](/wiki/training) run, often weekly or monthly, and replaces the deployed artifact. Between those runs the parameters do not move.
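The train-once, serve-unchanged cycle can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the tiny least-squares model, the function names (`fit_line`, `train_static`, `predict`), the synthetic snapshot, and the `"v1"` version label are all assumptions made for the example.

```python
import json
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept on a fixed dataset."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": y_bar - slope * x_bar}

def train_static(snapshot, version):
    """Train once on a frozen snapshot and return a serialisable artifact."""
    xs, ys = zip(*snapshot)
    return json.dumps({"version": version, "params": fit_line(xs, ys)})

def predict(artifact_json, x):
    """Serving only reads the artifact; it never updates the parameters."""
    params = json.loads(artifact_json)["params"]
    return params["slope"] * x + params["intercept"]

# Hypothetical snapshot: y = 2x + 1 plus a little noise.
rng = random.Random(0)
snapshot = [(x, 2 * x + 1 + rng.gauss(0, 0.1)) for x in range(50)]

artifact = train_static(snapshot, version="v1")  # the scheduled training run
prediction = predict(artifact, 10)               # served from the frozen artifact
```

Reflecting newer data means calling `train_static` again on a newer snapshot and swapping the artifact; nothing in `predict` changes the parameters in between.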

The advantages of static training are mostly practical:

- The pipeline only has to work once per training cycle, so you can verify the model carefully before it goes into production. That includes offline evaluation on a held-out set, fairness checks, calibration plots, and red-team probes that would be slow or risky to run on a moving target.
- Operational overhead is lower. There is no streaming training infrastructure to babysit, no checkpoint rollbacks to design, and no continuous monitoring of the training job itself.
- It is cheaper. Compute is paid for in scheduled bursts rather than as a constant background expense.
- Reproducibility is easier. A static model is just a file, plus the data and code that produced it. You can rebuild it, diff it against an earlier version, and audit it.

The trap is staleness. The world the model was trained on stops matching the world the model has to predict. Google's textbook example is a model that estimates the probability a user will buy flowers. "Because of time pressure, the model is trained only once using a dataset of flower buying behavior during July and August. The model works fine for several months but then makes terrible predictions around Valentine's Day because user behavior during that floral holiday period changes dramatically." [2] The same pattern shows up with snow shovel sales, holiday travel, fashion trends, and almost anything where seasonality, news events, or new product launches matter.

For that reason, static training is a good fit when:

- The distribution of inputs is genuinely stable, or at least stable over the horizon between retrains.
- The cost of being slightly out of date is small.
- Strict pre-release verification matters more than freshness, for example in regulated domains or safety-critical systems.

It is a bad fit when behavior is seasonal, when an adversary is actively trying to game the model (spam, fraud), or when the catalog of items the model has to score changes daily. In those cases a [dynamic model](/wiki/dynamic) with online or near-online training tends to win, even though it costs more to run.

Even with static training, you still need to monitor the inputs at serving time. The model is frozen, but the data is not, and the cheapest sign that a static model has gone stale is a drift in the distribution of features it sees in production.
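One common way to quantify that kind of input drift is the population stability index (PSI), which compares a feature's histogram at training time with its histogram in production. The sketch below is one possible implementation under stated assumptions: equal-width bins taken from the training range, a small smoothing constant to avoid `log(0)`, and synthetic data. The frequently quoted alert threshold of about 0.25 is a rule of thumb, not something the source prescribes.

```python
import math

def psi(expected, actual, bins=10):
    """Population stability index between a training sample and a serving sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def histogram(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        eps = 1e-6  # smoothing so empty bins do not produce log(0)
        return [(c + eps) / (len(values) + bins * eps) for c in counts]

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical feature values: training data versus two serving windows.
training = [i / 100 for i in range(1000)]
same_window = list(training)
shifted_window = [v + 5 for v in training]

psi_same = psi(training, same_window)        # identical distributions
psi_shifted = psi(training, shifted_window)  # strongly shifted inputs
```

A check like this runs on serving logs only, so it needs no labels, which is why it tends to be the cheapest early warning for a frozen model.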

## What is static (offline, batch) inference?

Static inference is the prediction side of the same idea. Instead of running the model in response to a user request, you run it ahead of time on a known set of inputs, write the results to a key-value store or a database, and then have the application look up the answer when it needs one. Google's glossary calls this "the process of a model generating a batch of predictions and then caching (saving) those predictions." [1] The closely related glossary entry for batch inference defines it as "the process of inferring predictions on multiple unlabeled examples divided into smaller subsets ('batches')" and notes that it "can take advantage of the parallelization features of accelerator chips," which is part of why precomputing answers offline is so cost-efficient. [1]

The canonical example is a recommendation system. A streaming service can compute a top-100 list of likely titles for every active user every night, write those 100 IDs per user into a fast key-value store, and serve them from cache when the user opens the app. The model itself never runs at request time. The user sees a sub-100-millisecond response because the answer was already there.
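A toy version of that nightly job might look like the following. It is only a sketch: the dot-product scorer, the two-dimensional embeddings, the user and title names, and the in-memory dictionary standing in for the key-value store are all hypothetical.

```python
def score(user_vec, item_vec):
    """Dot-product affinity between a user embedding and an item embedding."""
    return sum(u * i for u, i in zip(user_vec, item_vec))

def build_cache(user_vecs, item_vecs, top_k):
    """Batch job: score every item for every known user and keep the top_k IDs."""
    cache = {}
    for user_id, user_vec in user_vecs.items():
        ranked = sorted(
            item_vecs,
            key=lambda item_id: score(user_vec, item_vecs[item_id]),
            reverse=True,
        )
        cache[user_id] = ranked[:top_k]
    return cache

def serve(cache, user_id):
    """Request path: a plain lookup; the model is never invoked here."""
    return cache.get(user_id, [])

# Hypothetical embeddings for two users and three titles.
users = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
items = {"drama": [0.9, 0.1], "comedy": [0.2, 0.8], "thriller": [0.6, 0.4]}

nightly_cache = build_cache(users, items, top_k=2)
alice_recs = serve(nightly_cache, "alice")      # ['drama', 'thriller']
new_user_recs = serve(nightly_cache, "carol")   # [] because carol was not in the batch
```

The empty result for the unseen user is the coverage gap that the limitations below describe.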

Netflix has described a hybrid version of this in detail. Its transformer-based Foundation Model is pretrained from scratch about once a month and then fine-tuned every day on the latest data; a daily batch inference job then refreshes the profile and item embeddings and publishes them to an Embedding Store, where downstream models read them as features or for candidate generation. [8] A smaller online model re-ranks the precomputed candidates using the user's current session context. The heavy work is static; only the final personalization step is dynamic. This pattern is common across recommender systems, lead scoring pipelines, churn prediction for marketing, and any workload where the population of users or items to score is mostly known in advance.

### What are the advantages of static inference?

- Latency at serving time is essentially the latency of a cache lookup. The expensive model never runs on the request path.
- The cost of [inference](/wiki/inference) becomes a scheduling problem rather than a scaling problem. Google's crash course lists "don't need to worry much about cost of inference" as the first advantage of static inference. [3] You can run the batch on cheap, off-peak compute and size for throughput instead of for tail latency.
- You can verify the predictions before exposing them. Google calls this the ability to "do post-verification of predictions before pushing." [3] That includes spot checking, fairness audits, content policy filters, and removing obvious mistakes. With dynamic inference there is no equivalent moment to inspect the output before users see it.
- Failure modes are easier to reason about. If the model misbehaves, the cache holds yesterday's predictions and the application keeps working; you can roll the cache back without redeploying a service.
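The verify-before-publish and rollback points above can be illustrated with a small versioned cache. This is a sketch under assumptions: `VersionedCache`, its methods, and the two checks are hypothetical names invented for the example, and a real system would typically keep versions in a database or object store rather than in memory.

```python
class VersionedCache:
    """Keeps every published batch so the serving view can be rolled back."""

    def __init__(self):
        self._versions = []  # published batches, oldest first

    def publish(self, batch, checks):
        """Run every check; publish only if all pass. Returns failed check names."""
        failures = [name for name, check in checks.items() if not check(batch)]
        if not failures:
            self._versions.append(dict(batch))
        return failures

    def rollback(self):
        """Drop the newest batch, falling back to the previous one if it exists."""
        if len(self._versions) > 1:
            self._versions.pop()

    def get(self, key):
        """Serve from the newest published batch."""
        if not self._versions:
            return None
        return self._versions[-1].get(key)

# Hypothetical post-verification checks on a batch of probability scores.
checks = {
    "non_empty": lambda batch: len(batch) > 0,
    "scores_in_range": lambda batch: all(0.0 <= s <= 1.0 for s in batch.values()),
}

cache = VersionedCache()
cache.publish({"user_1": 0.82, "user_2": 0.10}, checks)  # passes, becomes live
rejected = cache.publish({"user_1": 7.5}, checks)        # fails, never goes live
still_served = cache.get("user_1")                       # yesterday's value
```

A rejected batch leaves the previous predictions in place, so the application keeps answering from the last good version.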

### What are the limitations of static inference?

- You can only serve predictions you precomputed. As Google puts it, "the system might not be able to serve predictions for uncommon input examples." [3] If the input is unusual, for example a brand new user, a brand new product, or a query no one has run before, the cache has nothing for it. This is the long tail problem, and it is the reason pure static inference does not work for systems that have to handle arbitrary queries.
- Freshness is bounded by the batch cadence. Google notes that for static inference "update latency is likely measured in hours or days." [3] If the batch runs nightly, the predictions can be up to 24 hours old. For many systems that is fine. For ad ranking, fraud detection, or any system reacting to a news cycle, it is not.
- The space of inputs you have to enumerate can be huge. Computing predictions for 200 million users times 100 candidate items per user is 20 billion predictions per run, and that storage and compute budget has to be planned for.
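The arithmetic in the last point is easy to turn into a rough planning calculation. In the sketch below, only the 200 million users and 100 items per user come from the text; the 12 bytes per prediction (an 8-byte item ID plus a 4-byte score) and the throughput of one million predictions per second are illustrative assumptions.

```python
def batch_budget(num_users, items_per_user, bytes_per_prediction, predictions_per_second):
    """Back-of-the-envelope size and duration of one full batch inference run."""
    total = num_users * items_per_user
    return {
        "predictions": total,
        "storage_gb": total * bytes_per_prediction / 1e9,
        "hours": total / predictions_per_second / 3600,
    }

budget = batch_budget(
    num_users=200_000_000,
    items_per_user=100,
    bytes_per_prediction=12,            # assumed: 8-byte item ID + 4-byte score
    predictions_per_second=1_000_000,   # assumed aggregate throughput
)
```

Under these assumptions a run produces 20 billion predictions, about 240 GB of raw output, and takes roughly five and a half hours, which already consumes a noticeable share of a nightly window.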

Many teams fall back to a hybrid: use static inference for the common cases, and call the model dynamically for the long tail. That gives you the cost profile of batch with a safety net for inputs the batch did not anticipate.
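The hybrid can be as simple as a cache lookup with a model call on a miss. The following is a minimal sketch; `make_hybrid_predictor`, the dictionary cache, and the stand-in `model_fn` are hypothetical, and a real fallback would call a deployed model service.

```python
def make_hybrid_predictor(cache, model_fn):
    """Return a predict function that prefers the cache and falls back to the model."""
    stats = {"hits": 0, "misses": 0}

    def predict(key):
        if key in cache:
            stats["hits"] += 1
            return cache[key]   # static path: precomputed answer
        stats["misses"] += 1
        return model_fn(key)    # dynamic path: run the model on demand

    return predict, stats

# Hypothetical precomputed scores and a stand-in for the live model.
precomputed = {"user_1": 0.9, "user_2": 0.3}
predict, stats = make_hybrid_predictor(precomputed, model_fn=lambda key: 0.5)

common_case = predict("user_1")      # served from cache
long_tail = predict("brand_new")     # computed at request time
```

Tracking the miss rate, as `stats` does here, shows how much traffic actually pays the dynamic cost and whether the batch should cover more inputs.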

## What is a static feature?

A related but narrower use of the word: a **static feature** is an input attribute that does not change, or changes only rarely, over the life of an example. Birth year, country of registration, manufacturer, and product category are static in a way that current location, recent click history, or live price are not. Static features are usually cheaper to store and less prone to training-serving skew, because their values at inference time match the values that were available at training time. Many feature stores explicitly separate static (or slowly changing) features from streaming features for this reason.

## What is the difference between static and dynamic in machine learning?

| Dimension | Static (offline, batch) | Dynamic (online, on-demand) |
| --- | --- | --- |
| Training | One fit on a fixed dataset, repeated on a schedule | Continuous or frequent updates as new data arrives |
| Inference | Precomputed predictions, served from cache | Model runs at request time |
| Operational cost | Lower; scheduled compute and simpler pipelines | Higher; always on training and serving infrastructure |
| Latency at serving | Cache lookup latency | Bounded by model forward pass and request handling |
| Freshness | Stale between batches (update latency in hours or days) | Reflects very recent data |
| Verification before deploy | Easy; the artifact is fixed | Harder; the model and the data are both moving |
| Long tail inputs | Poor; only cached cases are covered | Good; the model can score anything |
| Typical risks | Concept drift, seasonality, novelty | Training instability, feedback loops, drift in the loss |

Most systems do not sit cleanly on one side. The point is to make the trade-off explicit: static buys you stability and lower running costs, at the price of freshness and coverage. Dynamic gives you freshness and coverage, at the price of complexity.

## How do you handle staleness and drift, and when should you retrain?

The central failure mode of static training is **[concept drift](/wiki/concept_drift)**: the statistical relationship between inputs and the label changes over time, so a model that was accurate last quarter starts making worse predictions now. Drift can be gradual (slow shifts in user behavior), seasonal (the flower example), or sudden (a new competitor product, a policy change, a viral news story).

The usual operational response to drift in a static setup is some combination of:

- Monitoring input distributions and prediction distributions in production, and alerting when they diverge from the training data.
- Tracking a holdout of labeled production data so the team can compute live accuracy or AUC and watch it trend.
- Scheduling regular retrains, weekly or monthly being common defaults, and treating each retrain as a small release with its own evaluation and rollback plan.
- Adding features that explicitly encode the time of year, the day of week, or other cyclic signals, so the static model does not have to re-learn seasonality every cycle.
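For the last item, a widely used encoding maps each cyclic quantity onto a circle with a sine and cosine pair, so that December 31 and January 1 end up close together instead of at opposite ends of a 1 to 365 scale. The sketch below assumes a 365.25-day year and hypothetical feature names.

```python
import math
from datetime import date

def cyclic_features(d):
    """Encode day of year and day of week as points on the unit circle."""
    year_angle = 2 * math.pi * (d.timetuple().tm_yday - 1) / 365.25
    week_angle = 2 * math.pi * d.weekday() / 7
    return {
        "doy_sin": math.sin(year_angle),
        "doy_cos": math.cos(year_angle),
        "dow_sin": math.sin(week_angle),
        "dow_cos": math.cos(week_angle),
    }

def seasonal_distance(a, b):
    """Euclidean distance between two dates in the day-of-year encoding."""
    fa, fb = cyclic_features(a), cyclic_features(b)
    return math.hypot(fa["doy_sin"] - fb["doy_sin"], fa["doy_cos"] - fb["doy_cos"])

new_years_gap = seasonal_distance(date(2023, 12, 31), date(2024, 1, 1))  # adjacent days
half_year_gap = seasonal_distance(date(2024, 1, 1), date(2024, 7, 1))    # opposite seasons
```

With features like these, a model trained on a full year of data can carry a holiday effect into the next year without waiting for a retrain to rediscover it.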

This is where most production ML lives. The model is technically static between releases, but the system around it is doing a lot of dynamic work to keep it honest. When the cost of that monitoring rivals the cost of just retraining online, teams usually switch to a [dynamic model](/wiki/dynamic).

## When should you choose static?

Google's crash course gives a clean default: "If your dataset truly isn't changing over time, choose static training because it is cheaper to create and maintain than dynamic training." [2] The same logic applies to inference. If the set of inputs is small, stable, and known in advance, precompute the answers and serve from cache.

Good candidates for a fully static stack:

- Demographic or actuarial models where behavior changes slowly and audits are common.
- Lead scoring batches that only need to run overnight to feed a sales team the next morning.
- Recommendation backbones where the catalog turns over slowly and a thin online layer handles personalization.
- Content moderation classifiers retrained quarterly as new policy categories appear.

Good candidates for going dynamic instead:

- Adversarial inputs (fraud, spam, abuse), where the distribution shifts because the other side is reacting to your model.
- Real time bidding, ranking, and search, where freshness is the product.
- News feeds, where much of the catalog did not exist an hour ago.
- Systems where novel inputs are common and a cache miss is unacceptable.

Most production systems are a mix. Static where you can, dynamic where you must.

## Explain like I'm 5 (ELI5)

Imagine the cafeteria is going to serve lunch to the whole school. There are two ways to do it. One way is to figure out yesterday what every kid likes and put their tray together in advance, so when they walk up the food is already there. That is static. It is fast at lunch time, but if a new kid shows up, or someone changes their mind, the tray is wrong. The other way is to wait until each kid is at the counter and make their plate then. That is dynamic. It handles surprises, but the line moves slower and the cafeteria has to be busy the whole time. Static machine learning does the cafeteria's prep work the night before: train the model once, or precompute the answers once, and serve them out of a cache the next day.

## References

1. Google for Developers, Machine Learning Glossary. https://developers.google.com/machine-learning/glossary
2. Google for Developers, Machine Learning Crash Course, Production ML systems: Static versus dynamic training. https://developers.google.com/machine-learning/crash-course/production-ml-systems/static-vs-dynamic-training
3. Google for Developers, Machine Learning Crash Course, Production ML systems: Static versus dynamic inference. https://developers.google.com/machine-learning/crash-course/production-ml-systems/static-vs-dynamic-inference
4. Google for Developers, Machine Learning Glossary: ML Fundamentals. https://developers.google.com/machine-learning/glossary/fundamentals
5. Wikipedia, Concept drift. https://en.wikipedia.org/wiki/Concept_drift
6. IBM, What is model drift? https://www.ibm.com/think/topics/model-drift
7. Google Cloud, What is batch inference? https://cloud.google.com/discover/what-is-batch-inference
8. Netflix Technology Blog, Integrating Netflix's Foundation Model into Personalization applications. https://netflixtechblog.medium.com/integrating-netflixs-foundation-model-into-personalization-applications-cf176b5860eb