Static
Last reviewed
May 11, 2026
Sources
8 citations
Review status
Source-backed
Revision
v2 · 2,203 words
See also: Machine learning terms
In machine learning, the word static almost always means offline, or batch. Google's Machine Learning Glossary uses it as a synonym for offline in three closely related places: a static model is one that is trained offline and does not change; static training is the act of training that model once on a fixed dataset; and static inference (also called offline inference or batch inference) is the practice of generating predictions ahead of time and caching them for later lookup. The opposite term in every case is dynamic, which covers continuously retrained models, online training, and on-demand inference.
The label is useful because production ML systems must answer two separate questions about the time dimension: when does the model learn, and when does it produce predictions. Static answers "in advance" for both. Dynamic answers "as data arrives" or "as requests arrive." Most real systems mix the two, but the static side of the line gets a lot of work done at lower operational cost, which is why it remains the default starting point for many teams.
Static training, also called offline training, means the model is fit once on a fixed snapshot of data and then deployed. There is no online update loop. If the team wants the model to reflect newer data, someone schedules a fresh training run, often weekly or monthly, and replaces the deployed artifact. Between those runs the parameters do not move.
Google's own framing of this is blunt: "you train a model only once. You then serve that same trained model for a while."
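As a rough illustration of "train once, then serve the frozen artifact", the sketch below fits a one-variable least-squares line on a fixed snapshot and serializes the parameters. The data, the `fit_line` and `predict` helpers, and the JSON artifact format are all assumptions made for this example, not part of any particular framework.

```python
import json

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}

# Fixed training snapshot (made-up numbers for illustration).
snapshot_x = [1.0, 2.0, 3.0, 4.0]
snapshot_y = [3.0, 5.0, 7.0, 9.0]

# Train once, then freeze the parameters as a serialized artifact.
artifact = json.dumps(fit_line(snapshot_x, snapshot_y))

def predict(artifact_json, x):
    """Serve from the frozen artifact; nothing here updates the parameters."""
    params = json.loads(artifact_json)
    return params["slope"] * x + params["intercept"]

prediction = predict(artifact, 10.0)
```

Refreshing the model in this setup means rerunning `fit_line` on a newer snapshot and swapping the artifact; the serving function itself never changes.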
The advantages of static training are mostly practical:
The trap is staleness. The world the model was trained on stops matching the world the model has to predict. Google's textbook example is a model that estimates the probability a user will buy flowers, trained only on data from July and August. The model behaves reasonably for several months and then makes terrible predictions in February, because Valentine's Day flips user behavior in a way the training data never saw. The same pattern shows up with snow shovel sales, holiday travel, fashion trends, and almost anything where seasonality, news events, or new product launches matter.
For that reason, static training is a good fit when:
It is a bad fit when behavior is seasonal, when an adversary is actively trying to game the model (spam, fraud), or when the catalog of items the model has to score changes daily. In those cases a dynamic model with online or near-online training tends to win, even though it costs more to run.
Even with static training, you still need to monitor the inputs at serving time. The model is frozen, but the data is not, and the cheapest sign that a static model has gone stale is a drift in the distribution of features it sees in production.
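One common way to quantify that kind of input drift is the population stability index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. The sketch below is a minimal version; the `psi` helper, the bin edges, the sample values, and the alert threshold are illustrative assumptions rather than recommended settings.

```python
import math

def psi(expected, actual, edges):
    """Population stability index between two samples over shared, sorted bin edges."""
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index = number of edges the value has reached or passed.
            idx = sum(1 for e in edges if v >= e)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = shares(expected), shares(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

# Made-up feature values: training snapshot vs. what serving sees later.
train_sample = [1, 2, 3, 4, 5, 6, 7, 8]
live_sample = [5, 6, 7, 8, 9, 10, 11, 12]
bin_edges = [3, 6]

ALERT_THRESHOLD = 0.25  # hypothetical cutoff; tune per feature
drift_score = psi(train_sample, live_sample, bin_edges)
is_drifting = drift_score > ALERT_THRESHOLD
```

A check like this needs no labels, which is why it tends to fire earlier than any accuracy-based alarm.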
Static inference is the prediction side of the same idea. Instead of running the model in response to a user request, you run it ahead of time on a known set of inputs, write the results to a key-value store or a database, and then have the application look up the answer when it needs one. Google's glossary calls this "the process of a model generating a batch of predictions and then caching (saving) those predictions."
The canonical example is a recommendation system. A streaming service can compute a top-100 list of likely titles for every active user every night, write those 100 IDs per user into a fast key-value store, and serve them from cache when the user opens the app. The model itself never runs at request time. The user sees a sub-100-millisecond response because the answer was already there.
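The nightly-batch pattern can be sketched in a few lines. Here a plain dictionary stands in for the key-value store and a dot product stands in for the model; the user and item vectors, `nightly_batch`, and `serve` are hypothetical names and data chosen for the example.

```python
def score(user_vec, item_vec):
    """Stand-in model: dot product of a user vector and an item vector."""
    return sum(u * i for u, i in zip(user_vec, item_vec))

def nightly_batch(users, items, top_n):
    """Run the model for every known user and keep the top_n item IDs."""
    cache = {}
    for user_id, user_vec in users.items():
        ranked = sorted(
            items,
            key=lambda item_id: score(user_vec, items[item_id]),
            reverse=True,
        )
        cache[user_id] = ranked[:top_n]
    return cache

# Made-up users and items for illustration.
users = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
items = {"a": [0.9, 0.1], "b": [0.2, 0.8], "c": [0.5, 0.5]}

# Offline step: precompute and store the predictions.
prediction_cache = nightly_batch(users, items, top_n=2)

def serve(user_id):
    """Request path: a dictionary lookup, with no model call."""
    return prediction_cache.get(user_id)
```

Note that `serve` returns `None` for a user the batch never saw, which is the coverage gap that static inference trades for its low latency.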
Netflix has described a hybrid version of this in detail. Candidate items are generated in batch, embeddings are refreshed by nightly batch inference jobs, and then a smaller online model re-ranks the precomputed candidates using the user's current session context. The heavy work is static; only the final personalization step is dynamic. This pattern is common across recommender systems, lead scoring pipelines, churn prediction for marketing, and any workload where the population of users or items to score is mostly known in advance.
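The general shape of "batch candidates, online re-rank" can be illustrated as follows. This is a generic toy sketch and not a description of Netflix's system: the candidate scores, the genre lookup, the `rerank` function, and the fixed `boost` are all assumptions made for the example.

```python
# Precomputed by a batch job (hypothetical data): candidate IDs with batch scores.
batch_candidates = {
    "u1": [("drama_1", 0.9), ("comedy_1", 0.8), ("drama_2", 0.7)],
}
item_genre = {"drama_1": "drama", "comedy_1": "comedy", "drama_2": "drama"}

def rerank(user_id, session_genre, boost=0.3):
    """Re-order cached candidates using the genre the user is browsing right now."""
    candidates = batch_candidates.get(user_id, [])
    adjusted = [
        (item_id, base + (boost if item_genre[item_id] == session_genre else 0.0))
        for item_id, base in candidates
    ]
    # Sort by the adjusted score, highest first.
    adjusted.sort(key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in adjusted]
```

The request-time work touches only the handful of cached candidates, so it stays cheap even when the batch stage scored a large catalog.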
Many teams fall back to a hybrid: use static inference for the common cases, and call the model dynamically for the long tail. That gives you the cost profile of batch with a safety net for inputs the batch did not anticipate.
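A minimal sketch of that fallback, assuming an in-memory dictionary as the cache and a placeholder function as the model (both hypothetical):

```python
cache = {"common_query": 0.82}  # filled by a batch job (hypothetical value)
live_calls = []                 # records which keys needed a live model call

def model(key):
    """Stand-in for an expensive model call at request time."""
    live_calls.append(key)
    return len(key) / 100.0

def predict(key):
    """Serve from the precomputed cache; fall back to the model on a miss."""
    if key in cache:
        return cache[key]
    result = model(key)
    cache[key] = result  # optional: remember long-tail answers for next time
    return result
```

Whether to write fallback results back into the cache is a design choice: it saves repeated model calls, but those entries then go stale on their own schedule unless the next batch run overwrites them.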
A related but narrower use of the word: a static feature is an input attribute that does not change, or changes only rarely, over the life of an example. Birth year, country of registration, manufacturer, and product category are static in a way that current location, recent click history, or live price are not. Static features are usually cheaper to store and less prone to leakage in training, because their values at inference time match the values that were available at training time. Many feature stores explicitly separate static (or slowly changing) features from streaming features for this reason.
| Dimension | Static (offline, batch) | Dynamic (online, on-demand) |
|---|---|---|
| Training | One fit on a fixed dataset, repeated on a schedule | Continuous or frequent updates as new data arrives |
| Inference | Precomputed predictions, served from cache | Model runs at request time |
| Operational cost | Lower; scheduled compute and simpler pipelines | Higher; always-on training and serving infrastructure |
| Latency at serving | Cache lookup latency | Bounded by model forward pass and request handling |
| Freshness | Stale between batches | Reflects very recent data |
| Verification before deploy | Easy; the artifact is fixed | Harder; the model and the data are both moving |
| Long-tail inputs | Poor; only cached cases are covered | Good; the model can score anything |
| Typical risks | Concept drift, seasonality, novelty | Training instability, feedback loops, drift in the loss |
Most systems do not sit cleanly on one side. The point is to make the trade-off explicit: static buys you stability and lower running costs, at the price of freshness and coverage. Dynamic gives you freshness and coverage, at the price of complexity.
The central failure mode of static training is concept drift: the statistical relationship between inputs and the label changes over time, so a model that was accurate last quarter starts making worse predictions now. Drift can be gradual (slow shifts in user behavior), seasonal (the flower example), or sudden (a new competitor product, a policy change, a viral news story).
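Because concept drift shows up as a change in the input-to-label relationship, it is usually detected by comparing recent prediction quality against a baseline once true labels arrive. The sketch below assumes a classification task; `error_rate`, `needs_retrain`, and the `tolerance` value are hypothetical choices for illustration.

```python
def error_rate(predictions, labels):
    """Fraction of predictions that disagree with the true labels."""
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)

def needs_retrain(baseline_error, recent_preds, recent_labels, tolerance=0.05):
    """Flag the model when recent error exceeds the baseline by more than tolerance."""
    return error_rate(recent_preds, recent_labels) > baseline_error + tolerance
```

For example, with a baseline error of 0.10, a recent window where half the predictions are wrong would trigger the flag, while a window with no errors would not. The check depends on labels being available, so it lags behind the events that caused the drift.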
The usual operational response to drift in a static setup is some combination of:
This is where most production ML lives. The model is technically static between releases, but the system around it is doing a lot of dynamic work to keep it honest. When the cost of that surveillance rivals the cost of just retraining online, teams usually flip to a dynamic model.
Google's crash course gives a clean default: if your dataset truly is not changing over time, choose static training because it is cheaper to create and maintain than dynamic training. The same logic applies to inference. If the set of inputs is small, stable, and known in advance, precompute the answers and serve from cache.
Good candidates for a fully static stack:
Good candidates for going dynamic instead:
Most production systems are a mix. Static where you can, dynamic where you must.
Imagine the cafeteria is going to serve lunch to the whole school. There are two ways to do it. One way is to figure out yesterday what every kid likes and put their tray together in advance, so when they walk up the food is already there. That is static. It is fast at lunch time, but if a new kid shows up, or someone changes their mind, the tray is wrong. The other way is to wait until each kid is at the counter and make their plate then. That is dynamic. It handles surprises, but the line moves slower and the cafeteria has to be busy the whole time. Static machine learning does the cafeteria's prep work the night before: train the model once, or precompute the answers once, and serve them out of a cache the next day.