# Summary

> Source: https://aiwiki.ai/wiki/summary
> Updated: 2026-05-16
> Categories: Machine Learning
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

*See also: [TensorFlow](/wiki/tensorflow), [TensorBoard](/wiki/tensorboard), [Machine learning terms](/wiki/machine_learning_terms)*

In [TensorFlow](/wiki/tensorflow), a **summary** is a piece of data written to disk during training so that it can later be visualized in [TensorBoard](/wiki/tensorboard). The `tf.summary` module is the API used to record those values. A scalar loss, the distribution of weights in a layer, a batch of generated images, a snippet of synthesized audio, or a block of model configuration text can all be turned into summaries and streamed to an event file as training proceeds. TensorBoard then reads that event file and renders the data as interactive charts, image grids, audio players, histograms, and graphs.

The word "summary" in this context is narrow and technical. It does not refer to text summarization or to a printout of a model with `model.summary()`. It refers to a serialized record, written by `tf.summary.scalar`, `tf.summary.histogram`, `tf.summary.image`, `tf.summary.audio`, `tf.summary.text`, or related functions, that TensorBoard knows how to display.

## Purpose

Training a neural network can take hours or days, and most of the interesting behavior lives inside the loop: how the loss curves bend, whether gradients explode, what the model is generating on the validation set, how learning rates decay. Reading numbers from `print` calls is fine for toy problems and miserable for anything else. Summaries solve this by giving you a structured, time stamped log that a separate visualization tool can render while the run is still in progress.

A few of the things teams use `tf.summary` for in practice:

* Tracking training and validation loss, accuracy, and other scalar metrics over steps or epochs.
* Watching weight, bias, gradient, and activation distributions evolve, which makes vanishing or exploding gradients visible early.
* Inspecting predicted images from a generative model side by side with real samples after each epoch.
* Logging audio samples produced by a text to speech or music model so a human can listen to progress.
* Recording model configuration, hyperparameters, and run notes as markdown so the run is self documenting.
* Profiling computation graphs and op level execution to find slow kernels and device transfers.
* Visualising attention maps, segmentation masks, or feature heatmaps alongside the underlying input so model behaviour can be inspected qualitatively.
* Storing 3D point clouds, meshes, and embedding projections that go beyond what a single line chart can communicate.

The broader pattern that `tf.summary` belongs to is sometimes called **experiment instrumentation**. The training script emits structured events; a separate viewer process subscribes to those events. Coupling is loose, the writer never has to know about the dashboard, and the dashboard never has to interrupt training. That separation is what makes the workflow scalable to long jobs on remote machines.

## Core API

The `tf.summary` module in TensorFlow 2 exposes a small set of writer functions. Each one accepts a `name`, the `data` to record, and an integer `step` that places the value in time. The exact signature of the scalar writer is `tf.summary.scalar(name, data, step=None, description=None)`, and the other writers follow the same shape.

| Function | Signature | What it writes |
|---|---|---|
| `tf.summary.scalar` | `(name, data, step=None, description=None)` | A single floating point number per call. Used for loss, accuracy, learning rate, gradient norm. |
| `tf.summary.histogram` | `(name, data, step=None, buckets=None, description=None)` | A tensor binned into a histogram for distribution analysis of weights, biases, gradients, activations. |
| `tf.summary.image` | `(name, data, step=None, max_outputs=3, description=None)` | One or more images, shaped `[k, h, w, c]` with `c` of 1, 2, 3, or 4 (grayscale, grayscale with alpha, RGB, RGBA). |
| `tf.summary.audio` | `(name, data, sample_rate, step=None, max_outputs=3, encoding=None, description=None)` | One or more audio clips with a sample rate. `wav` is the only encoding currently supported. |
| `tf.summary.text` | `(name, data, step=None, description=None)` | A string or string tensor, rendered as markdown in TensorBoard. |
| `tf.summary.write` | `(tag, tensor, step=None, metadata=None, name=None)` | Low level escape hatch for writing an arbitrary tensor and SummaryMetadata. |
| `tf.summary.create_file_writer` | `(logdir, max_queue=None, flush_millis=None, filename_suffix=None, name=None, experimental_trackable=False)` | Returns a `SummaryWriter` bound to a log directory. |
| `tf.summary.create_noop_writer` | `()` | Returns a writer that drops everything, useful for disabling logging from a single replica. |
| `tf.summary.flush` | `(writer=None, name=None)` | Forces buffered events to disk. |
| `tf.summary.record_if` | `(condition)` | Context manager that gates whether subsequent ops record. |
| `tf.summary.should_record_summaries` | `()` | Returns the current value of the recording condition as a `tf.bool` tensor. |
| `tf.summary.trace_on` / `trace_off` / `trace_export` | `trace_on(graph=True, profiler=False)` and friends | Captures the computational graph and profiling trace from a single `tf.function` call. |

Each writer function returns a boolean that is true if the summary was actually recorded and false otherwise, for example when no default writer is active or a `record_if` condition is off. Inside `tf.function` the result is a `tf.bool` tensor. The boolean lets you wire conditional logic into a training loop without checking the active writer state by hand.
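
As a rough illustration of how the return value and `tf.summary.record_if` interact, the sketch below logs a placeholder loss only on every tenth step and collects the steps that were actually recorded. The temporary log directory and the `1 / (step + 1)` loss are assumptions made for the example, not part of any real training script.

```python
import tempfile

import tensorflow as tf

# Throwaway log directory, used only for this illustration.
writer = tf.summary.create_file_writer(tempfile.mkdtemp())

recorded_steps = []
with writer.as_default():
    for step in range(30):
        # Gate recording on a plain Python condition: every tenth step.
        with tf.summary.record_if(step % 10 == 0):
            wrote = tf.summary.scalar("loss", 1.0 / (step + 1), step=step)
        # The return value is truthy only when an event was recorded.
        if bool(wrote):
            recorded_steps.append(step)
writer.flush()

# With no default writer active, the call records nothing.
skipped = tf.summary.scalar("loss", 0.0, step=0)
```

A plain Python boolean is enough in eager code. When the condition has to be re-evaluated inside a `tf.function`, `record_if` also accepts a boolean tensor or a callable that returns one.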

There are also helpers under `tf.summary.experimental` for setting a default step so that you do not have to thread it through every call, plus `tf.summary.experimental.write_raw_pb` for shipping a pre-built protobuf blob.

### Implicit step handling

Threading a `step` argument through every summary call gets noisy. The experimental step helpers exist to handle that:

```python
tf.summary.experimental.set_step(step)
```

Once set, subsequent calls to `tf.summary.scalar`, `histogram`, `image`, and friends will use the stored step if `step` is left as `None`. Inside a `tf.function`, the step is captured when the function is traced, so a plain Python integer stays frozen at the value it had at trace time. Pass a `tf.Variable` to `set_step` if the step must keep advancing between calls to the traced function.
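
The sketch below shows one way to combine a default step with a `tf.function`, assuming the step is held in a `tf.Variable` so that it can advance between calls. The names `global_step` and `log_value`, the tag, and the temporary directory are illustrative choices rather than anything prescribed by the API.

```python
import glob
import os
import tempfile

import tensorflow as tf

logdir = tempfile.mkdtemp()  # throwaway directory for the example
writer = tf.summary.create_file_writer(logdir)

# A variable step is read each time the traced function runs.
global_step = tf.Variable(0, dtype=tf.int64, trainable=False)
tf.summary.experimental.set_step(global_step)


@tf.function
def log_value(value):
    # Enter the writer context inside the function body so it is
    # active while the function is being traced.
    with writer.as_default():
        tf.summary.scalar("demo/value", value)  # step=None: uses the default step
    global_step.assign_add(1)


for v in [0.5, 0.25, 0.125]:
    log_value(tf.constant(v))
writer.flush()

event_files = glob.glob(os.path.join(logdir, "events.out.tfevents.*"))
```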

## Writers and event files

Nothing is written to disk until you create a `SummaryWriter`. `tf.summary.create_file_writer(logdir)` returns a writer that opens an append only `tfevents` file inside `logdir`. Summary ops do not pick a writer implicitly; they look up the writer that is currently active via the `as_default()` context manager. The pattern looks like this:

```python
import tensorflow as tf

writer = tf.summary.create_file_writer("logs/run-01")

with writer.as_default():
    for step in range(num_steps):
        loss = train_one_batch()
        tf.summary.scalar("loss", loss, step=step)
        if step % 100 == 0:
            tf.summary.histogram("layer1/weights", model.layers[0].kernel, step=step)
            tf.summary.image("samples", generate_samples(), step=step, max_outputs=4)
```

The writer batches events in memory and flushes them to the file periodically. The default flush interval is 120 seconds, controlled by the `flush_millis` argument. The `max_queue` argument controls how many events can pile up in memory between flushes (default 10), and `filename_suffix` lets you add a tag to the event file name so multiple runs in the same directory can be distinguished by inspection. Calling `writer.flush()` or `tf.summary.flush()` forces an immediate write, which is useful at the end of training, around checkpoints, or before exiting after an exception.

Different runs should go to different log directories so that TensorBoard can show them as separate experiments and let you toggle them on and off. A common convention is `logs/<experiment-name>/<timestamp>`. Splitting training and validation into sibling directories such as `logs/run-01/train` and `logs/run-01/val` lets TensorBoard overlay the two curves on the same chart.

### Event file format on disk

The file written by `create_file_writer` follows TensorFlow's `tfevents` record format. Each record is a length prefixed `Event` protobuf, defined in `tensorflow/core/util/event.proto`. The first record in every file is a `file_version` event with the value `"brain.Event:2"`. Subsequent records can carry a wall time, a step, and one of several payload types: a `Summary` protobuf with one or more `Value` entries, a `LogMessage`, a `SessionLog`, or a `TaggedRunMetadata` for profiling.

The filename itself follows the pattern `events.out.tfevents.<unix-seconds>.<host>.<pid>.<suffix>.v2`. The pieces are useful for ad hoc debugging. The unix timestamp makes it easy to sort runs by start time, the hostname identifies which machine wrote the file in a distributed job, and the `v2` marker distinguishes the modern event format used since TensorFlow 2.
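
To make the ad hoc debugging use concrete, the following sketch sorts event files by the start time embedded in their names. It relies only on the `events.out.tfevents.<unix-seconds>` prefix described above; the two file names are invented for the example, and the helper name `event_file_start_time` is hypothetical.

```python
from datetime import datetime, timezone

PREFIX = "events.out.tfevents."


def event_file_start_time(filename):
    """Return the UTC start time encoded right after the fixed prefix."""
    if not filename.startswith(PREFIX):
        raise ValueError(f"not an event file name: {filename!r}")
    # The first dot separated field after the prefix is the unix timestamp.
    seconds = int(filename[len(PREFIX):].split(".", 1)[0])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Invented names that follow the pattern described in the text.
names = [
    "events.out.tfevents.1715860000.worker-1.4242.0.v2",
    "events.out.tfevents.1715850000.worker-0.4141.0.v2",
]
ordered = sorted(names, key=event_file_start_time)
```

Only the timestamp is parsed here because host names can themselves contain dots, which makes splitting the remaining fields by position less reliable.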

Files are append only. The writer does not rewrite earlier events, which is what allows TensorBoard to tail a file safely while training is still running. The drawback is that a long run with very high frequency logging can produce gigabyte sized event files, so most teams cap the sampling rate of heavyweight payloads such as images and audio.
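
The append only property is what makes tailing safe, and a toy version shows why. The sketch below is a deliberately simplified stand-in, not the real `tfevents` codec: it frames each record with an 8 byte length and omits the checksums that TensorFlow's record format adds. The function names and payloads are made up for illustration.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<Q")  # little endian unsigned 64 bit length


def append_record(path, payload):
    """Append one length prefixed record; earlier bytes are never touched."""
    with open(path, "ab") as f:
        f.write(HEADER.pack(len(payload)) + payload)


def read_new_records(path, offset=0):
    """Read complete records starting at offset; return them and the new offset."""
    records = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                break  # nothing more yet, or a header still being written
            (length,) = HEADER.unpack(header)
            payload = f.read(length)
            if len(payload) < length:
                break  # partially written record: retry on the next poll
            records.append(payload)
            offset = f.tell()  # advance only past complete records
    return records, offset


path = os.path.join(tempfile.mkdtemp(), "demo.log")
append_record(path, b"step=0 loss=0.9")
append_record(path, b"step=1 loss=0.5")
first_batch, offset = read_new_records(path)

append_record(path, b"step=2 loss=0.3")
second_batch, offset = read_new_records(path, offset)
```

Because the reader only remembers a byte offset and the writer only appends, the two processes never need to coordinate, which mirrors how a viewer can poll a growing event file.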

### Reading events programmatically

The same event files can be parsed without launching TensorBoard. The `tf.compat.v1.train.summary_iterator(path)` generator walks a file and yields `Event` objects, which is enough to extract scalars, images, and tensors into pandas frames or numpy arrays. TensorBoard 2.3 added a more ergonomic `tensorboard.data.experimental.ExperimentFromDev` API whose `get_scalars()` method returns scalar runs as a pandas DataFrame with `run`, `tag`, `step`, and `value` columns, and passing `pivot=True` returns a wide form frame with tags as columns, which is the easiest input for downstream statistical analysis. That API reads experiments uploaded to the hosted TensorBoard.dev service rather than local event files, and the service has since been shut down, so `summary_iterator` remains the practical route for local logs.
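
A minimal sketch of the `summary_iterator` route follows. It first writes three placeholder loss values to a temporary directory so that there is something to read, then extracts them again. The helper name `read_scalars` is hypothetical, and the sketch assumes the scalars were written by the TensorFlow 2 `tf.summary.scalar`, which stores each value in the event's generic tensor field.

```python
import glob
import os
import tempfile

import tensorflow as tf

# Write a few placeholder scalars so the example is self-contained.
logdir = tempfile.mkdtemp()
writer = tf.summary.create_file_writer(logdir)
with writer.as_default():
    for step, loss in enumerate([0.9, 0.5, 0.3]):
        tf.summary.scalar("loss", loss, step=step)
writer.close()


def read_scalars(path, tag):
    """Collect (step, value) pairs for one tag from a single event file."""
    points = []
    for event in tf.compat.v1.train.summary_iterator(path):
        for value in event.summary.value:
            if value.tag == tag:
                # Decode the tensor proto into a Python float.
                points.append((event.step, float(tf.make_ndarray(value.tensor))))
    return points


event_path = glob.glob(os.path.join(logdir, "events.out.tfevents.*"))[0]
scalars = read_scalars(event_path, "loss")
```

The resulting list of pairs can be handed straight to `pandas.DataFrame(scalars, columns=["step", "value"])` if a frame is more convenient.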

## TF 1 versus TF 2

`tf.summary` was redesigned for TensorFlow 2. In TensorFlow 1, summary ops produced protocol buffer tensors that had to be fetched through `Session.run`, aggregated with `tf.summary.merge_all`, and written manually with a separate `FileWriter`. That two stage flow assumed a static graph and did not fit eager execution. In TensorFlow 2, the writer is part of the execution context, summary ops write directly when they run, and the global step is passed explicitly to every call instead of being managed by a hidden collection.

The second positional argument was also renamed from `tensor` to `data`, and the `collections` and `family` keyword arguments were removed. Old TensorFlow 1 code can still run by importing `tf.compat.v1.summary`, but the official guidance is to migrate. Most non trivial Keras and custom training loops written after 2019 use the TensorFlow 2 style.

The table below summarises the practical differences a code reviewer will run into when porting a script.

| Aspect | TF 1.x | TF 2.x |
|---|---|---|
| Writer class | `tf.summary.FileWriter` | `tf.summary.SummaryWriter` via `create_file_writer` |
| Activation | Manual `FileWriter.add_summary(buffer, step)` | Implicit via `with writer.as_default():` context |
| Aggregation | `tf.summary.merge_all()` over graph collections | Removed, no longer needed |
| Second positional arg | `tensor` | `data` |
| Global step | Implicit via `tf.train.get_or_create_global_step` | Explicit `step` argument on each call |
| Conditional logging | `if condition: ...` outside the graph | `with tf.summary.record_if(condition):` |
| Return value of op | A serialized `Summary` proto tensor | `True` if recorded, `False` otherwise |
| Removed keywords | n/a | `family`, `collections` deleted |
| Backwards compatibility | n/a | `tf.compat.v1.summary` retains the old API |

TensorFlow shipped an automated migration script, `tf_upgrade_v2`, that rewrites a v1 codebase to either the v2 API or the `tf.compat.v1.summary` shim. Most non trivial training scripts still need a manual pass because the v1 pattern of fetching merged summary ops through a session does not map one to one onto the v2 writer model.

## Using summaries from Keras

If you train with `model.fit`, the easiest way to get summaries is the built in `tf.keras.callbacks.TensorBoard` callback. Adding it to the callback list opens a writer, records the loss and metric for each epoch, optionally logs weight histograms with `histogram_freq`, and can capture profiling data and embedding projections.

```python
tb = tf.keras.callbacks.TensorBoard(
    log_dir="logs/fit",
    histogram_freq=1,
    write_graph=True,
    profile_batch="500,520",
)
model.fit(x_train, y_train, epochs=10, callbacks=[tb])
```

The callback writes its event files under `log_dir`, in `train` and `validation` subdirectories, so each `fit` call should be given its own `log_dir`. Custom metrics, image grids, or other ad hoc summaries can be added inside a custom callback by opening a writer in `on_train_begin` and calling `tf.summary.scalar` or `tf.summary.image` in `on_epoch_end`.

### Callback parameters in full

The Keras callback exposes more knobs than the typical example shows. The table below lists every parameter and its default value, drawn from the current `tf.keras.callbacks.TensorBoard` documentation.

| Parameter | Default | Effect |
|---|---|---|
| `log_dir` | `'logs'` | Directory where event files are written. Pass a timestamped subdirectory per run. |
| `histogram_freq` | `0` | Compute weight histograms for the model's layers every N epochs. `0` disables them. |
| `write_graph` | `True` | Write the model graph to the event file. Adds visible structure to the Graphs dashboard, can produce large files. |
| `write_images` | `False` | Render model weights as image tiles. Useful for visualising convolutional filters. |
| `write_steps_per_second` | `False` | Emit a scalar of training throughput per step. |
| `update_freq` | `'epoch'` | Either `'epoch'`, `'batch'`, or an integer batch count. Controls how often scalars are written. |
| `profile_batch` | `0` | Batch or range of batches to profile, for example `'500,520'`. The default of `0` leaves profiling disabled. |
| `embeddings_freq` | `0` | Frequency (in epochs) at which embedding layers are exported for the Projector dashboard. |
| `embeddings_metadata` | `None` | Optional mapping from embedding layer name to a metadata file (usually a TSV of labels). |

For custom logging inside the same callback hierarchy, the documented pattern is to call `tf.summary.create_file_writer` once, store the writer on `self`, and emit values from inside `on_epoch_end` or `on_batch_end`. This composes cleanly with the built in callback. Because the built in callback writes under the `train` and `validation` subdirectories of `log_dir`, a custom writer pointed at one of those subdirectories is merged into the same run, while a writer pointed at any other subdirectory shows up as a separate run next to them.

### A custom training loop with gradient tape

Outside Keras, a typical training loop with `tf.GradientTape` looks like the snippet below. The pattern is the same as before: open a writer, enter its default context, log scalars per step and heavier payloads on a slower schedule.

```python
train_writer = tf.summary.create_file_writer("logs/run-01/train")
val_writer = tf.summary.create_file_writer("logs/run-01/val")

for epoch in range(epochs):
    for step, (x, y) in enumerate(train_ds):
        with tf.GradientTape() as tape:
            preds = model(x, training=True)
            loss = loss_fn(y, preds)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        with train_writer.as_default():
            tf.summary.scalar("loss", loss, step=optimizer.iterations)

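    val_loss = evaluate(model, val_ds)
    with val_writer.as_default():
        # Log against the same global step as the training curve so the two
        # series share an x axis when TensorBoard overlays them.
        tf.summary.scalar("loss", val_loss, step=optimizer.iterations)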
```

Keeping the two writers in sibling directories is what lets TensorBoard render training and validation loss on the same chart automatically.

## TensorBoard

[TensorBoard](/wiki/tensorboard) is the companion tool that reads event files and renders them. Starting it is a one line command:

```
tensorboard --logdir logs
```

It then serves a local web app, by default on port 6006, that scans the directory tree for `tfevents` files and groups them by run. The dashboards mirror the writer functions: scalars and time series for `tf.summary.scalar`, distributions and histograms for `tf.summary.histogram`, an image gallery for `tf.summary.image`, an audio player for `tf.summary.audio`, a markdown view for `tf.summary.text`, a graph viewer for traced `tf.function` calls, and a profiler for runs that captured profile data.

TensorBoard reads the log directory continuously, so summaries written during a long training run appear in the UI within seconds. Multiple runs in the same parent directory are rendered as overlaid lines on every scalar chart, which is how you compare hyperparameter sweeps visually.

The table below summarises the dashboards a vanilla TensorBoard install exposes.

| Dashboard | Backed by | What you see |
|---|---|---|
| Time Series / Scalars | `tf.summary.scalar` | Line charts of metric vs step or epoch, with per run colours. |
| Histograms | `tf.summary.histogram` | Overlapping distribution curves stacked along the step axis. |
| Distributions | `tf.summary.histogram` | Percentile bands of the same data, easier to read than raw histograms. |
| Images | `tf.summary.image` | Grid of recorded image tensors with a step slider. |
| Audio | `tf.summary.audio` | Embedded HTML audio players. |
| Text | `tf.summary.text` | Rendered markdown blocks per step. |
| Graphs | `tf.summary.trace_on` / `trace_export`, Keras `write_graph` | Op graph viewer with namespace collapse. |
| Projector | `embeddings_freq` callback or `projector.visualize_embeddings` | 2D / 3D embedding viewer with PCA, t-SNE, UMAP. |
| HParams | `hp.hparams_config`, `hp.hparams` | Table, parallel coordinates, and scatter views of hyperparameter sweeps. |
| Profile | `tf.summary.trace_on(profiler=True)` or Keras `profile_batch` | Step time graph, trace viewer, input pipeline analyzer. |
| Mesh | `mesh_summary.op` from the mesh plugin | Interactive 3D point clouds and triangle meshes. |

For Jupyter and Colab users, TensorBoard exposes line magics: `%load_ext tensorboard` followed by `%tensorboard --logdir logs` opens the dashboard inside a notebook cell.

### Tracing graphs

The graph viewer in TensorBoard does not get populated automatically once `tf.function` is used. You have to trace the function explicitly:

```python
@tf.function
def step(x, y):
    return loss_fn(y, model(x))

tf.summary.trace_on(graph=True, profiler=True)
step(sample_x, sample_y)
with writer.as_default():
    tf.summary.trace_export(name="step", step=0, profiler_outdir="logs/profile")
```

The constraint to be aware of is that exactly one `tf.function` call should happen between `trace_on` and `trace_export`, otherwise the resulting graph will be a mash up of every traced function. `trace_export` stops the trace as it writes it out, whereas `trace_off` stops a trace and discards what was collected, which makes it the call to use when a trace was started but should not be exported. Doing the tracing in a one off script outside the main training loop avoids the problem altogether.

### Hyperparameter sweeps and the hparams plugin

The `tensorboard.plugins.hparams.api` module sits on top of `tf.summary` and adds first class support for hyperparameter sweeps. It defines three building blocks: `HParam` for a tunable parameter, domain helpers such as `Discrete`, `RealInterval`, and `IntInterval`, and `Metric` for an outcome to optimise. A sweep looks like this:

```python
import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

HP_UNITS = hp.HParam("num_units", hp.Discrete([16, 32]))
HP_LR = hp.HParam("learning_rate", hp.RealInterval(1e-4, 1e-2))
METRIC_ACC = "accuracy"

with tf.summary.create_file_writer("logs/sweep").as_default():
    hp.hparams_config(
        hparams=[HP_UNITS, HP_LR],
        metrics=[hp.Metric(METRIC_ACC, display_name="Accuracy")],
    )

def trial(run_dir, hparams):
    with tf.summary.create_file_writer(run_dir).as_default():
        hp.hparams(hparams)
        acc = train_eval(hparams)
        tf.summary.scalar(METRIC_ACC, acc, step=1)
```

TensorBoard then renders the sweep in three views: a sortable table, a parallel coordinates plot for spotting clusters, and a scatter view for correlating any hyperparameter with any metric.

## Profiler integration

The TensorFlow Profiler is itself a TensorBoard plugin, and its data is delivered through the same event file pipeline as summaries. There are two main entry points. The Keras callback's `profile_batch` argument captures a range of batches automatically. The lower level `tf.summary.trace_on(profiler=True)` and `tf.summary.trace_export(profiler_outdir=...)` API captures a single trace from a `tf.function` call.

The Profile tab adds several focused dashboards on top of the data:

* **Overview Page**, with a step time breakdown into compute, input, and idle time. The header bar flags whether the step is compute bound or input bound.
* **Trace Viewer**, a Chrome trace style timeline showing host CPU threads, device streams, kernels, and memory copies. Keyboard shortcuts `W` and `S` zoom, `A` and `D` pan, `M` measures intervals.
* **Input Pipeline Analyzer**, which highlights `tf.data` stages that are bottlenecks.
* **TensorFlow Stats**, an op by op table of cumulative time and self time.
* **Memory Profile**, a timeline of allocator activity per device.

The canonical use of the profiler is to confirm whether a slow training loop is GPU bound, CPU bound, or input bound. Adding `cache()` and `prefetch()` to the `tf.data` pipeline is the typical fix once the trace viewer shows long idle gaps on the device stream.
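
For reference, the `cache()` and `prefetch()` fix usually amounts to a few lines in the input pipeline. The sketch below builds a small in-memory dataset with random placeholder data; the function name `build_pipeline`, the tensor shapes, and the batch size are assumptions chosen for the example.

```python
import tensorflow as tf


def build_pipeline(features, labels, batch_size=32):
    """Build a tf.data pipeline that caches elements and overlaps input with compute."""
    ds = tf.data.Dataset.from_tensor_slices((features, labels))
    ds = ds.cache()  # keep elements in memory after the first pass
    ds = ds.shuffle(buffer_size=features.shape[0])
    ds = ds.batch(batch_size)
    # Prepare upcoming batches while the current one is being consumed.
    ds = ds.prefetch(tf.data.AUTOTUNE)
    return ds


# Random placeholder data standing in for a real training set.
features = tf.random.normal([256, 8])
labels = tf.random.uniform([256], maxval=2, dtype=tf.int32)

train_ds = build_pipeline(features, labels)
num_batches = sum(1 for _ in train_ds)
```

Placing `cache()` before `shuffle()` keeps the cached data fixed while still reshuffling every epoch. Whether caching in memory is appropriate depends on the dataset size.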

## Embeddings projector

The Embeddings Projector visualises high dimensional vectors in 2D or 3D. There are two ways to feed it. The Keras callback exposes `embeddings_freq` and `embeddings_metadata`, which exports the weights of any `tf.keras.layers.Embedding` layer along with an optional TSV of labels. The lower level path is to write a `projector_config.pbtxt` next to the event files and call `projector.visualize_embeddings(logdir, config)` from `tensorboard.plugins.projector`. Either way, the projector exposes PCA, t-SNE, and UMAP projections, a search box for label lookup, and a nearest neighbour view that highlights similar vectors. It is especially useful for inspecting learned word embeddings, contrastive learning representations, and the latent space of an autoencoder.
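
The lower level path can be sketched as follows. The labels, the random vectors, and the temporary log directory are invented for the example; in practice the variable would hold trained embedding weights. The tensor name string reflects how an object based checkpoint names a variable saved under the attribute `embedding`.

```python
import os
import tempfile

import tensorflow as tf
from tensorboard.plugins import projector

log_dir = tempfile.mkdtemp()  # throwaway directory for the example

# Hypothetical vocabulary and randomly initialised vectors, for illustration only.
labels = ["cat", "dog", "car", "bus"]
weights = tf.Variable(tf.random.normal([len(labels), 8]))

# One label per line, in the same order as the rows of the embedding matrix.
with open(os.path.join(log_dir, "metadata.tsv"), "w") as f:
    f.write("\n".join(labels) + "\n")

# The projector loads embedding values from a checkpoint in the log directory.
checkpoint = tf.train.Checkpoint(embedding=weights)
checkpoint.save(os.path.join(log_dir, "embedding.ckpt"))

# Describe the embedding and write projector_config.pbtxt next to the checkpoint.
config = projector.ProjectorConfig()
embedding = config.embeddings.add()
embedding.tensor_name = "embedding/.ATTRIBUTES/VARIABLE_VALUE"
embedding.metadata_path = "metadata.tsv"
projector.visualize_embeddings(log_dir, config)
```

Pointing `tensorboard --logdir` at the same directory should then make the vectors available in the Projector tab.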

## Mesh and 3D summaries

The `mesh` plugin ships with TensorBoard and accepts 3D data as tensors of vertex coordinates, optional per vertex colours, and optional triangle face indices. The summary is created with `mesh_summary.op` from `tensorboard.plugins.mesh.summary`, which writes a payload that the dashboard interprets as a renderable scene. A `config_dict` argument is forwarded to THREE.js so the viewer can be customised with camera, lighting, and material settings. The plugin is widely used in 3D reconstruction work and point cloud segmentation, where comparing a predicted mesh to the ground truth is far more informative than any scalar metric.

## Distributed training caveats

Distributed training adds wrinkles that the single GPU examples do not surface. The first is **per replica writers**. Under `tf.distribute.MirroredStrategy`, every replica runs the model code. If every replica calls `tf.summary.scalar`, you end up with N copies of the same value at every step, which double counts in TensorBoard. The recommended pattern is to wrap the summary call in a check for the chief replica, for example by using `tf.summary.create_noop_writer()` on non chief replicas. The Keras callback handles this automatically.

The second is **summary calls inside `tf.function`**. Default writers do not cross `tf.function` boundaries the way Python state does. Inside a `tf.function`, the active writer is resolved at trace time, so calling `with writer.as_default()` inside the function is safer than relying on an outer context. The companion helper `tf.summary.experimental.set_step` makes the step available inside the trace.

The third is **TPU constraints**. TPU runs flush less aggressively than GPU runs because of how the host coordinates with the device. Explicit `writer.flush()` calls at the end of each epoch make the difference between seeing your scalars in TensorBoard in real time and waiting until training finishes.

The fourth is **multi worker logging**. With `MultiWorkerMirroredStrategy`, each worker either writes to its own local log directory or all workers write into a shared filesystem. Writing to the same file from multiple machines is unsupported; the standard advice is to give each worker its own subdirectory keyed on `task_id`, then let TensorBoard merge them.
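
One way to derive a per worker subdirectory is to read the `TF_CONFIG` environment variable that multi worker jobs conventionally set. The sketch below is an assumption about how a script might do this, not an official helper: the function name `worker_logdir`, the directory layout, and the example cluster are all hypothetical, and `TF_CONFIG` stores the task id under the key `index`.

```python
import json
import os


def worker_logdir(base_dir, tf_config=None):
    """Return (log directory, is_chief) for the current task.

    tf_config is a JSON string; if omitted, the TF_CONFIG environment
    variable is used, and a missing variable means a single worker.
    """
    raw = tf_config if tf_config is not None else os.environ.get("TF_CONFIG", "{}")
    config = json.loads(raw)
    task = config.get("task", {})
    task_type = task.get("type", "worker")
    task_index = int(task.get("index", 0))
    # Worker 0 acts as chief only when the cluster has no dedicated chief job.
    has_chief_job = "chief" in config.get("cluster", {})
    is_chief = task_type == "chief" or (
        task_type == "worker" and task_index == 0 and not has_chief_job
    )
    return os.path.join(base_dir, f"{task_type}-{task_index}"), is_chief


# Invented two worker cluster, seen from the second worker.
example = json.dumps({
    "cluster": {"worker": ["host0:2222", "host1:2222"]},
    "task": {"type": "worker", "index": 1},
})
logdir, is_chief = worker_logdir("logs/run-01", example)
```

The `is_chief` flag can then decide between a real file writer and `tf.summary.create_noop_writer()` for summaries that only need to be written once.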

## Using summaries from Keras callbacks for ad hoc metrics

Not every metric fits neatly into `compile(metrics=[...])`. Confusion matrices, per class precision and recall curves, and qualitative samples often need a custom callback. The pattern is the same as before. Create the writer once, either in the constructor or in `on_train_begin`, write scalars or images in `on_epoch_end`, and flush in `on_train_end`.

```python
class SampleImageCallback(tf.keras.callbacks.Callback):
    def __init__(self, logdir, sample_inputs):
        super().__init__()  # initialise the Keras base class before adding state
        self.writer = tf.summary.create_file_writer(logdir)
        self.sample_inputs = sample_inputs

    def on_epoch_end(self, epoch, logs=None):
        # Run in inference mode so dropout and batch norm do not perturb the samples.
        preds = self.model(self.sample_inputs, training=False)
        with self.writer.as_default():
            tf.summary.image("preds", preds, step=epoch, max_outputs=4)

    def on_train_end(self, logs=None):
        self.writer.flush()

This is also the recommended way to log matplotlib figures. The standard helper from the documentation renders a figure to an in memory PNG, decodes it with `tf.image.decode_png`, and adds a batch dimension before calling `tf.summary.image`. Confusion matrices, ROC curves, and learning rate schedules are typically delivered this way.
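
The figure to image conversion described above can be sketched in a few lines. The helper name `figure_to_image`, the toy loss curve, and the temporary log directory are assumptions for the example, and the non-interactive `Agg` backend is selected so that the code runs without a display.

```python
import io
import tempfile

import matplotlib

matplotlib.use("Agg")  # render off screen; no display needed
import matplotlib.pyplot as plt
import tensorflow as tf


def figure_to_image(figure):
    """Render a matplotlib figure to a uint8 tensor shaped [1, h, w, 4]."""
    buf = io.BytesIO()
    figure.savefig(buf, format="png")
    plt.close(figure)  # release the figure once it has been rendered
    image = tf.image.decode_png(buf.getvalue(), channels=4)
    return tf.expand_dims(image, 0)  # add the batch dimension


# Toy figure standing in for a confusion matrix or ROC curve.
fig, ax = plt.subplots(figsize=(3, 2))
ax.plot([0, 1, 2], [0.9, 0.5, 0.3])
ax.set_title("loss")
image = figure_to_image(fig)

writer = tf.summary.create_file_writer(tempfile.mkdtemp())
with writer.as_default():
    tf.summary.image("loss_curve", image, step=0)
writer.flush()
```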

## PyTorch and the same idea

The same logging pattern exists outside TensorFlow. [PyTorch](/wiki/pytorch) ships `torch.utils.tensorboard.SummaryWriter`, which writes the same `tfevents` format that TensorBoard understands. The method names are slightly different but the mental model matches:

| PyTorch method | TensorFlow analogue |
|---|---|
| `add_scalar` | `tf.summary.scalar` |
| `add_scalars` | Multiple `tf.summary.scalar` calls under one tag |
| `add_histogram` | `tf.summary.histogram` |
| `add_image`, `add_images` | `tf.summary.image` |
| `add_audio` | `tf.summary.audio` |
| `add_text` | `tf.summary.text` |
| `add_graph` | `tf.summary.trace_on` and `trace_export` |
| `add_embedding` | Embedding projector via `tf.summary` plugin |
| `add_pr_curve`, `add_hparams`, `add_mesh`, `add_figure`, `add_video` | TensorBoard plugins |

A typical PyTorch training loop with logging looks like this:

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("logs/run-01")
for step, (x, y) in enumerate(loader):
    loss = train_step(x, y)
    writer.add_scalar("loss", loss.item(), step)
    if step % 100 == 0:
        writer.add_histogram("layer1/weights", model.layer1.weight, step)
writer.close()
```

The writer flushes asynchronously, the same `tensorboard --logdir logs` command picks up the events, and the dashboards behave the same. Teams that mix frameworks often standardize on TensorBoard precisely because both PyTorch and TensorFlow can target it.

## Alternatives

`tf.summary` and TensorBoard are not the only way to log experiments. The space of experiment tracking tools has expanded substantially since 2018, and most teams running large training jobs end up combining a deep learning specific viewer with a higher level run manager. The most commonly cited alternatives are listed below.

| Tool | Type | What it adds on top of `tf.summary` |
|---|---|---|
| [Weights & Biases](/wiki/weights_and_biases) | Hosted SaaS | `wandb.log({...})` API, automatic system metrics, hosted dashboards, hyperparameter sweeps, model registry, reports. Can auto sync existing `tfevents` files with `wandb.init(sync_tensorboard=True)`. |
| [MLflow](/wiki/mlflow) | Open source, optional cloud | `log_param`, `log_metric`, `log_artifact`, model registry, project packaging, autolog for popular libraries. |
| Neptune | Hosted SaaS | Run metadata at scale, organisation features, code and dataset versioning, integration with the popular DL frameworks. |
| Aim | Open source, self hosted | High performance metric store, pythonic search SDK, dashboard with run comparison and aggregations. Used at Meta, Amazon, Microsoft. |
| ClearML | Open source plus hosted tier | Experiment tracking, agent based orchestration, dataset management, hyperparameter tuning, model registry. |
| Comet | Hosted SaaS | Experiment tracking, model production monitoring, LLM specific dashboards, automatic source code logging. |
| Sacred + Omniboard | Open source | Lightweight Python decorator API plus MongoDB backed UI. Popular in academic labs. |
| Guild AI | Open source CLI | File based experiment tracking, hyperparameter search, no server. |
| DVC + Iterative Studio | Open source plus hosted | Data and model versioning with optional metric dashboards layered on git. |

**[MLflow](/wiki/mlflow)** is a framework agnostic experiment tracker. Calls such as `mlflow.log_param`, `mlflow.log_metric`, and `mlflow.log_artifact` record hyperparameters, scalar metrics, and arbitrary files inside a run. MLflow leans toward run management, model registry, and reproducibility rather than rich visualization, and it can ingest TensorBoard event files as artifacts when you want both. Autolog support exists for Scikit-learn, XGBoost, PyTorch, Keras, Spark, and several other libraries, so a single call to `mlflow.autolog()` captures parameters and metrics without explicit logging code.

**[Weights & Biases](/wiki/weights_and_biases)** (`wandb`) is a hosted experiment tracking platform with a similar logging API: `wandb.log({"loss": loss, "accuracy": acc})` for scalars, `wandb.Image` for images, `wandb.Histogram` for distributions, and `wandb.watch(model)` to log gradient and parameter histograms automatically. It adds collaboration features such as shared dashboards, sweep automation, and report writing on top of the basic logging. The TensorBoard integration is fully automated: passing `sync_tensorboard=True` to `wandb.init` causes the agent to mirror every event file write into the W&B cloud, where the same scalars, histograms, and images appear alongside W&B native logs.

**Aim** focuses on speed and self hosting. Its UI handles hundreds of thousands of metric sequences, and the SDK exposes a pythonic query language for filtering runs without leaving the notebook. Teams that already have an internal artifact store and just want a metric dashboard sometimes prefer Aim to a hosted SaaS because the data stays on their infrastructure.

**Neptune** sits between MLflow and W&B in feature surface. It targets organisations that need access controls, project level governance, and integration with both DL and classical ML pipelines. The Python API matches the same `log_metric` and `log_artifact` shape, and Neptune mirrors `tfevents` data through a similar bridge to the one W&B offers.

**ClearML** is open source with a hosted tier. Its strongest feature is the agent based orchestration on top of experiment tracking. A user submits a training script to a remote agent, the agent reproduces the environment, runs the job, and ships back metrics and artifacts. The metric API itself looks much like the others.

Many teams use more than one of these together. A common pattern is to use `tf.summary` or the PyTorch `SummaryWriter` for low level deep learning visualization, MLflow for run cataloging and model registry, and `wandb` for collaboration and hyperparameter sweeps. The data being logged is largely the same; the differences are storage, UI, and what surrounds the raw metrics.

## Practical notes

A few things worth knowing once you start using `tf.summary` in real training jobs:

* Summaries written inside a `tf.function` need the writer's `as_default()` context inside the function, since default writers do not cross `tf.function` boundaries automatically.
* Logging everything on every step is wasteful on long runs. Writing scalars every step is usually fine; histograms and images cost far more to compute and store, so write them every N steps or once per epoch.
* Large images, videos, and audio inflate event files quickly. The `max_outputs` argument on `tf.summary.image` and `tf.summary.audio` caps how many examples per call are written.
* If a process crashes before a flush, the last few seconds of summaries may be lost. Wrapping training in a `try` block that calls `writer.flush()` in the `finally` clause is cheap insurance.
* TensorBoard scales to many runs but slows down once a single run logs millions of scalar points. Downsampling or aggregating before logging keeps the UI responsive.
* Run directory naming dominates UX. Use a timestamp plus a short human readable label, for example `logs/2026-05-16_baseline_resnet50`. Random hashes or numeric ids are hard to scan in a sweep.
* When comparing experiments, keep tag names consistent across runs. TensorBoard groups identical tags into one chart, so renaming `loss` to `train_loss` halfway through a sweep splits the curve across two panels.
* For multi GPU runs under `tf.distribute.MirroredStrategy`, gate summary calls behind a `tf.distribute.get_replica_context().replica_id_in_sync_group == 0` check, or rely on the Keras callback to do it for you. Otherwise N replicas log N copies of the same value.
* Disable graph writing (`write_graph=False`) on very large models. The serialized graph can dominate event file size and slow down TensorBoard's initial load.
* The Profile tab is gated on the right Python plugin. `pip install -U tensorboard-plugin-profile` is what makes it appear after the first profile capture.
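
Three of the notes above (run directory naming, logging expensive summaries every N steps, and downsampling scalar points) reduce to small helpers. The sketch below uses only the standard library; the function names `make_run_dir`, `should_log`, and `downsample` are illustrative assumptions, not part of `tf.summary`, and the stride-based downsampling is just one simple choice among several.

```python
from datetime import date
from pathlib import PurePosixPath


def make_run_dir(label, root="logs", day=None):
    """Build a run directory name of the form <root>/<YYYY-MM-DD>_<label>."""
    day = day or date.today()
    # Collapse whitespace so the label stays a single, scannable path component.
    safe_label = "_".join(label.split())
    return str(PurePosixPath(root) / f"{day.isoformat()}_{safe_label}")


def should_log(step, every_n):
    """Return True on the steps where an expensive summary should be written."""
    return step % every_n == 0


def downsample(points, max_points):
    """Keep at most max_points (step, value) pairs by taking every k-th one."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    if len(points) <= max_points:
        return list(points)
    stride = -(-len(points) // max_points)  # ceiling division
    return list(points[::stride])
```

In a training loop, the result of `make_run_dir("baseline resnet50")` would be the path handed to the file writer, and `should_log(step, 500)` would guard the histogram and image calls while scalars are still written on every step.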

## Common pitfalls

A short list of bugs that beginners run into:

* **No writer activated.** Calling `tf.summary.scalar` outside of a `with writer.as_default():` block silently returns `False` and writes nothing. The return value of the call is the easiest signal.
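* **Forgetting the step.** Pass `step=optimizer.iterations` or set a default with `tf.summary.experimental.set_step`. If `step` is omitted and no default step has been set, the call raises a `ValueError`; if the same step is passed on every call, all the values stack up at a single x position.
* **Logging Python ints vs `tf.Tensor` types.** Both work in eager mode. Inside `tf.function`, a Python value is baked into the graph as a constant at trace time, and a Python number passed as an argument triggers a retrace for each new value, so pass the step and the logged data as tensors (for example via `tf.constant` or a `tf.Variable`).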
* **Histograms on tensors with NaN.** A single NaN poisons the binning and produces an empty distribution. Filter out NaN values, or clip before logging.
* **Out of range image values.** `tf.summary.image` expects either `uint8` in `[0, 255]` or `float32` in `[0, 1]`. Logging unnormalised activations as images often produces saturated white tiles.
* **Logging from a `tf.data` map function.** `map` runs on a separate thread that does not inherit the default writer. Move logging into the training step.
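
The NaN and image range pitfalls both come down to cleaning values before they reach the summary call. The following is a minimal sketch on plain Python lists, assuming the cleaned values are then converted to a tensor and passed to `tf.summary.histogram` or `tf.summary.image`. The helper names are hypothetical, and dropping infinities along with NaN is an added assumption beyond what the list above states.

```python
import math


def drop_non_finite(values):
    """Remove NaN and infinite entries so histogram binning stays well defined."""
    return [v for v in values if math.isfinite(v)]


def rescale_to_unit(values):
    """Min-max rescale finite floats into [0, 1], the range used for float images."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant input has no contrast; map it to mid-gray instead of dividing by zero.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Min-max rescaling is only one option: clipping to a fixed range keeps tiles comparable across steps, whereas per-image rescaling maximizes contrast but hides changes in absolute magnitude.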

## Other meanings of "summary" in machine learning

The word also appears in unrelated contexts. `model.summary()` in Keras prints a human readable table of layers, shapes, and parameter counts. Text summarization is the task of producing a short version of a document, handled by models such as BART, T5, and Pegasus. "Data summary" can mean descriptive statistics over a dataset, or it can refer to broader techniques for reducing data and models, such as dimensionality reduction with PCA or t-SNE, model compression by [quantization](/wiki/quantization), pruning, and knowledge distillation, and ensemble methods like [bagging](/wiki/bagging) and [boosting](/wiki/boosting). Those are separate concepts that share a name; in TensorFlow code, "summary" almost always means a `tf.summary` event.

## History

Summary ops have been part of TensorFlow since its first public release in 2015, which also shipped TensorBoard. In early TensorFlow 0.x releases, summaries were tied to a `SummaryWriter` that lived in the same process as the training loop and shared a `Session`. The mental model assumed a single graph, a single session, and explicit `merge_all` calls. That model was awkward in research code that wanted to log custom values on the fly, and it broke entirely when eager execution arrived.

The redesign for TensorFlow 2 was tracked through a public RFC process on GitHub during 2018 and 2019. The shipping API removed `merge_all`, replaced `FileWriter` with `create_file_writer`, made the global step explicit, and renamed the second positional argument from `tensor` to `data`. The migration script `tf_upgrade_v2` was released alongside TensorFlow 2.0 in September 2019 to ease porting. The compat layer `tf.compat.v1.summary` retains the v1 surface for code that has not been ported.

Around the same time, the broader experiment tracking ecosystem grew. Weights & Biases launched its public beta in early 2018, MLflow made its first public release in mid 2018, Neptune.ai expanded its product around the same period, and Aim and ClearML appeared shortly after. All of them shipped TensorBoard sync paths, partly because TensorBoard was already the de facto local viewer and partly because the `tfevents` format was easy to ingest. The result is that `tf.summary` today sits at the bottom of a layered stack: the raw events on disk feed both TensorBoard and any number of higher level platforms.

## References

* TensorFlow, *Module: tf.summary*, https://www.tensorflow.org/api_docs/python/tf/summary
* TensorFlow, *tf.summary.scalar*, https://www.tensorflow.org/api_docs/python/tf/summary/scalar
* TensorFlow, *tf.summary.histogram*, https://www.tensorflow.org/api_docs/python/tf/summary/histogram
* TensorFlow, *tf.summary.image*, https://www.tensorflow.org/api_docs/python/tf/summary/image
* TensorFlow, *tf.summary.audio*, https://www.tensorflow.org/api_docs/python/tf/summary/audio
* TensorFlow, *tf.summary.text*, https://www.tensorflow.org/api_docs/python/tf/summary/text
* TensorFlow, *tf.summary.create_file_writer*, https://www.tensorflow.org/api_docs/python/tf/summary/create_file_writer
* TensorFlow, *tf.summary.SummaryWriter*, https://www.tensorflow.org/api_docs/python/tf/summary/SummaryWriter
* TensorFlow, *tf.keras.callbacks.TensorBoard*, https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/TensorBoard
* TensorFlow, *Migrating tf.summary usage to TF 2.x*, https://www.tensorflow.org/tensorboard/migrate
* TensorFlow, *Get started with TensorBoard*, https://www.tensorflow.org/tensorboard/get_started
* TensorFlow, *Logging training data with TensorBoard*, https://www.tensorflow.org/tensorboard/scalars_and_keras
* TensorFlow, *Displaying image data in TensorBoard*, https://www.tensorflow.org/tensorboard/image_summaries
* TensorFlow, *Displaying text data in TensorBoard*, https://www.tensorflow.org/tensorboard/text_summaries
* TensorFlow, *Examining the TensorFlow Graph*, https://www.tensorflow.org/tensorboard/graphs
* TensorFlow, *TensorBoard Profiler for Keras*, https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras
* TensorFlow, *Hyperparameter tuning with HParams*, https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams
* TensorFlow, *TensorBoard.dev DataFrame API*, https://www.tensorflow.org/tensorboard/dataframe_api
* TensorBoard, *Mesh plugin README*, https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/mesh/README.md
* PyTorch, *torch.utils.tensorboard*, https://docs.pytorch.org/docs/main/tensorboard.html
* MLflow, *ML Experiment Tracking*, https://mlflow.org/docs/latest/ml/tracking/
* Weights and Biases, *Log objects and media*, https://docs.wandb.ai/models/track/log
* Weights and Biases, *TensorBoard integration*, https://docs.wandb.ai/guides/integrations/tensorboard/
* Aim, *Open source experiment tracker*, https://aimstack.io/

