See also: Machine learning terms, TensorFlow, Data Visualization
TensorBoard is an open-source visualization toolkit for machine learning experimentation. Originally developed by the Google Brain team as part of the TensorFlow ecosystem, TensorBoard provides interactive web-based dashboards that allow researchers and engineers to track metrics, visualize model architectures, inspect tensor distributions, profile performance, and much more. Although it was built with TensorFlow in mind, TensorBoard has grown into a framework-agnostic tool with official support for PyTorch, Keras, and other deep learning libraries.
TensorBoard reads data from event log files that training scripts generate during execution. These log files contain serialized protocol buffers ("summaries") that record scalar metrics, images, histograms, computational graphs, embeddings, and other data types. A lightweight web server then renders that data through a collection of dashboard plugins, each dedicated to a different visualization type. The result is a single browser-based interface where practitioners can monitor training runs in real time, compare experiments side by side, and diagnose problems without writing additional analysis code.
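Under the hood, each event file is a sequence of length-prefixed records (the TFRecord framing): an 8-byte little-endian payload length, a CRC of that length, the serialized summary bytes, and a CRC of the payload. The sketch below illustrates that framing in pure Python only as a mental model; the real format uses a masked CRC32C checksum, for which `zlib.crc32` stands in here, so files written this way would not be byte-compatible with TensorBoard's reader.

```python
import struct
import zlib


def mask_crc(value: int) -> int:
    """TFRecord-style CRC masking (real files mask a CRC32C, not zlib's CRC32)."""
    rotated = ((value >> 15) | (value << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def write_record(payload: bytes) -> bytes:
    """Frame one record: length, length CRC, payload, payload CRC."""
    header = struct.pack("<Q", len(payload))
    return (header
            + struct.pack("<I", mask_crc(zlib.crc32(header)))
            + payload
            + struct.pack("<I", mask_crc(zlib.crc32(payload))))


def read_record(buf: bytes) -> bytes:
    """Recover the payload from a framed record (CRCs not verified in this sketch)."""
    (length,) = struct.unpack_from("<Q", buf, 0)
    return buf[12:12 + length]
```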
Since it first shipped alongside TensorFlow's initial open-source release in November 2015, TensorBoard has accumulated over 7,100 stars on GitHub and remains one of the most widely used visualization tools in the machine learning community.
TensorBoard traces its origins to Google Brain, the research division inside Google (later merged into Google DeepMind) that created TensorFlow. When TensorFlow was open-sourced in November 2015, TensorBoard shipped as a bundled companion tool. At that stage, TensorBoard offered a small, predetermined set of visualizations: scalars, histograms, and computational graph exploration. These covered the most common needs of deep learning practitioners but left little room for customization.
In September 2017, Google announced a new plugin API for TensorBoard. The motivation was straightforward: without reusable APIs, adding new visualizations required intimate knowledge of TensorBoard internals, and contributions from the broader community were rare. The plugin API split each visualization into three layers (data logging, backend serving, and frontend rendering), making it possible for external developers to build and distribute their own TensorBoard plugins as standard Python packages.
The transition from TensorFlow 1.x to TensorFlow 2.0 in 2019 brought significant changes to the summary API. In TensorFlow 1.x, users had to manually wire summary operations into the session graph and call Session.run() to generate log data, a workflow that was awkward under eager execution. TensorFlow 2.0 introduced a redesigned tf.summary module that writes data immediately when executed, making it natural to use in both eager and graph modes. The new API replaced the old tf.summary.merge_all() pattern with explicit file writers and context managers.
Over the years, Google added several major features to TensorBoard:
| Year | Milestone |
|---|---|
| 2015 | TensorBoard released with the initial open-source TensorFlow release; Scalars, Histograms, and Graphs dashboards available |
| 2016 | Embedding Projector added for high-dimensional data visualization |
| 2017 | Plugin API launched, enabling third-party dashboard extensions |
| 2018 | What-If Tool introduced for model fairness and interpretability analysis |
| 2019 | HParams dashboard released for hyperparameter tuning; TensorBoard.dev launched as a free hosted sharing service; PyTorch added native TensorBoard support via torch.utils.tensorboard |
| 2020 | TensorFlow Profiler integrated into TensorBoard for GPU and TPU performance analysis |
| 2023 | TensorBoard.dev shut down on December 31; Google recommended Vertex AI TensorBoard as the managed replacement |
TensorBoard is licensed under the Apache License 2.0 and is written primarily in Python (backend) and TypeScript (frontend), with Bazel as the build system.
TensorBoard can be installed through several package managers. It is automatically included when you install TensorFlow, but it can also be installed independently for use with PyTorch or other frameworks.
```shell
# Via pip
pip install tensorboard

# Or via conda
conda install -c conda-forge tensorboard
```
After a training script has written event files to a log directory, start TensorBoard from the command line:
```shell
tensorboard --logdir=runs
```
By default, TensorBoard serves its web interface at http://localhost:6006. You can change the port using the --port flag. The --logdir argument accepts a single directory or a comma-separated list of directories, allowing you to compare multiple experiments in one view.
TensorBoard can also run inline inside Jupyter notebooks using the %tensorboard magic command:
```
%load_ext tensorboard
%tensorboard --logdir runs
```
This embeds the TensorBoard interface directly in the notebook cell output, which is especially convenient in Google Colab environments.
TensorBoard expects each experiment run to live in its own subdirectory. Mixing event files from multiple runs in a single folder can produce confusing, overlapping plots. A typical directory layout looks like this:
```
runs/
    experiment_1/
        events.out.tfevents.1234567890.hostname
    experiment_2/
        events.out.tfevents.1234567891.hostname
```
When TensorBoard detects multiple subdirectories, it displays each as a separate run with its own color-coded line on the charts.
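Run discovery can be approximated in a few lines: any subdirectory containing a file whose name starts with the `events.out.tfevents` prefix is treated as a run. The sketch below is a simplification of TensorBoard's actual loader logic, not its implementation:

```python
import os


def discover_runs(logdir: str) -> list[str]:
    """Return subdirectories (relative to logdir) that contain event files."""
    runs = []
    for dirpath, _dirnames, filenames in os.walk(logdir):
        if any(name.startswith("events.out.tfevents") for name in filenames):
            runs.append(os.path.relpath(dirpath, logdir))
    return sorted(runs)
```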
TensorBoard organizes its functionality into a series of dashboard plugins. Each plugin handles a specific type of data and provides its own interactive visualization. The following sections describe the built-in dashboards.
The Scalars dashboard is arguably the most frequently used component of TensorBoard. It plots scalar values (single numbers) over training steps or wall-clock time. Common use cases include tracking loss function values, accuracy, learning rate schedules, and custom metrics.
The dashboard supports smoothing controls that apply exponential moving averages to noisy curves, making it easier to spot trends. Users can overlay multiple runs on the same chart to compare experiments, zoom into specific step ranges, and toggle between step-based and time-based x-axes. Regex-based tag filtering lets users focus on specific metric groups when a training run logs many different scalars.
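The smoothing control is commonly described as a debiased exponential moving average. Here is a pure-Python sketch of that calculation; the frontend's exact implementation may differ in detail:

```python
def smooth(values, weight):
    """Debiased exponential moving average, as applied by the smoothing slider.

    weight=0 returns the raw values; weight close to 1 smooths heavily.
    """
    smoothed, last, num = [], 0.0, 0
    for v in values:
        last = last * weight + (1 - weight) * v
        num += 1
        # Divide out the bias toward zero from initializing `last` at 0.
        smoothed.append(last / (1 - weight ** num))
    return smoothed
```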
The Histograms dashboard visualizes the distribution of tensor values over time. This is particularly useful for monitoring how model weights, biases, and activation function outputs evolve during training. Each histogram represents a snapshot of a tensor at a particular training step, and consecutive snapshots are stacked along the time axis to form a 3D-like "ridge" plot.
Practitioners use histograms to detect common training pathologies. For example, weights that collapse toward zero may indicate a vanishing gradient problem, while weights that grow unboundedly may signal exploding gradients. Healthy training typically shows weight distributions that gradually narrow and stabilize as the model converges.
The Distributions dashboard provides a condensed, 2D alternative to the Histograms view. Instead of showing full histograms, it plots summary statistics (percentiles) of tensor distributions over time. The result resembles a set of overlapping confidence bands that convey the same information as histograms but in a more compact format.
The Distributions dashboard is often easier to read when comparing many tensors at once, since the flat 2D layout scales better than stacked 3D histograms.
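The percentile reduction behind this view can be pictured with a small sketch: at each step, a tensor snapshot is collapsed to a handful of percentile points, which become the band boundaries plotted over time. The percentile set below is illustrative only; TensorBoard's compressed histograms use their own fixed set of points.

```python
PERCENTILES = (0, 25, 50, 75, 100)  # illustrative; not TensorBoard's actual set


def distribution_summary(values):
    """Reduce one tensor snapshot to the percentile points plotted as bands."""
    ordered = sorted(values)
    n = len(ordered)

    def pct(p):
        # Nearest-rank percentile, adequate for a sketch.
        idx = min(n - 1, max(0, round(p / 100 * (n - 1))))
        return ordered[idx]

    return {p: pct(p) for p in PERCENTILES}
```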
The Images dashboard displays image data logged during training. Common uses include visualizing input training samples, intermediate feature maps from convolutional neural network layers, generated outputs from generative adversarial networks, and augmented training images produced by data preprocessing pipelines.
Images are logged with tags and step numbers, allowing users to scrub through the timeline and observe how visual outputs change as training progresses. This is invaluable for tasks in computer vision where numerical metrics alone do not fully capture model behavior.
The Audio dashboard works similarly to the Images dashboard but for audio data. It embeds playable audio widgets for waveforms logged via summary operations. By default it shows the latest audio clip for each tag, with a slider for browsing earlier steps.
Audio logging is commonly used in speech recognition, text-to-speech synthesis, and music generation projects, where listening to model outputs is the most direct way to assess quality.
The Text dashboard renders text data logged during training. It supports Markdown formatting, which makes it useful for logging sample predictions from natural language processing models, configuration summaries, or free-form notes about an experiment.
The Graphs dashboard provides an interactive visualization of a model's computational graph. Nodes represent operations (such as matrix multiplications, convolutions, or activation functions), and edges represent the tensor data flowing between them. Users can expand and collapse grouped operations, search for specific nodes, and inspect the shapes and data types of tensors at each edge.
The Graphs dashboard is particularly helpful for debugging model architecture issues. Misconnected layers, unexpected tensor shapes, and redundant operations are often easier to spot in a visual graph than in source code. In TensorFlow 2.x, graph tracing is handled through tf.summary.trace_on() and tf.summary.trace_export(), which capture the graph structure of @tf.function-decorated code.
The Embedding Projector allows users to visualize high-dimensional data in two or three dimensions using dimensionality reduction techniques. It supports three projection methods:
| Method | Description |
|---|---|
| PCA | Principal Component Analysis projects data onto its top principal components, preserving maximum variance |
| t-SNE | t-distributed Stochastic Neighbor Embedding emphasizes local structure and is effective at revealing clusters |
| Custom | Users can specify their own linear projection axes |
The Projector is widely used for exploring word embeddings (such as those from Word2Vec or GloVe), image embeddings from classification models, and any other high-dimensional learned representations. Users can click on individual points to see their nearest neighbors, search for specific items by label, and adjust projection parameters in real time.
A standalone version of the Projector is also available at projector.tensorflow.org, where users can upload their own embedding data without installing TensorBoard locally.
The HParams (hyperparameters) dashboard helps practitioners analyze the results of hyperparameter tuning experiments. After logging hyperparameter values and corresponding metrics for each trial, the dashboard provides three coordinated views:

- Table View: lists every trial with its hyperparameter values and final metrics, sortable and filterable by column
- Parallel Coordinates View: draws each trial as a line crossing one vertical axis per hyperparameter and metric, making high-performing regions visible at a glance
- Scatter Plot Matrix View: shows pairwise scatter plots comparing hyperparameters against metrics
The HParams dashboard supports discrete, real-valued, and boolean hyperparameter types. It works with both grid search and random search strategies, as well as more advanced optimization methods like Bayesian optimization.
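Independent of framework, the data the dashboard consumes amounts to one record of hyperparameter values plus resulting metrics per trial. A small sketch of that shape, with made-up hyperparameter names (`lr`, `units`, `dropout`) and a made-up metric (`val_acc`):

```python
# Each trial pairs a hyperparameter configuration with its resulting metrics.
trials = [
    {"hparams": {"lr": 0.1,  "units": 64,  "dropout": True},  "metrics": {"val_acc": 0.89}},
    {"hparams": {"lr": 0.01, "units": 128, "dropout": False}, "metrics": {"val_acc": 0.93}},
    {"hparams": {"lr": 0.01, "units": 64,  "dropout": True},  "metrics": {"val_acc": 0.91}},
]


def best_trial(trials, metric):
    """Return the trial with the highest value for the given metric."""
    return max(trials, key=lambda t: t["metrics"][metric])
```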
The PR Curves plugin plots precision-recall curves that show how a classifier's precision and recall change across different confidence thresholds. This is particularly useful for evaluating models on imbalanced datasets where accuracy alone can be misleading. The area under the PR curve provides a single-number summary of performance, and the interactive threshold slider lets users explore the tradeoff between precision and recall at specific operating points.
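The underlying computation can be sketched in a few lines: sweep candidate thresholds and count true positives, false positives, and false negatives at each. This is a simplified stand-in for what the plugin computes from logged predictions and labels, not its actual implementation:

```python
def pr_curve(labels, scores, thresholds):
    """Precision and recall of the rule `score >= threshold` at each threshold."""
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(labels, scores) if s < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        points.append((t, precision, recall))
    return points
```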
The Mesh plugin renders 3D point clouds and triangulated meshes directly in TensorBoard. Users can rotate, zoom, and pan through the 3D scene. The plugin supports configurable lighting, camera positions, and material properties. It is used in computer vision research involving 3D reconstruction, depth estimation, and point cloud segmentation.
The Custom Scalars plugin extends the basic Scalars dashboard by allowing users to define multi-line charts that combine scalar data from different tags into a single plot, with optional margin areas (for example, to show mean plus/minus standard deviation). The layout is specified through a protocol buffer configuration, giving users fine-grained control over chart organization.
TensorBoard integration with TensorFlow is handled through the tf.summary API. In TensorFlow 2.x, the typical workflow looks like this:
```python
import tensorflow as tf

# Create a summary writer
writer = tf.summary.create_file_writer("runs/experiment_1")

# Inside the training loop
for step in range(num_steps):
    loss = train_step(model, data)
    with writer.as_default():
        tf.summary.scalar("loss", loss, step=step)
        tf.summary.scalar("learning_rate", optimizer.learning_rate, step=step)
```
For Keras users, the integration is even simpler. Keras provides a built-in TensorBoard callback that automatically logs training and validation metrics, model graphs, weight histograms, and more:
```python
from tensorflow.keras.callbacks import TensorBoard

tb_callback = TensorBoard(
    log_dir="runs/experiment_1",
    histogram_freq=1,          # Log weight histograms every epoch
    write_graph=True,          # Log the model graph
    write_images=True,         # Log weight images
    update_freq="epoch",       # Log metrics per epoch
    profile_batch="500,520",   # Profile batches 500 through 520
)

model.fit(
    train_data,
    epochs=50,
    callbacks=[tb_callback],
)
```
The Keras callback handles file writer creation, metric logging, and cleanup automatically. Setting histogram_freq to a positive integer enables weight histogram logging at the specified epoch interval, which is useful for monitoring overfitting and weight distribution health.
Starting with PyTorch 1.2 (released in 2019), the torch.utils.tensorboard module provides native TensorBoard support. Before this official integration, the community relied on the third-party tensorboardX library (also known as tensorboard-pytorch), which pioneered PyTorch-TensorBoard interoperability.
The central class in the PyTorch integration is SummaryWriter, which creates event files and provides methods for logging various data types:
```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/experiment_1")

for epoch in range(num_epochs):
    train_loss = train_one_epoch(model, train_loader)
    val_loss = evaluate(model, val_loader)
    writer.add_scalar("Loss/train", train_loss, epoch)
    writer.add_scalar("Loss/val", val_loss, epoch)
    writer.add_histogram("layer1.weight", model.layer1.weight, epoch)

writer.add_graph(model, sample_input)
writer.close()
```
The SummaryWriter class writes data asynchronously, so logging calls do not block the training loop. The following table summarizes the key logging methods:
| Method | Purpose | Dashboard |
|---|---|---|
| `add_scalar()` | Log a single scalar value | Scalars |
| `add_scalars()` | Log multiple scalars under one tag group | Scalars |
| `add_histogram()` | Log a tensor's value distribution | Histograms |
| `add_image()` | Log a single image | Images |
| `add_images()` | Log a batch of images | Images |
| `add_figure()` | Log a Matplotlib figure | Images |
| `add_video()` | Log a video clip | Images |
| `add_audio()` | Log an audio waveform | Audio |
| `add_text()` | Log a text string | Text |
| `add_graph()` | Log the model computational graph | Graphs |
| `add_embedding()` | Log high-dimensional embedding data | Projector |
| `add_pr_curve()` | Log precision-recall data | PR Curves |
| `add_mesh()` | Log 3D point cloud or mesh data | Mesh |
| `add_hparams()` | Log hyperparameters and associated metrics | HParams |
PyTorch Lightning provides first-class TensorBoard support through its TensorBoardLogger class. In fact, TensorBoard is the default logger in Lightning, meaning it works out of the box without extra configuration.
```python
from lightning.pytorch import Trainer
from lightning.pytorch.loggers import TensorBoardLogger

logger = TensorBoardLogger("tb_logs", name="my_model")
trainer = Trainer(logger=logger, max_epochs=100)
trainer.fit(model, train_dataloader, val_dataloader)
```
Inside a Lightning module, metrics are logged using self.log(), which automatically routes values to the configured logger. Logged metrics appear in TensorBoard under tags that correspond to the metric names. Lightning also supports custom callbacks that access the underlying TensorBoard writer for advanced logging (such as writing weight histograms or sample images at the end of each validation epoch).
After training, the logs can be viewed with:
```shell
tensorboard --logdir=tb_logs/
```
The TensorFlow Profiler is a performance analysis tool integrated into TensorBoard that helps identify computational bottlenecks in model training. It captures detailed timing information about operations running on CPUs, GPUs, and TPUs, then presents the results through several specialized views.
The overview page provides a high-level summary of a profiling session, including an average step-time breakdown that shows how much time is spent on input processing, device computation, host computation, and communication. It also provides automated recommendations for improving performance, such as suggestions to optimize the data input pipeline or to use mixed-precision training.
The Trace Viewer displays a detailed timeline of all operations executed during the profiled steps. The horizontal axis represents time, and each row corresponds to a different thread or device stream. Users can zoom into specific regions to inspect individual kernel launches, memory copies between host and device, and synchronization points.
The Trace Viewer is essential for diagnosing GPU underutilization. In an efficient training loop, the GPU should be busy almost continuously, with minimal gaps between kernel executions. Long gaps typically indicate that the GPU is waiting for data from the host (an input pipeline bottleneck) or waiting for the host to enqueue new operations (a host-side CPU bottleneck).
The Input Pipeline Analyzer breaks down the time spent in each stage of the tf.data input pipeline. It identifies which pipeline operations (such as reading from disk, parsing, batching, or augmentation) are the slowest, and it provides specific recommendations for optimization. Common suggestions include prefetching data, parallelizing map transformations, and caching datasets that fit in memory.
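The prefetching idea itself is framework-agnostic. A minimal sketch using a bounded queue and a worker thread shows why it helps: batch preparation overlaps with consumption instead of alternating with it. (tf.data implements this natively via `Dataset.prefetch`; this is an illustration, not a replacement.)

```python
import queue
import threading


def prefetch(iterable, buffer_size=2):
    """Yield items from `iterable`, produced ahead of time on a worker thread."""
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()

    def producer():
        for item in iterable:
            q.put(item)          # blocks when the buffer is full
        q.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while (item := q.get()) is not sentinel:
        yield item
```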
The Memory Profile view shows GPU memory usage over time during the profiled steps. It reveals peak memory consumption and can help identify opportunities to reduce memory usage through techniques like gradient checkpointing, smaller batch sizes, or more memory-efficient model architectures.
The Kernel Statistics view lists all GPU kernels executed during profiling, sorted by total execution time. This makes it straightforward to identify the most expensive operations in the model. Users can see the average duration, number of invocations, and the percentage of total GPU time consumed by each kernel.
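The aggregation behind such a table is straightforward to sketch: group raw (kernel name, duration) events, then sort by total time. The field names below are illustrative, not the plugin's actual schema:

```python
from collections import defaultdict


def kernel_stats(events):
    """Aggregate (kernel_name, duration_us) events into a sorted stats table."""
    totals, counts = defaultdict(float), defaultdict(int)
    for name, duration in events:
        totals[name] += duration
        counts[name] += 1
    grand_total = sum(totals.values())
    table = [
        {"kernel": name,
         "total_us": totals[name],
         "calls": counts[name],
         "avg_us": totals[name] / counts[name],
         "pct_of_gpu_time": 100 * totals[name] / grand_total}
        for name in totals
    ]
    # Most expensive kernels first, as in the dashboard's default sort.
    return sorted(table, key=lambda row: row["total_us"], reverse=True)
```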
TensorBoard also supports profiling for PyTorch workloads. The torch.profiler module can export trace data in a format that TensorBoard understands, providing similar timeline and performance analysis views. The PyTorch profiler plugin supports GPU kernel analysis, memory profiling, and operator-level timing breakdowns.
```python
import torch.profiler

with torch.profiler.profile(
    activities=[torch.profiler.ProfilerActivity.CPU,
                torch.profiler.ProfilerActivity.CUDA],
    on_trace_ready=torch.profiler.tensorboard_trace_handler("runs/profiler"),
    record_shapes=True,
    with_stack=True,
) as prof:
    for step, data in enumerate(train_loader):
        train_step(model, data)
        prof.step()
```
The What-If Tool (WIT) is a TensorBoard plugin developed by Google's PAIR (People + AI Research) team that provides a code-free interface for exploring trained machine learning models. It allows users to perform "what-if" analyses by editing individual data points and observing how predictions change.
Key capabilities of the What-If Tool include:

- Editing individual datapoint features and re-running inference to see how predictions change
- Finding the nearest counterfactual: the most similar datapoint that receives a different prediction
- Plotting partial dependence of predictions on individual features
- Comparing two models side by side on the same dataset
- Slicing a dataset by feature values and evaluating fairness metrics across slices with adjustable decision thresholds
The What-If Tool was introduced in 2018 and integrates with TensorFlow models served through TensorFlow Serving. It is no longer under active development as of TensorBoard 2.12; Google recommends the Learning Interpretability Tool (LIT) as its successor for model analysis and interpretability.
In December 2019, Google launched TensorBoard.dev, a free cloud-hosted service that allowed users to upload TensorBoard log data and share interactive dashboards via a public URL. The service required a Google Account for uploading but allowed anyone with the link to view the dashboard without authentication.
Uploading experiments was handled through a simple command:
```shell
tensorboard dev upload --logdir runs/experiment_1
```
TensorBoard.dev was useful for sharing results in academic papers, blog posts, and collaborative research. However, the service had limitations: a 10 million data point limit per user, no privacy controls beyond link sharing, and no support for deleting individual experiments after upload.
Google shut down TensorBoard.dev on December 31, 2023. After that date, the tensorboard dev subcommand displays an error message and does not send any requests to the former service.
As the replacement for TensorBoard.dev, Google offers Vertex AI TensorBoard as part of Google Cloud Platform. Vertex AI TensorBoard provides managed, persistent TensorBoard instances with enterprise features:

- Persistent storage of experiment data beyond the lifetime of any single training job
- Access control through Google Cloud IAM rather than public links
- Integration with Vertex AI training jobs and pipelines
- Dashboard sharing with teammates inside an organization
Vertex AI TensorBoard originally required a per-user monthly license costing $300. In 2024, Google changed the pricing model to a storage-based fee of $10 per GiB per month, removing the subscription requirement and making the service more accessible for smaller teams and individual practitioners.
TensorBoard's extensibility comes from its plugin-based architecture. Each visualization component is a self-contained plugin that can be developed, distributed, and installed independently. The architecture consists of three layers:
Plugins expose Python APIs that users call from their training scripts to write data. These APIs produce summary protocol buffers containing a tag (string identifier), a step (temporal index), a tensor (the actual data), and metadata that specifies which plugin owns the summary. The summaries are serialized and appended to event files on disk.
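As a rough mental model of those fields (names illustrative; the real format is a protocol buffer, not a dataclass):

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Illustrative stand-in for TensorBoard's summary protocol buffer."""
    tag: str                 # string identifier, e.g. "Loss/train"
    step: int                # temporal index within the run
    tensor: list             # the logged data (a real summary holds a tensor proto)
    plugin_name: str         # metadata routing the summary to its owning plugin


record = Summary(tag="Loss/train", step=42, tensor=[0.173], plugin_name="scalars")
```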
The backend is a Python web server that reads event files, performs any necessary post-processing, and serves the processed data through HTTP endpoints. Each plugin registers its routes (URLs) with TensorBoard's core server. The backend handles data multiplexing, which allows a single TensorBoard instance to serve data from multiple log directories.
The frontend consists of web components (originally built with Polymer, later migrating to standard web components) that render the visualizations in the browser. Each plugin provides its own frontend module that fetches data from the backend and draws interactive charts, images, or other visual elements.
Plugin discovery uses the Python entry_points mechanism. When TensorBoard starts, it scans installed packages for declared TensorBoard plugins and loads them automatically. This means users can install third-party plugins via pip and have them appear in TensorBoard without any manual configuration:
```shell
pip install some-tensorboard-plugin
tensorboard --logdir runs/
```
Notable third-party plugins include the Open3D TensorBoard plugin for advanced 3D visualization, and various domain-specific plugins for audio analysis, natural language evaluation, and more.
While TensorBoard remains one of the most widely used visualization tools for machine learning, several alternative platforms have emerged that offer different tradeoffs between features, scalability, and ease of use.
| Feature | TensorBoard | Weights & Biases | MLflow | Comet ML | ClearML | Aim |
|---|---|---|---|---|---|---|
| Pricing | Free, open-source | Free tier; paid plans for teams | Free, open-source | Free tier; paid plans | Free tier; paid plans | Free, open-source |
| Hosting | Self-hosted (local) | Cloud-hosted or self-hosted | Self-hosted or managed | Cloud-hosted or self-hosted | Cloud-hosted or self-hosted | Self-hosted |
| Framework Support | TensorFlow, PyTorch, Keras, and others | Framework-agnostic | Framework-agnostic | Framework-agnostic | Framework-agnostic | Framework-agnostic |
| Team Collaboration | Limited (single-user oriented) | Strong (dashboards, reports, alerts) | Moderate (shared tracking server) | Strong (comments, tagging, user management) | Good (multi-user support) | Moderate (shared server) |
| Experiment Comparison | Basic (overlaying runs) | Advanced (custom tables, grouping) | Good (search and filter) | Advanced (diff views, custom panels) | Good (metric-based sorting) | Advanced (custom queries) |
| Hyperparameter Tracking | HParams dashboard | Sweeps with built-in optimization | Parameter logging and search | Built-in optimization | Built-in optimization | Exploratory search |
| Auto-logging | No (manual summary calls) | Yes | Yes (for popular frameworks) | Yes | Yes | Yes |
| Data/Code Versioning | No | Artifacts system | Model Registry | Artifacts | Dataset versioning | No |
| Scalability | Degrades with many runs | Handles large-scale experiments | Good with remote server | Good | Good | Good (local performance) |
TensorBoard's primary advantages are its zero cost, deep integration with TensorFlow and Keras, and the fact that it requires no external service or account. For individual researchers working with TensorFlow, it is often the simplest path to experiment visualization. The computational graph viewer and the Embedding Projector remain features that few competitors replicate with the same level of polish.
TensorBoard's main weaknesses become apparent at scale. It does not store experiment metadata (such as code versions, environment details, or Git commits) and therefore cannot provide full reproducibility on its own. Its single-user, local-first design makes team collaboration difficult without additional infrastructure. The web interface can become sluggish when loading thousands of runs or millions of data points. Additionally, TensorBoard lacks auto-logging capabilities, so users must manually instrument their code with summary calls.
For teams that need robust collaboration features, experiment management at scale, and automated logging, platforms like Weights & Biases, MLflow, or Comet ML are common choices. Many practitioners use TensorBoard for quick local visualization during development and switch to a managed platform for production-scale experiment tracking.
TensorBoard excels at overlaying metrics from multiple training runs. By organizing experiments into subdirectories under a common parent folder, users can compare learning curves, identify the best-performing configuration, and spot anomalies. The sidebar provides checkboxes to toggle individual runs on and off, and a regex filter to select runs matching a specific naming pattern.
The Scalars dashboard includes a smoothing slider that controls the degree of exponential moving average applied to plotted curves. A smoothing value of 0 shows raw data, while values closer to 1 produce increasingly smooth curves. This is useful for training runs with high variance in per-step metrics (common with small mini-batch sizes) where the raw plot is too noisy to interpret.
When training scripts log many different metrics, the TensorBoard interface can become crowded. Tag filtering (using regular expressions) allows users to focus on a specific subset of metrics. For example, a regex like Loss/.* would show only tags starting with "Loss/".
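Assuming the filter box follows ordinary regular-expression matching, its behavior can be reproduced with Python's re module:

```python
import re


def filter_tags(tags, pattern):
    """Keep only tags matching the regex, as the dashboard filter box does."""
    regex = re.compile(pattern)
    return [t for t in tags if regex.search(t)]
```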
The Custom Scalars plugin supports user-defined chart layouts that group related metrics together. This is configured through a protocol buffer that specifies which tags appear on each chart and whether margin areas (upper/lower bounds) should be displayed.
When TensorBoard runs on a remote machine (such as a cloud GPU instance), users can access it through SSH port forwarding:
```shell
ssh -L 6006:localhost:6006 user@remote-host
```
This tunnels the remote TensorBoard port to the local machine, allowing the user to open http://localhost:6006 in a local browser.
Developers who need visualizations beyond what built-in plugins offer can create custom TensorBoard plugins. The process involves:

- Defining a summary-writing API that tags logged data with the plugin's name
- Implementing a backend class that subclasses TBPlugin and registers HTTP routes for serving the processed data
- Building a frontend module that fetches data from those routes and renders the visualization
- Packaging the plugin as a Python package with a TensorBoard entry point so it can be discovered automatically

Google provides a plugin example repository and documentation to guide developers through this process.
Imagine you're building a Lego tower, and you want to make sure it's strong and stable. You also want to understand how each Lego piece is connected and how they work together. TensorBoard is like a magnifying glass that helps you see and understand how your Lego tower (or machine learning model) is built, how strong it is, and how you can make it even better. It shows you pictures and graphs to help you see what's happening with your model, making it easier to fix any problems or make improvements.