# Node (TensorFlow graph)

> Source: https://aiwiki.ai/wiki/node_tensorflow_graph
> Updated: 2026-04-27
> Categories: Deep Learning, Developer Tools
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

*See also: [Machine learning terms](/wiki/machine_learning_terms), [TensorFlow](/wiki/tensorflow), [Computational graph](/wiki/computational_graph), [Tensor](/wiki/tensor)*

## Node (TensorFlow graph)

In [machine learning](/wiki/machine_learning), a **node** is the fundamental unit of a [computational graph](/wiki/computational_graph), the directed graph that represents the flow of data and operations in a [TensorFlow](/wiki/tensorflow) model. The graph is usually acyclic, although the low-level loop ops used by TensorFlow 1.x control flow (such as `NextIteration`) introduce cycles. A TensorFlow graph is composed of many nodes, each representing an operation or a variable, connected by edges that represent the data flowing between them. The graph is a core component of TensorFlow, an open-source library for numerical computation and machine learning developed by the [Google Brain](/wiki/google_brain) team and first released in November 2015 under the Apache 2.0 license.

In the TensorFlow Python API, a node is concretely represented by an instance of the `tf.Operation` class (often abbreviated as an *op*), and the data that flows along an edge between two nodes is represented by an instance of the `tf.Tensor` class. A whole graph is represented by an instance of `tf.Graph`. Together, these three abstractions form what the official TensorFlow documentation calls a **dataflow graph**, the data structure that describes a TensorFlow computation independently of the language used to construct it.[^tf-intro]
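As a minimal sketch of how these three classes relate, the snippet below builds a small graph explicitly and checks the type of each piece. The tensor names `x`, `w`, and `y` are illustrative choices, not part of any API.

```python
import tensorflow as tf

# Build a graph explicitly instead of relying on eager execution.
g = tf.Graph()
with g.as_default():
    x = tf.constant([[1.0, 2.0]], name="x")
    w = tf.constant([[3.0], [4.0]], name="w")
    y = tf.matmul(x, w, name="y")

# `y` is an edge (tf.Tensor); `y.op` is the node (tf.Operation) that produces it.
print(type(g).__name__, type(y.op).__name__)
print(y.op.type, [t.name for t in y.op.inputs])
print(y.graph is g)
```

Every tensor keeps a reference to the operation that produced it (`y.op`) and to the graph it belongs to (`y.graph`), which is what makes it possible to walk a graph from any tensor.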

The node concept is central to TensorFlow because the graph it lives inside is what makes the framework portable, parallelizable, and deployable. A graph can be saved to disk as a [GraphDef](/wiki/graphdef) protocol buffer, restored on a server, sent to a mobile phone, or compiled to GPU and TPU machine code by [XLA](/wiki/xla), all without needing the original Python program. Understanding what a node is, what it stores, and how it relates to its neighbors is therefore a prerequisite for understanding how TensorFlow trains, serves, and optimizes models.

## Computational graph foundation

A computational graph in mathematics and computer science is a directed graph whose vertices represent operations and whose edges represent operands. Every modern [deep learning](/wiki/deep_learning) framework, including TensorFlow, [PyTorch](/wiki/pytorch), [JAX](/wiki/jax), and [MXNet](/wiki/mxnet), is built around some version of this idea. The graph captures the structure of a calculation as a static data structure that can be analyzed, rewritten, scheduled, and differentiated.

A TensorFlow graph is more specifically a **dataflow graph**, a model that comes from the dataflow architectures studied by Jack Dennis at MIT in the 1970s. In a dataflow graph, an operation fires whenever all of its inputs are available. This makes the graph naturally parallel: any two nodes whose dependencies are satisfied can run at the same time, possibly on different devices. The OSDI 2016 paper by Abadi and colleagues at Google describes this design explicitly, noting that TensorFlow "uses dataflow graphs to represent computation, shared state, and the operations that mutate that state" and that it maps the nodes of those graphs across many machines and across many devices within a single machine.[^abadi-osdi]

The basic mathematical idea is simple. Suppose you want to compute `y = relu(matmul(x, W) + b)`. As a dataflow graph this becomes six nodes: three operation nodes (a `MatMul`, an `Add`, and a `Relu`) and three source nodes (the input `x`, the variable `W`, and the variable `b`), connected by edges that carry intermediate tensors. The graph encodes both the values that flow and the order in which they have to be produced. From this single description, TensorFlow can derive the [forward pass](/wiki/forward_pass), the [backward pass](/wiki/backpropagation) (by adding gradient nodes), the device placement, and the runtime schedule.
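The firing rule can be illustrated without TensorFlow at all. The following is a toy sketch in plain Python, not TensorFlow's actual scheduler: the `graph` dictionary, the `run_dataflow` helper, and the sample values are all hypothetical. It evaluates the same `relu(matmul(x, W) + b)` computation by repeatedly firing every node whose inputs are available.

```python
def matmul(x, W):
    # Row vector times matrix, written with plain lists.
    return [sum(xi * row[j] for xi, row in zip(x, W)) for j in range(len(W[0]))]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def relu(v):
    return [max(0.0, a) for a in v]

# name -> (function, names of the nodes whose outputs it consumes).
# Deliberately listed out of order: the edges, not the listing, fix the schedule.
graph = {
    "Relu":   (relu,   ["Add"]),
    "Add":    (add,    ["MatMul", "b"]),
    "MatMul": (matmul, ["x", "W"]),
    "b":      (lambda: [-1.0, -1.0], []),
    "W":      (lambda: [[1.0, -1.0], [0.5, 0.5]], []),
    "x":      (lambda: [1.0, 2.0], []),
}

def run_dataflow(nodes):
    """Fire nodes in waves; every node in one wave could run in parallel."""
    values, waves = {}, []
    remaining = dict(nodes)
    while remaining:
        ready = [name for name, (_, inputs) in remaining.items()
                 if all(i in values for i in inputs)]
        if not ready:
            raise ValueError("graph has a cycle or a missing input")
        for name in ready:
            fn, inputs = remaining.pop(name)
            values[name] = fn(*(values[i] for i in inputs))
        waves.append(ready)
    return values, waves

values, waves = run_dataflow(graph)
print(waves)           # source nodes first, then MatMul, Add, Relu
print(values["Relu"])
```

The three source nodes land in the same wave because none of them depends on the others, which is the property a real runtime exploits to run independent nodes concurrently.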

## Anatomy of a node

In the TensorFlow runtime, every node is encoded as a `NodeDef` protocol buffer message defined in `tensorflow/core/framework/node_def.proto`. A `NodeDef` is small and self-describing, which is what allows TensorFlow graphs to be serialized and restored without the original Python program.[^tool-dev] At runtime each node is also wrapped by a `tf.Operation` Python object that exposes the same information through a higher-level API.

The table below summarizes the most important fields exposed by `tf.Operation` and the corresponding `NodeDef` proto.

| Field | Type | Meaning |
|---|---|---|
| `name` | string | Unique identifier for this node within its graph (e.g. `dense_1/MatMul`). |
| `op` (or `type`) | string | The kind of operation, registered in the TensorFlow op registry (e.g. `MatMul`, `Conv2D`, `Relu`, `Identity`). |
| `inputs` | list of `Tensor` | Tensors consumed by this node. Each entry references an output of another node. |
| `outputs` | list of `Tensor` | Tensors produced by this node. Other nodes can take these as inputs. |
| `attr` | map of name to `AttrValue` | Static attributes such as `dtype`, `padding`, `strides`, kernel shape, etc. |
| `device` | string | Device specification, e.g. `/job:worker/replica:0/task:0/device:GPU:0`. |
| `control_inputs` | list of `Operation` | Other nodes that must finish executing before this node fires, even though no tensor flows between them. |

The `op` field is the bridge between the abstract description of the graph and the concrete kernel that will run on a device. When a graph is loaded, the op name is looked up in a registry (for example, `MatMul` resolves to a CUDA kernel on GPU, an Eigen kernel on CPU, and a TPU implementation when running on TPU). This indirection is what allows the same `GraphDef` to run on a phone, a server, and an accelerator.

Nodes can have zero inputs (sources, such as `Const`, `Variable`, or `Placeholder`), zero outputs (sinks, such as `NoOp` or `Save`), or any number of each. They can also have **control edges**, which are dashed edges in TensorBoard that represent ordering constraints with no data attached. Control edges are how variable assignments, queue operations, and side effects are sequenced in a graph that otherwise only sees pure data dependencies.
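A small sketch of how a control edge looks from Python, assuming an explicitly built `tf.Graph`; the node names `a`, `b`, and `c` are illustrative. It compares the data inputs of a node with its control inputs and shows how both end up in the serialized `NodeDef`.

```python
import tensorflow as tf

g = tf.Graph()
with g.as_default():
    a = tf.constant(1.0, name="a")
    b = tf.constant(2.0, name="b")
    # Force `c` to wait for `a`, although no tensor flows from `a` to `c`.
    with tf.control_dependencies([a.op]):
        c = tf.identity(b, name="c")

node = c.op
data_inputs = [t.name for t in node.inputs]
control_inputs = [op.name for op in node.control_inputs]
# In the NodeDef proto, control inputs share the `input` field, marked with "^".
serialized_inputs = list(node.node_def.input)

print(node.name, node.type, repr(node.device))
print(data_inputs, control_inputs, serialized_inputs)
print(sorted(node.node_def.attr.keys()))
```

The empty device string means no placement has been requested yet; the runtime fills it in when the graph is executed.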

## Node types

TensorFlow ships with several thousand built-in op kinds, but they fall into a small number of categories that every user encounters.

| Category | Examples | Role |
|---|---|---|
| Arithmetic ops | `Add`, `Sub`, `Mul`, `Div`, `MatMul`, `Conv2D` | Perform numerical computation on input tensors and produce output tensors. |
| Activation and reduction | `Relu`, `Sigmoid`, `Softmax`, `Sum`, `Mean` | Nonlinearities and reductions used in neural networks. `Sum` and `Mean` are the op types behind `tf.reduce_sum` and `tf.reduce_mean`. |
| Variables | `Variable`, `VariableV2`, `VarHandleOp`, `ReadVariableOp`, `AssignVariableOp` | Hold mutable state that persists across calls (weights, biases, optimizer slots). |
| Constants | `Const` | Immutable tensors baked into the graph. |
| Placeholders | `Placeholder`, `PlaceholderWithDefault` | Symbolic input slots. Used to feed external data in TF 1.x style programs. |
| Control flow | `Switch`, `Merge`, `Enter`, `Exit`, `NextIteration`, `LoopCond`, `If`, `While` | Implement conditionals and loops within a graph. `tf.cond` and `tf.while_loop` build these ops, and AutoGraph translates Python `if` and `while` into them. |
| Queue and dataset ops | `FIFOQueue`, `IteratorGetNext`, `MapDataset`, `BatchDataset` | Stage and pipeline input data for `tf.data` input pipelines. |
| I/O and serialization | `RestoreV2`, `SaveV2`, `ReadFile`, `DecodeJpeg` | Read and write tensors from disk or other sources. |
| Communication | `Send`, `Recv`, `CollectiveReduce`, `AllReduce` | Move tensors between devices and between machines in distributed runs. |
| No-ops and grouping | `NoOp`, `Identity`, `Group` | Used by the runtime to coordinate execution and as anchors for control dependencies. |

Variables deserve a special mention because they carry mutable state that persists across calls. In current TensorFlow, a `VarHandleOp` produces a resource handle to the variable's storage; `ReadVariableOp` reads the value through that handle and `AssignVariableOp` overwrites it. The legacy `Variable` and `VariableV2` ops instead output a reference tensor that an `Assign` op modifies in place. During training, the [optimizer](/wiki/optimizer) inserts these read and assign ops next to each variable so that gradient updates can be applied without breaking the dataflow contract.
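The read and assign ops can be observed directly. The sketch below assumes TensorFlow 2.x resource variables; the `counter` variable and the `increment` function are hypothetical names chosen for illustration.

```python
import tensorflow as tf

counter = tf.Variable(0.0, name="counter")

@tf.function
def increment():
    counter.assign_add(1.0)        # recorded as a stateful assign node
    return counter.read_value()    # recorded as a read node

# Tracing builds the graph but does not change the variable's value.
graph = increment.get_concrete_function().graph
op_types = [op.type for op in graph.get_operations()]
print(op_types)

increment()
increment()
print(counter.numpy())
```

Inside a traced function the variable itself is captured as a resource handle, so the graph contains the ops that act on the handle rather than the op that created it.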

## TensorFlow 1.x graphs versus TensorFlow 2.x eager execution

The meaning of a node is the same across TensorFlow versions, but the way users interact with the graph has changed considerably between TensorFlow 1.x and TensorFlow 2.x.

In **TensorFlow 1.x**, the graph was the primary interface. A typical program had two phases: a *construction* phase, where Python code added nodes to the default graph by calling functions like `tf.matmul`, `tf.constant`, or `tf.placeholder`, and an *execution* phase, where the user opened a `tf.Session` and called `session.run(fetches, feed_dict=...)` to actually compute tensor values. Nothing numeric happened in the construction phase; calling `tf.matmul(a, b)` simply added a `MatMul` node to the graph and returned a symbolic `tf.Tensor` handle. Inputs were supplied through `tf.placeholder` nodes whose values were filled in by the `feed_dict` argument at session run time.

This model is sometimes called **define-and-run** or **static graph** programming. It is fast and very deployable because the whole computation is known up front, but it is also awkward: control flow has to be written using ops like `tf.cond` and `tf.while_loop`, debugging requires special graph-aware tools, and intermediate tensor values are not visible without an explicit `session.run` call.

In **TensorFlow 2.x**, released in September 2019, eager execution became the default. Operations now run immediately when the corresponding Python function is called, in the same way as NumPy or PyTorch. Calling `tf.matmul(a, b)` with concrete tensors returns a concrete tensor, not a graph node. Sessions and placeholders are no longer needed in regular user code; the equivalent classes live in `tf.compat.v1` for backward compatibility.
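To make the contrast concrete, here is a hedged sketch that runs the same matrix product both ways inside a TensorFlow 2.x installation, using the `tf.compat.v1` classes for the 1.x style. The tensor names and values are illustrative.

```python
import tensorflow as tf

data = [[1.0, 2.0], [3.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

# TF 1.x style: construct a graph first, then execute it in a session.
g = tf.Graph()
with g.as_default():
    a = tf.compat.v1.placeholder(tf.float32, shape=(2, 2), name="a")
    product = tf.matmul(a, tf.constant(identity), name="product")
    # `product` is only a symbolic handle here; nothing has been computed.

with tf.compat.v1.Session(graph=g) as sess:
    graph_result = sess.run(product, feed_dict={a: data})

# TF 2.x style: the same call executes immediately on concrete tensors.
eager_result = tf.matmul(tf.constant(data), tf.constant(identity))

print(graph_result)
print(eager_result.numpy())
```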

Graphs did not disappear in TensorFlow 2. They were instead pushed under a single new abstraction: `tf.function`. When a Python function is decorated with `@tf.function`, TensorFlow runs the function once with tracing turned on, records every TensorFlow op it calls into a fresh `tf.Graph`, and afterwards executes that graph instead of the Python code on subsequent calls. This gives users the interactive feel of eager execution during development and the performance and portability of a static graph in production.[^tf-intro]

The table below contrasts the two execution modes.

| Aspect | TF 1.x graph mode | TF 2.x eager mode | TF 2.x with `tf.function` |
|---|---|---|---|
| Default behavior | Graph construction | Immediate execution | Trace once, run as a graph |
| Interface for inputs | `tf.placeholder` + `feed_dict` | Regular Python arguments | Regular Python arguments |
| Driver | `tf.Session.run()` | Python interpreter | Cached `ConcreteFunction` |
| Control flow | `tf.cond`, `tf.while_loop` | Native Python | AutoGraph rewrites Python into graph ops |
| Debugging | Graph-aware tools, `tf.Print` | `print`, `pdb`, breakpoints | Eager during dev via `tf.config.run_functions_eagerly(True)` |
| Typical use | Production training and serving | Research and prototyping | Production paths in modern code |

## tf.function and AutoGraph

`tf.function` is the modern way to build TensorFlow graphs. It takes an ordinary Python function and turns it into a callable object (a `PolymorphicFunction`) that lazily compiles one or more `ConcreteFunction` objects, each backed by a `tf.Graph`. The first time the function is called with a new combination of input shapes and dtypes, TensorFlow performs a process called **tracing**: it runs the Python body once and records every TensorFlow op it issues. Python side effects still happen during that run, but they are not recorded in the graph and therefore do not repeat on later calls. The resulting graph is then executed natively by the TensorFlow runtime on every subsequent call with the same input signature.[^tf-function]

A simple example illustrates the idea.

```python
import tensorflow as tf

@tf.function
def linear(x, W, b):
    return tf.nn.relu(tf.matmul(x, W) + b)

W = tf.Variable(tf.random.normal((4, 3)))
b = tf.Variable(tf.zeros((3,)))
x = tf.random.normal((2, 4))

y = linear(x, W, b)  # First call traces the function into a graph.
y = linear(x, W, b)  # Second call runs the cached graph directly.
```

During the first call, TensorFlow records a graph that contains a `MatMul` node, an addition node (`AddV2` in current TensorFlow 2.x releases), a `Relu` node, the variable read ops for `W` and `b`, and the implicit input placeholder for `x`. On subsequent calls with tensors of the same shape and dtype, this graph runs directly without re-executing the Python body.

Because tracing happens only when a new input signature is first seen, plain Python `print` statements run once per trace, while `tf.print` runs on every call. This is a common debugging gotcha. Adding a `print("tracing")` line to a `tf.function` is in fact the standard way to detect unwanted retracing.
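The same trick works with any Python side effect. In the sketch below, a hypothetical `trace_log` list stands in for the `print` call so that the number of traces can be counted.

```python
import tensorflow as tf

trace_log = []  # hypothetical helper: one entry per trace

@tf.function
def square(x):
    # Python side effect: runs while tracing, is not part of the graph.
    trace_log.append(x.shape)
    return x * x

square(tf.constant(1.0))
square(tf.constant(2.0))          # same dtype and shape: cached graph is reused
traces_after_scalars = len(trace_log)

square(tf.constant([1.0, 2.0]))   # new shape: triggers a second trace
traces_after_vector = len(trace_log)

print(traces_after_scalars, traces_after_vector)
```

Passing Python scalars instead of tensors would retrace for every distinct value, which is one of the most common causes of unwanted retracing.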

**AutoGraph** is the library inside `tf.function` that allows ordinary Python control flow to participate in the graph. AutoGraph transforms a subset of Python code into graph-compatible TensorFlow ops at trace time. `if` statements that branch on a `tf.Tensor` become `tf.cond` nodes; `while` and `for` loops over tensors become `tf.while_loop` nodes; `break`, `continue`, and `return` get translated into the corresponding loop control signals.[^autograph] You can inspect the rewritten code with `tf.autograph.to_code(my_function)`.

A short example shows AutoGraph in action.

```python
import tensorflow as tf

@tf.function
def relu_like(x):
    if tf.reduce_sum(x) > 0:
        return x
    else:
        return tf.zeros_like(x)
```

AutoGraph rewrites the `if` into a `tf.cond` so that the conditional becomes part of the dataflow graph rather than a Python-side branch. Loops over `tf.range` are similarly rewritten into `tf.while_loop`. Loops over plain Python `range`, by contrast, are unrolled at trace time, which is why mixing the two in performance-critical code can produce surprises.
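The difference between the two loop styles shows up in the traced graphs. This is a sketch under the assumption of default AutoGraph settings; the function names `unrolled` and `looped` are illustrative.

```python
import tensorflow as tf

@tf.function
def unrolled(x):
    for _ in range(3):        # Python range: the body is traced three times
        x = x + 1.0
    return x

@tf.function
def looped(x):
    for _ in tf.range(3):     # tensor range: AutoGraph emits one loop node
        x = x + 1.0
    return x

spec = tf.TensorSpec((), tf.float32)
unrolled_types = [op.type for op in unrolled.get_concrete_function(spec).graph.get_operations()]
looped_types = [op.type for op in looped.get_concrete_function(spec).graph.get_operations()]

print(unrolled_types)
print(looped_types)
print(unrolled(tf.constant(0.0)).numpy(), looped(tf.constant(0.0)).numpy())
```

Both functions compute the same value, but the unrolled graph grows with the iteration count, while the looped graph stays the same size and keeps the addition inside the loop body.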

## Inspecting nodes in code

Because every node is a first-class object, you can walk a graph and print its nodes from Python. Inside a `tf.function`, the graph for a particular concrete signature is reachable through `get_concrete_function`.

```python
import tensorflow as tf

@tf.function
def f(x, y):
    return tf.nn.relu(tf.matmul(x, y))

concrete = f.get_concrete_function(
    tf.TensorSpec((None, 4), tf.float32),
    tf.TensorSpec((4, 3), tf.float32),
)

for op in concrete.graph.get_operations():
    print(op.name, op.type,
          [t.name for t in op.inputs],
          [t.shape for t in op.outputs])
```

For a typical run this prints the input placeholder ops, a `MatMul`, a `Relu`, an `Identity` that exposes the function's return value, and a small number of bookkeeping nodes inserted by the runtime. The `op.name`, `op.type`, `op.inputs`, `op.outputs`, `op.control_inputs`, and `op.device` properties together describe everything the runtime needs to execute that node.

## GraphDef and SavedModel

A `tf.Graph` is the in-memory representation of a TensorFlow graph. To serialize one, TensorFlow uses `GraphDef`, a protocol buffer message defined in `tensorflow/core/framework/graph.proto`. A `GraphDef` is essentially a list of `NodeDef` messages plus a `versions` field that records which set of op semantics the graph was built against.[^tool-dev]

The usual way to obtain a `GraphDef` from Python is `graph.as_graph_def()`. The result can be written to disk in either binary protobuf format (typically with the suffix `.pb`) or text protobuf format (`.pbtxt`), and reloaded later with `tf.graph_util.import_graph_def`. Because protocol buffers are language-neutral, the same `GraphDef` can be consumed from C++, Java, Go, JavaScript, Swift, and several other languages.
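A minimal round-trip sketch of that workflow, kept in memory rather than written to disk. The function `f` and the node name `activation` are illustrative, and the text rendering relies on the `google.protobuf` package that TensorFlow installs as a dependency.

```python
import tensorflow as tf
from google.protobuf import text_format

@tf.function
def f(x):
    return tf.nn.relu(x, name="activation")

concrete = f.get_concrete_function(tf.TensorSpec((None,), tf.float32))
graph_def = concrete.graph.as_graph_def()

binary = graph_def.SerializeToString()            # what a .pb file contains
text = text_format.MessageToString(graph_def)     # what a .pbtxt file contains

# Parse the bytes back into a fresh GraphDef and import it into a new graph.
restored = tf.compat.v1.GraphDef()
restored.ParseFromString(binary)
with tf.Graph().as_default() as imported:
    tf.graph_util.import_graph_def(restored, name="")

imported_types = [op.type for op in imported.get_operations()]
print(imported_types)
print(text[:200])
```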

On top of `GraphDef`, TensorFlow defines several richer formats for full models:

| Format | Container | Purpose |
|---|---|---|
| `GraphDef` | `graph.proto` | Bare list of nodes plus op versions. No variable values. |
| `MetaGraphDef` | `meta_graph.proto` | A `GraphDef` plus signatures, collections, and variable metadata. |
| `SavedModel` | Directory with `saved_model.pb` and `variables/` subdirectory | One or more `MetaGraphDef`s plus checkpoint files. The recommended format for serving and deployment. |
| Frozen graph | Single `.pb` file | A `GraphDef` where variables have been replaced by `Const` nodes containing the trained values. Used for legacy mobile and TF Lite pipelines. |
| Checkpoint | `.ckpt`, `.index`, `.data-*` | Variable values only, no graph structure. Used during training. |

A `SavedModel` is the canonical way to ship a TensorFlow model for inference. It bundles the graph (or several graphs, one per `tag_set`), the values of every variable in a separate `variables/` directory, and a list of named signatures that map input and output tensor names to friendly keys. [TensorFlow Serving](/wiki/tensorflow_serving) loads a `SavedModel` directly, and the converters for [TensorFlow Lite](/wiki/tensorflow_lite) and [TensorFlow.js](/wiki/tensorflow_js) accept it as input, which is why TF 2.x users almost never have to touch `GraphDef` by hand.
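The following sketch saves and reloads a very small module to show the pieces described above. The `Scale` class is a hypothetical example, and the export goes to a temporary directory.

```python
import os
import tempfile
import tensorflow as tf

class Scale(tf.Module):
    """Hypothetical one-variable model: multiplies its input by `w`."""

    def __init__(self):
        super().__init__()
        self.w = tf.Variable(2.0)

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return self.w * x

module = Scale()
export_dir = tempfile.mkdtemp()
tf.saved_model.save(
    module, export_dir,
    signatures=module.__call__.get_concrete_function(),
)

contents = sorted(os.listdir(export_dir))    # graph file plus variables/ directory
loaded = tf.saved_model.load(export_dir)
signature_names = list(loaded.signatures.keys())
result = loaded(tf.constant([1.0, 2.0]))

print(contents)
print(signature_names, result.numpy())
```

Passing a single concrete function as `signatures` registers it under the default serving key, which is the entry point that serving tools look for.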

## TensorBoard graph visualization

TensorBoard is the standard tool for inspecting TensorFlow graphs visually. Its **Graphs** dashboard reads the `GraphDef` recorded alongside training logs and renders it as an interactive diagram with zoom, pan, and click-to-expand support. Op nodes appear as ellipses, namespaces appear as rounded rectangles that group related ops, tensor edges are drawn as solid arrows whose thickness reflects tensor size, and control dependencies are drawn as dashed arrows.[^tb-graphs]

TensorBoard provides two complementary views:

| View | Source | What it shows |
|---|---|---|
| Op-level graph | `GraphDef` written by `tf.summary.trace_on` or `tf.summary.graph` | Every `tf.Operation` in the graph, including bookkeeping ops added by the optimizer or the input pipeline. |
| Conceptual graph | Keras model summary | The high-level layer structure of a `tf.keras.Model`, with one node per layer rather than one node per op. |

For a `tf.function`, the typical recipe is to bracket the call with the summary trace API:

```python
from datetime import datetime
import tensorflow as tf

@tf.function
def step(x, y):
    return tf.nn.relu(tf.matmul(x, y))

logdir = "logs/func/" + datetime.now().strftime("%Y%m%d-%H%M%S")
writer = tf.summary.create_file_writer(logdir)

tf.summary.trace_on(graph=True, profiler=False)
step(tf.random.uniform((3, 3)), tf.random.uniform((3, 3)))
with writer.as_default():
    tf.summary.trace_export(name="step", step=0)
```

For Keras models, passing `tf.keras.callbacks.TensorBoard(log_dir=logdir)` to `model.fit` records the graph automatically. Once the logs are written, running `tensorboard --logdir logs` opens the dashboard and the **Graphs** tab shows the rendered model. Beyond visualization, the dashboard supports searching for a node by name, highlighting all upstream nodes that influence a selected node ("trace inputs"), and color-coding nodes by device placement, structure, or TPU compatibility.

## Graph optimization and Grappler

A raw graph produced by tracing is rarely what actually runs on the hardware. Before execution, TensorFlow passes the graph through **Grappler**, the default graph optimizer. Grappler is a meta-optimizer that applies a sequence of rewrite passes to a `tf.Graph`, simplifying it and improving its memory and runtime characteristics. Grappler runs automatically whenever a `tf.function` is executed, and most users never see it directly.[^grappler]

The table below summarizes the most important Grappler passes.

| Optimizer | What it does |
|---|---|
| Pruning | Removes nodes whose outputs are not needed by any fetch or side effect. Runs first to shrink the graph. |
| Constant folding | Replaces subgraphs whose inputs are all constants with a single `Const` node holding the precomputed value. |
| Arithmetic | Removes common subexpressions, simplifies expressions like `x * 1` and `x + 0`, and reorders associative operations. |
| Layout | Switches between `NHWC` and `NCHW` tensor layouts to match the format preferred by the target device, especially for [convolution](/wiki/convolution) operations. |
| Remapper | Fuses common patterns, for example `Conv2D + BiasAdd + Relu`, into a single optimized kernel. |
| Memory | Reduces peak memory usage, including swapping tensors between GPU and host memory when needed. |
| Dependency | Removes redundant control dependencies and no-op nodes. |
| Function | Inlines small `tf.function` calls to expose more optimization opportunities. |
| Loop | Hoists loop-invariant work out of `tf.while_loop` bodies. |
| Auto mixed precision | Casts compatible operations to `float16` or `bfloat16` on supporting hardware. |
| Debug stripper | Strips debugging nodes such as `tf.debugging.Assert`, `tf.debugging.check_numerics`, and `tf.print` from the graph. Turned off by default. |

Users who want to tune Grappler can call `tf.config.optimizer.set_experimental_options(...)` to toggle individual passes. For example, disabling `constant_folding` is a common debugging step when investigating numerical mismatches between eager and graph execution.
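A short sketch of that toggle, assuming the option key `constant_folding` named above; the function `add_constants` is illustrative. The setting is global, so the example switches it back on at the end.

```python
import tensorflow as tf

# Turn the constant folding pass off and confirm the setting took effect.
tf.config.optimizer.set_experimental_options({"constant_folding": False})
active = tf.config.optimizer.get_experimental_options()

@tf.function
def add_constants():
    # Both operands are constants, so this is a candidate for folding.
    return tf.constant(2.0) + tf.constant(3.0)

unfolded_result = add_constants()

# Re-enable the pass for the rest of the program.
tf.config.optimizer.set_experimental_options({"constant_folding": True})

print(active)
print(unfolded_result.numpy())
```

The numerical result should not depend on the setting; if it does, that difference is the mismatch worth investigating.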

## XLA compilation

For maximum performance, a TensorFlow graph can be compiled all the way down to native machine code by **XLA**, the Accelerated Linear Algebra compiler. XLA replaces TensorFlow's default per-op execution with a fused, hardware-specific binary. It targets CPUs, GPUs, and TPUs from a common intermediate representation called HLO (High Level Optimizer IR), which represents the program as functional, statically typed linear algebra operations such as `Convolution`, `Dot`, `Reduce`, and `SelectAndScatter`.[^xla]

The usual way to opt in is through `tf.function`:

```python
import tensorflow as tf

@tf.function(jit_compile=True)
def matmul_relu(x, W, b):
    return tf.nn.relu(tf.matmul(x, W) + b)
```

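With `jit_compile=True`, TensorFlow compiles the entire traced function with XLA and raises an error if the function contains an op that XLA cannot compile. A separate mode, auto-clustering (enabled for example with `tf.config.optimizer.set_jit`), instead identifies contiguous subgraphs (clusters) whose ops are all supported by the XLA backend and replaces each cluster with nodes that compile and run it (`_XlaCompile` and `_XlaRun`). In either mode, multiple ops inside the compiled binary are typically fused into a single GPU kernel or CPU loop, which reduces memory bandwidth use and removes per-op dispatch overhead.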

XLA is the default compiler on Cloud TPUs, where every TensorFlow program is JIT-compiled before execution. On CPU and GPU, XLA is opt-in because some ops are still unsupported and because the compilation itself takes time on the first call. The compiler has since been spun out into the [OpenXLA](/wiki/openxla) project and is shared with [JAX](/wiki/jax) and [PyTorch](/wiki/pytorch_xla)/XLA, which means the same HLO-level optimizations now back several major frameworks.

## Comparison with PyTorch

PyTorch and TensorFlow took opposite stances on graphs at the start. PyTorch, released in 2016, embraced eager execution as its only mode and built its `autograd` engine on a dynamic, define-by-run graph that is built fresh on every forward pass and torn down after the corresponding backward pass. TensorFlow 1.x, by contrast, required users to construct a static graph up front. Both frameworks have since converged toward a hybrid model in which eager execution is the default and a compiler turns hot Python code into a graph for performance. The table below summarizes the current state.

| Aspect | TensorFlow 1.x | TensorFlow 2.x (eager + `tf.function`) | PyTorch (eager) | PyTorch 2.x with `torch.compile` |
|---|---|---|---|---|
| Default mode | Static graph | Eager | Eager | Eager |
| Graph construction | Explicit, ahead of time | Traced lazily by `tf.function` | Built on the fly during the forward pass | Captured by [TorchDynamo](/wiki/torchdynamo) at bytecode level |
| Intermediate representation | `GraphDef` of `NodeDef`s | `tf.Graph` plus optional XLA HLO | `torch.fx.Graph` (when using `torch.fx`); ATen-level autograd graph internally | FX graph, then TorchInductor IR or other backends |
| Backward pass | Built by adding gradient nodes to the graph | Same, computed by `tf.GradientTape` | Built dynamically as the forward runs | Captured together with the forward graph |
| Control flow | `tf.cond`, `tf.while_loop` | AutoGraph rewrites Python into graph ops | Native Python | Captured by Dynamo; complex flow falls back to eager |
| Compiler | Grappler, XLA via `jit_compile` | Grappler, XLA | None by default | Inductor, NVFuser, OpenXLA backends |
| Deployment | `SavedModel`, frozen graph | `SavedModel`, TF Lite, TF.js | TorchScript, ONNX | AOT-compiled artifact (experimental) |
| Python at runtime | Not required | Not required inside a `tf.function` | Required | Optional inside compiled regions |

The upshot is that the *node* concept exists in both frameworks. In TensorFlow it is a `tf.Operation` recorded in a `tf.Graph`. In PyTorch it is a `torch.fx.Node` in an `fx.Graph`, or a `Function` instance in the autograd tape, depending on which path you take. The two frameworks differ less in graph semantics than they did five years ago. They mostly differ in when the graph is built, how visible it is to the user, and which compilation toolchain ingests it.

## Worked example: building and inspecting a small graph

The following self-contained snippet shows how to build a tiny TensorFlow graph, list its nodes, and write it to disk as a `GraphDef`. It works in TensorFlow 2.x.

```python
import tensorflow as tf

@tf.function
def tiny_model(x):
    W = tf.constant([[2.0], [3.0]])
    b = tf.constant([1.0])
    return tf.nn.relu(tf.matmul(x, W) + b)

# Trigger tracing and obtain the concrete function.
concrete = tiny_model.get_concrete_function(
    tf.TensorSpec((None, 2), tf.float32)
)

# 1. Walk the nodes.
for op in concrete.graph.get_operations():
    print(f"{op.name:30s}  type={op.type:10s}  inputs={len(op.inputs)}  outputs={len(op.outputs)}")

# 2. Serialize the graph to GraphDef.
graph_def = concrete.graph.as_graph_def()
with open("tiny_model.pb", "wb") as f:
    f.write(graph_def.SerializeToString())

print(f"Wrote graph with {len(graph_def.node)} NodeDef messages.")
```

The output lists the constant nodes for `W` and `b`, the placeholder for `x`, the `MatMul`, an addition node (`AddV2` in current TensorFlow 2.x releases), the `Relu`, and an `Identity` node that returns the result. Each row corresponds to one entry in the `GraphDef.node` repeated field. The same graph can later be loaded into another TensorFlow program, or visualized in TensorBoard by writing it through `tf.summary.graph`.

## Explain like I'm 5 (ELI5)

Imagine you are playing with building blocks, and each block represents a step in solving a math problem. Some blocks have numbers on them, and others have symbols like "+" or "x" for addition or multiplication. You can arrange these blocks in different ways to create different math problems and solve them step by step.

In machine learning, a TensorFlow graph works like these building blocks. Each block is called a node, and they can represent different things, like math operations or numbers that can change (like the scores in a game). When you connect the blocks (nodes) with lines (edges), you show the order of the steps to solve the problem. The whole setup of blocks and lines creates a flow of information, which helps the computer solve the problem and learn from it.

A helpful way to think about modern TensorFlow is that you usually write your math the way you would write any normal Python program, but if you wrap your code with `@tf.function`, TensorFlow secretly builds a block tower out of it the first time you run it. The next time you call your function, it does not rebuild the tower. It just pours marbles down the existing tower, which is much faster than rolling each marble by hand.

## References

[^tf-intro]: TensorFlow team. "Introduction to graphs and tf.function." TensorFlow Core guides. https://www.tensorflow.org/guide/intro_to_graphs
[^abadi-osdi]: Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D. G., Steiner, B., Tucker, P., Vasudevan, V., Warden, P., Wicke, M., Yu, Y., and Zheng, X. "TensorFlow: A System for Large-Scale Machine Learning." 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16), pp. 265-283, 2016. https://www.usenix.org/system/files/conference/osdi16/osdi16-abadi.pdf
[^tool-dev]: TensorFlow team. "A Tool Developer's Guide to TensorFlow Model Files." https://github.com/tensorflow/docs/blob/master/site/en/r1/guide/extend/model_files.md
[^tf-function]: TensorFlow team. "Better performance with tf.function." TensorFlow Core guides. https://www.tensorflow.org/guide/function
[^autograph]: TensorFlow team. "AutoGraph reference: Control flow." https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/autograph/g3doc/reference/control_flow.md
[^tb-graphs]: TensorFlow team. "Examining the TensorFlow Graph." TensorBoard guides. https://www.tensorflow.org/tensorboard/graphs
[^grappler]: TensorFlow team. "TensorFlow graph optimization with Grappler." TensorFlow Core guides. https://www.tensorflow.org/guide/graph_optimization
[^xla]: OpenXLA project. "XLA: Optimizing Compiler for Machine Learning." https://openxla.org/xla and TensorFlow team, "Use XLA with tf.function." https://www.tensorflow.org/xla/tutorials/jit_compile

Additional sources consulted:

- Abadi, M. et al. "TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems." arXiv:1603.04467, 2016. https://arxiv.org/abs/1603.04467
- TensorFlow team. "tf.Graph." API documentation. https://www.tensorflow.org/api_docs/python/tf/Graph
- TensorFlow team. "tf.Operation." API documentation. https://www.tensorflow.org/api_docs/python/tf/Operation
- TensorFlow team. "Using the SavedModel format." https://www.tensorflow.org/guide/saved_model
- PyTorch team. "Dynamo Overview." PyTorch documentation. https://pytorch.org/docs/stable/torch.compiler_dynamo_overview.html

