# Rank (Tensor)

> Source: https://aiwiki.ai/wiki/rank_tensor
> Updated: 2026-05-11
> Categories: Machine Learning
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

*See also: [Machine learning terms](/wiki/machine_learning_terms)*

## Introduction

In machine learning and deep learning, the **rank of a [tensor](/wiki/tensor)** is the number of dimensions (also called axes) the tensor has. It is the count of indices you need to specify in order to pick out a single scalar element. A [scalar](/wiki/scalar) is rank 0, a [vector](/wiki/vector) is rank 1, a [matrix](/wiki/matrix) is rank 2, and so on. This sense of rank is sometimes called the *order* or *degree* of the tensor, and in practical code it is what [NumPy](/wiki/numpy) exposes as `ndarray.ndim`, what [PyTorch](/wiki/pytorch) exposes as `Tensor.ndim` or `Tensor.dim()`, and what [TensorFlow](/wiki/tensorflow) returns from `tf.rank`.

This definition is the one used almost universally in [neural network](/wiki/neural_network) frameworks and tutorials. It is, however, distinct from a different and older notion of tensor rank from multilinear algebra (the minimum number of rank-1 terms needed to write the tensor as a sum, also known as the [CP rank](/wiki/cp_decomposition)). The two ideas share a name but mean different things, and confusion between them is one of the more common terminology traps in the field.

## What rank means precisely

A tensor is a generalization of scalars, vectors, and matrices. In deep learning libraries, the word *tensor* is essentially a synonym for a multidimensional array of numerical values that lives in CPU or GPU memory, usually as 32 bit or 16 bit floats, sometimes as integers, booleans, or complex numbers.

The rank of a tensor in this dimensional sense is simply the length of its shape tuple. If a tensor has shape `(2, 3, 4)`, then its rank is 3, because the shape has three entries. Each entry in the shape gives the length of one axis, and the product of those lengths is the total number of scalar elements stored.

A useful way to think about it: rank answers the question "how many indices do I need to fully specify one element?" For a matrix `M` you write `M[i, j]`, so the rank is 2. For a 3D tensor `T` you write `T[i, j, k]`, so the rank is 3. For a scalar `s` you need no indices, so the rank is 0.
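As a minimal sketch of the definition above, the NumPy snippet below checks that the rank equals the length of the shape tuple and that one index per axis reaches a single element. The variable names are illustrative only.

```python
import numpy as np

s = np.array(7.0)                # scalar: shape ()
M = np.arange(6).reshape(2, 3)   # matrix: shape (2, 3)
T = np.zeros((2, 3, 4))          # 3D tensor: shape (2, 3, 4)

for name, x in [("s", s), ("M", M), ("T", T)]:
    # ndim is always the length of the shape tuple
    assert x.ndim == len(x.shape)
    print(name, "shape:", x.shape, "rank:", x.ndim, "elements:", x.size)

# One index per axis picks out a single scalar element.
element = M[1, 2]    # two indices for a rank 2 tensor
value = T[1, 2, 3]   # three indices for a rank 3 tensor
```

The `size` attribute printed in the loop is the product of the shape entries, which is 24 for `T`.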

## Rank, axes, and shape

Four related terms appear together often enough to deserve a side by side comparison.

| Term | What it is | Example for shape (32, 64, 128) |
|---|---|---|
| Rank | Number of axes (a single integer) | 3 |
| Axes | The individual dimensions, indexed 0, 1, 2, ... | axis 0, axis 1, axis 2 |
| Shape | Tuple giving the length of each axis | (32, 64, 128) |
| Size | Total scalar elements (product of shape) | 262,144 |

The rank is the *count* of dimensions; the shape is the *list of sizes*. A tensor of shape `(5,)` has rank 1, while a tensor of shape `(5, 1)` has rank 2. The two look similar on paper but they are different objects, and operations that depend on broadcasting or matrix multiplication treat them differently.
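The difference between shapes `(5,)` and `(5, 1)` can be made concrete with a short NumPy sketch. The variable names are illustrative, and other frameworks follow similar broadcasting rules but may differ in detail.

```python
import numpy as np

v = np.arange(5.0)       # shape (5,), rank 1
col = v.reshape(5, 1)    # shape (5, 1), rank 2

# Transposing a rank 1 array is a no-op; transposing the column gives a row.
print(v.T.shape)         # (5,)
print(col.T.shape)       # (1, 5)

# Broadcasting aligns axes from the right, so mixing the two forms
# produces a 5 x 5 grid instead of 5 elementwise results.
print((v + v).shape)     # (5,)
print((v + col).shape)   # (5, 5)

# Matrix multiplication also depends on rank.
inner = v @ v            # rank 0 result (inner product)
outer = col @ col.T      # rank 2 result of shape (5, 5)
```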

## Common ranks in machine learning

Most real workloads use tensors with rank between 0 and 5. Higher ranks appear in research but are less common in everyday training code.

| Rank | Common name | Typical example | Example shape |
|---|---|---|---|
| 0 | Scalar | A loss value, learning rate, prediction probability | `()` |
| 1 | Vector | A feature vector, embedding, or 1D signal | `(768,)` |
| 2 | Matrix | Weight matrix, grayscale image, batch of feature vectors | `(32, 768)` |
| 3 | 3-tensor | An RGB image, or a batch of token embeddings | `(224, 224, 3)` or `(32, 128, 768)` |
| 4 | 4-tensor | A batch of RGB images, a convolutional feature map | `(32, 224, 224, 3)` |
| 5 | 5-tensor | A batch of videos, volumetric medical data | `(8, 30, 224, 224, 3)` |
| 6 | 6-tensor | Batches of video clips with multiple camera views | `(N, V, T, H, W, C)` |

### Rank 0 through rank 3

A scalar is still a tensor object in modern frameworks, just one with an empty shape: `tf.constant(4)` produces a tensor with shape `()` and rank 0. A vector with `n` entries has shape `(n,)`, which is how word embeddings in [language models](/wiki/language_model), bias terms, and the output of a single softmax classifier are stored. Dense layer weights are rank 2 tensors, stored as `(in_features, out_features)` in Keras and as `(out_features, in_features)` in PyTorch's `nn.Linear`. Grayscale images and tabular datasets are also rank 2. A single RGB image jumps to rank 3 with shape `(H, W, 3)` (channels last, the TensorFlow default) or `(3, H, W)` (channels first, the PyTorch default). Sequence models such as [Transformers](/wiki/transformer) take rank 3 inputs of shape `(batch, sequence_length, embedding_dim)`.

### Rank 4 and higher

For image classification, the workhorse input is a rank 4 tensor of shape `(N, H, W, C)` or `(N, C, H, W)`. A 2D convolutional layer expects an input of this rank, and its weights are also a rank 4 tensor: `(out_channels, in_channels, kernel_h, kernel_w)` in PyTorch, or `(kernel_h, kernel_w, in_channels, out_channels)` in TensorFlow. Video models add a temporal axis to reach rank 5 with shape `(N, T, H, W, C)`. 3D convolutions and volumetric medical imaging models that process MRI or CT volumes also operate on rank 5 tensors. Attention based architectures produce rank 4 intermediates when they reshape into multi head form, and briefly reach rank 5 or rank 6 when they also split out fused projections or divide a sequence axis into blocks.
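To illustrate how a reshape can temporarily raise the rank, the sketch below splits a hypothetical fused query/key/value projection into heads. The sizes, the variable names, and the fused layout are assumptions chosen for illustration and are not taken from any particular model.

```python
import numpy as np

batch, seq_len, num_heads, head_dim = 2, 6, 4, 8   # illustrative sizes
embed_dim = num_heads * head_dim

# Hypothetical fused projection output: q, k, v concatenated on the last axis.
qkv = np.zeros((batch, seq_len, 3 * embed_dim))    # rank 3

# Splitting the last axis into (3, heads, head_dim) yields a rank 5 intermediate.
qkv5 = qkv.reshape(batch, seq_len, 3, num_heads, head_dim)

# Move the q/k/v axis to the front, then unpack into three rank 4 tensors
# of shape (batch, heads, seq_len, head_dim).
q, k, v = qkv5.transpose(2, 0, 3, 1, 4)

# Splitting the sequence axis into blocks of 3 tokens adds one more axis.
q_blocks = q.reshape(batch, num_heads, seq_len // 3, 3, head_dim)

for name, t in [("qkv", qkv), ("qkv5", qkv5), ("q", q), ("q_blocks", q_blocks)]:
    print(name, t.shape, "rank", t.ndim)
```

The number of stored elements never changes in these steps; only the way the axes are grouped does.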

## How frameworks expose rank

Deep learning libraries all provide a quick way to query the rank of a tensor, though names vary.

| Library | Get rank as integer | Get rank as tensor | Get shape |
|---|---|---|---|
| NumPy | `a.ndim` | not applicable | `a.shape` |
| PyTorch | `t.ndim`, `t.dim()`, `t.ndimension()` | not applicable | `t.shape`, `t.size()` |
| TensorFlow | `t.ndim` | `tf.rank(t)` | `t.shape`, `tf.shape(t)` |
| JAX | `a.ndim` | not applicable | `a.shape` |

A short example for a tensor with shape `(3, 2, 4, 5)`:

```python
import numpy as np
import torch
import tensorflow as tf

np.zeros((3, 2, 4, 5)).ndim      # 4
torch.zeros(3, 2, 4, 5).ndim     # 4 (also .dim())
tf.zeros([3, 2, 4, 5]).ndim      # 4
tf.rank(tf.zeros([3, 2, 4, 5]))  # scalar Tensor with value 4
```

There is one small TensorFlow distinction worth noting. `t.ndim` returns a Python integer taken from the static shape (or `None` if the rank is not known statically), while `tf.rank(t)` returns a scalar `Tensor`. The former is fine for control flow in eager mode; the latter is what you want inside a `tf.function` graph when the rank might not be known until runtime.

## Distinction from matrix rank

In [linear algebra](/wiki/linear_algebra), the **rank of a matrix** is the dimension of the vector space spanned by its rows (or equivalently its columns), bounded above by the smaller of the number of rows and columns. A 100 by 100 matrix has matrix rank at most 100, and might have rank 7 if only seven rows are linearly independent.

The tensor rank on this page is a completely different quantity. A matrix has tensor rank 2 simply because it is a 2D array. It can simultaneously have matrix rank 7, 99, or 100 depending on its entries. The matrix `numpy.zeros((100, 100))` has tensor rank 2 and matrix rank 0. The identity `numpy.eye(100)` has tensor rank 2 and matrix rank 100. When a linear algebra textbook says "the rank of `A`," it almost always means linearly independent rows or columns. When a TensorFlow or PyTorch tutorial says "the rank of `A`," it almost always means `ndim`.
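The contrast can be checked numerically. The following sketch uses `numpy.linalg.matrix_rank`, which estimates matrix rank from singular values; the random low rank matrix is an illustrative construction and not part of the original examples.

```python
import numpy as np

zeros = np.zeros((100, 100))
identity = np.eye(100)

# A product of 100 x 7 and 7 x 100 factors has matrix rank at most 7,
# and exactly 7 for generic random factors.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((100, 7)) @ rng.standard_normal((7, 100))

for name, A in [("zeros", zeros), ("identity", identity), ("low_rank", low_rank)]:
    # ndim is 2 in every case; only the matrix rank depends on the entries.
    print(name, "ndim:", A.ndim, "matrix rank:", np.linalg.matrix_rank(A))
```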

## Distinction from tensor decomposition rank

Multilinear algebra extends matrix rank to tensors in a way that has nothing to do with `ndim`. The **CP rank** (canonical polyadic rank, also called tensor rank in the older mathematical literature) of a tensor is the minimum number of rank-1 tensors that sum to it. Here a rank-1 tensor means a nonzero tensor that can be written as an outer product of vectors, for example `u \otimes v \otimes w` for a tensor with three axes; it does not mean a tensor with `ndim` equal to 1.
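A rank-1 tensor in this sense is easy to build by hand. The sketch below forms an outer product of three vectors with `numpy.einsum`; the vectors are arbitrary illustrative values.

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([1.0, 0.0, -1.0])
w = np.array([2.0, 1.0, 0.0, 1.0])

# Outer product of u, v, w: entry [i, j, k] equals u[i] * v[j] * w[k].
rank_one = np.einsum("i,j,k->ijk", u, v, w)   # shape (2, 3, 4)

# A sum of two outer products has CP rank at most 2, while ndim stays 3.
a, b, c = np.ones(2), np.ones(3), np.ones(4)
two_terms = rank_one + np.einsum("i,j,k->ijk", a, b, c)

print(rank_one.shape, rank_one.ndim)     # (2, 3, 4) 3
print(two_terms.shape, two_terms.ndim)   # (2, 3, 4) 3
```

The code only constructs tensors with a known upper bound on CP rank. It does not compute the CP rank of an arbitrary tensor.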

A related idea is **Tucker rank** (also called n-rank or multilinear rank), which is a tuple rather than a single integer. The n-th entry of the Tucker rank is the matrix rank of the n-th mode unfolding of the tensor (flattening all axes except the n-th into a long matrix). Tucker decomposition gives per axis control over approximation quality, while CP uses a single scalar.
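Unlike CP rank, the Tucker rank can be computed directly from the definition above. The helpers `unfold` and `multilinear_rank` below are hypothetical names written for this sketch. The column ordering of the unfolding may differ from the convention in some references, which does not affect the matrix rank.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def multilinear_rank(tensor):
    """Tuple of matrix ranks of every mode unfolding (the Tucker rank)."""
    return tuple(int(np.linalg.matrix_rank(unfold(tensor, n)))
                 for n in range(tensor.ndim))

# A single outer product has multilinear rank (1, 1, 1).
u, v, w = np.arange(1.0, 4.0), np.arange(1.0, 5.0), np.arange(1.0, 6.0)
rank_one = np.einsum("i,j,k->ijk", u, v, w)   # shape (3, 4, 5)
print(multilinear_rank(rank_one))             # (1, 1, 1)

# A generic random tensor of the same shape has full multilinear rank.
rng = np.random.default_rng(0)
dense = rng.standard_normal((3, 4, 5))
print(multilinear_rank(dense))                # (3, 4, 5)
```

Both tensors have `ndim` equal to 3, yet their Tucker ranks differ, which is the distinction this section draws.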

These are not the same idea as `ndim`. Computing the CP rank of a general tensor is **NP-hard**, while `ndim` is just the length of the shape tuple. The CP rank also depends on the underlying field: a real tensor can have a strictly larger CP rank over the reals than over the complex numbers, which never happens with matrix rank. For matrices, CP rank coincides with ordinary matrix rank, but for tensors of order 3 or higher the CP rank and the border rank can differ, a phenomenon with no matrix analogue.

In practice, CP and Tucker ranks show up in [tensor decomposition](/wiki/tensor_decomposition) methods used for model compression, recommendation systems, neural network weight factorization, and signal processing. When a paper says it uses a "low rank tensor approximation," it almost always means low CP or Tucker rank, not a low number of axes.

## Common operations that change rank

Many standard tensor operations change the rank by adding or removing axes.

| Operation | Effect on rank | Example |
|---|---|---|
| `reshape` | Rank can change to any compatible value | `(24,) -> (2, 3, 4)` goes from rank 1 to rank 3 |
| `squeeze` | Removes axes of length 1 | `(1, 3, 1, 5) -> (3, 5)` goes from rank 4 to rank 2 |
| `unsqueeze` or `expand_dims` | Adds an axis of length 1 | `(3, 5) -> (1, 3, 5)` goes from rank 2 to rank 3 |
| `flatten` | Collapses several axes into one | `(2, 3, 4) -> (24,)` goes from rank 3 to rank 1 |
| Reduction like `sum(axis=k)` | Removes one axis (kept as length 1 with `keepdims=True` in NumPy) | `(2, 3, 4).sum(axis=1) -> (2, 4)` |
| Stacking | Adds one axis | `stack([(3,), (3,), (3,)]) -> (3, 3)` |

Adding a missing batch dimension with `unsqueeze(0)` or `tf.expand_dims(x, axis=0)` is so common that it has become a reflex for many practitioners.
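The rows of the table above can be reproduced in NumPy, as in the sketch below. PyTorch and TensorFlow offer equivalents under slightly different names, such as `unsqueeze` and `tf.expand_dims`.

```python
import numpy as np

x = np.arange(24)   # shape (24,), rank 1

results = {
    "reshape": x.reshape(2, 3, 4),
    "squeeze": np.zeros((1, 3, 1, 5)).squeeze(),
    "expand_dims": np.expand_dims(np.zeros((3, 5)), axis=0),
    "flatten": x.reshape(2, 3, 4).flatten(),
    "sum(axis=1)": x.reshape(2, 3, 4).sum(axis=1),
    "stack": np.stack([np.zeros(3), np.zeros(3), np.zeros(3)]),
}

# Print the resulting shape and rank of each operation.
for name, out in results.items():
    print(f"{name:12s} shape={out.shape} rank={out.ndim}")
```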

## A note on terminology

Several words are used for what this article calls rank, and all of them appear in active code and papers: *rank* (common in TensorFlow), *order* (more common in physics and older mathematics literature), *degree* (occasionally seen in tensor calculus), *number of dimensions* or *ndim* (common in NumPy and PyTorch), and *number of axes* (used in the official TensorFlow tensor guide). When a tensor has shape `(2, 3, 4)`, someone might describe it as "rank 3," "three dimensional," "a 3-tensor," "order 3," or simply as having "three axes." These are all the same statement. To avoid confusion with matrix rank, some authors prefer the word *order* when they want to be unambiguous, especially in papers that also discuss CP or Tucker rank.

## Practical tips for debugging rank

A mismatch between expected and actual rank is one of the most common causes of beginner errors in deep learning. A few habits help. Print the rank and shape of every input tensor when a layer raises a shape mismatch error, since most framework error messages mention the expected rank in their text. Add explicit `assert tensor.ndim == k` checks in custom training loops, especially around the boundary between dataloader output and model input. Consider `einops` style notation such as `rearrange(x, 'b h w c -> b c h w')` for transposes and reshapes, since the pattern strings document the rank and meaning of each axis in code. Remember that `(N, 1)` and `(N,)` hold the same number of elements and are easy to confuse, but they have different ranks. Operations like binary cross entropy and softmax are sensitive to this difference, because broadcasting one against the other can silently produce an `(N, N)` result.
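The sketch below shows one way to act on these tips in NumPy. The helper `expect_rank` is a hypothetical name introduced here, and the mean squared error stands in for any elementwise loss.

```python
import numpy as np

def expect_rank(x, rank, name="tensor"):
    """Hypothetical helper: fail fast with a readable message on a rank mismatch."""
    if x.ndim != rank:
        raise ValueError(
            f"{name}: expected rank {rank}, got rank {x.ndim} with shape {x.shape}"
        )
    return x

targets = np.array([0.0, 1.0, 2.0, 3.0])   # shape (4,)
preds = targets.reshape(4, 1)              # shape (4, 1), same values

# Predictions equal the targets, so the error should be 0.
# Broadcasting (4, 1) against (4,) builds a (4, 4) grid instead.
wrong_mse = np.mean((preds - targets) ** 2)

# Checking the rank first, then dropping the trailing axis, gives the intended result.
expect_rank(preds, 2, "preds")
right_mse = np.mean((preds.squeeze(-1) - targets) ** 2)

print(wrong_mse, right_mse)   # 2.5 0.0
```

No exception is raised in the broadcast case, which is why an explicit rank check at the dataloader boundary is useful.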

## Explain like I'm 5 (ELI5)

Imagine boxes you need to organize. One box on its own is like a single number. A row of boxes is like a list. Stack rows into a wall and you get a table. Stack walls together and you have a cube, and you need three pieces of information (row, column, and layer) to point at any one box inside it. The rank of a tensor is just the number of directions you have to follow to find a particular box. In machine learning, tensors help organize huge amounts of information in a way that computers can process quickly during training and inference.

## References

1. [Introduction to Tensors, TensorFlow Core Guide](https://www.tensorflow.org/guide/tensor)
2. [tf.rank, TensorFlow API Documentation](https://www.tensorflow.org/api_docs/python/tf/rank)
3. [numpy.ndarray.ndim, NumPy Documentation](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.ndim.html)
4. [Tensor rank decomposition, Wikipedia](https://en.wikipedia.org/wiki/Tensor_rank_decomposition)
5. [TensorFlow Tensor Ranks, Shapes, and Types (r0.7 docs)](https://chromium.googlesource.com/external/github.com/tensorflow/tensorflow/+/r0.7/tensorflow/g3doc/resources/dims_types.md)
6. [Tensor Rank VS Matrix Rank, Lei Mao's Log Book](https://leimao.github.io/blog/Tensor-Rank-VS-Matrix-Rank/)
7. [Rank, Axes, and Shape Explained, deeplizard](https://deeplizard.com/learn/video/AiyK0idr4uM)
8. Goodfellow, Bengio, and Courville. *Deep Learning*. MIT Press, 2016. [Online edition](https://www.deeplearningbook.org/)
9. Kolda, T. G., and Bader, B. W. (2009). "Tensor Decompositions and Applications." *SIAM Review* 51(3): 455-500.

