# Rank (Tensor)

> Source: https://aiwiki.ai/wiki/rank_tensor
> Updated: 2026-06-29
> Categories: Machine Learning
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

*See also: [Machine learning terms](/wiki/machine_learning_terms)*

In machine learning and deep learning frameworks, the **rank of a [tensor](/wiki/tensor)** is the number of dimensions (axes) it has: the count of indices you must supply to pick out a single scalar element. A [scalar](/wiki/scalar) is rank 0, a [vector](/wiki/vector) is rank 1, a [matrix](/wiki/matrix) is rank 2, and an array with n axes is rank n. The official TensorFlow guide defines it directly: "Rank: Number of tensor axes. A scalar has rank 0, a vector has rank 1, a matrix is rank 2." [1] In code this is the integer returned by [NumPy](/wiki/numpy) `ndarray.ndim`, by [PyTorch](/wiki/pytorch) `Tensor.ndim` or `Tensor.dim()`, and by [TensorFlow](/wiki/tensorflow) `tf.rank`. [1][2][3]

This framework sense of rank, the number of axes, is different from two other quantities that share the name: the [linear algebra](/wiki/linear_algebra) **rank of a matrix** (the number of linearly independent rows or columns) and the **tensor decomposition rank** ([CP rank](/wiki/cp_decomposition), the minimum number of rank-1 tensors that sum to it). TensorFlow's own API reference states the warning plainly: "The rank of a tensor is not the same as the rank of a matrix." [2] This page covers the framework meaning and explains how it differs from the other two.

## Introduction

In machine learning and deep learning, the **rank of a [tensor](/wiki/tensor)** is the number of dimensions (also called axes) the tensor has. It is the count of indices you need to specify in order to pick out a single scalar element. A scalar is rank 0, a vector is rank 1, a matrix is rank 2, and so on. This sense of rank is sometimes called the *order* or *degree* of the tensor, and in practical code it is what NumPy exposes as `ndarray.ndim`, what PyTorch exposes as `Tensor.ndim` or `Tensor.dim()`, and what TensorFlow returns from `tf.rank`. [1][2][3]

This definition is the one used almost universally in [neural network](/wiki/neural_network) frameworks and tutorials. It is, however, distinct from a different and older notion of tensor rank from multilinear algebra (the minimum number of rank-1 terms needed to write the tensor as a sum, also known as the CP rank). The two ideas share a name but mean different things, and confusion between them is one of the more common terminology traps in the field.

## What is the rank of a tensor?

A tensor is a generalization of scalars, vectors, and matrices. In deep learning libraries, the word *tensor* is essentially a synonym for a multidimensional array of numerical values that lives in CPU or GPU memory, usually as 32-bit or 16-bit floats, sometimes as integers, booleans, or complex numbers.

The rank of a tensor in this dimensional sense is simply the length of its shape tuple. If a tensor has shape `(2, 3, 4)`, then its rank is 3, because the shape has three entries. Each entry in the shape gives the length of one axis, and the product of those lengths is the total number of scalar elements stored. The TensorFlow guide frames the related vocabulary the same way: shape is "the length (number of elements) of each of the axes of a tensor," and size is "the total number of items in the tensor, the product of the shape vector's elements." [1]

A useful way to think about it: rank answers the question "how many indices do I need to fully specify one element?" The TensorFlow reference puts it identically, describing rank as "the number of indices required to uniquely select each element of the tensor." [2] For a matrix `M` you write `M[i, j]`, so the rank is 2. For a 3D tensor `T` you write `T[i, j, k]`, so the rank is 3. For a scalar `s` you need no indices, so the rank is 0. As a concrete check, TensorFlow notes that for a tensor `t` of shape `[2, 2, 3]`, `tf.rank(t)` returns 3. [2]
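The "count the indices" view can be checked directly. The following is a minimal NumPy sketch (the array contents and variable names are illustrative, not taken from the TensorFlow documentation) showing that the rank equals the length of the shape tuple and that supplying one index per axis yields a rank 0 element.

```python
import numpy as np

t = np.arange(12).reshape(2, 2, 3)   # shape (2, 2, 3), same shape as the tf.rank example

rank = t.ndim                        # number of axes
assert rank == len(t.shape) == 3     # rank is the length of the shape tuple
assert t.size == 2 * 2 * 3           # size is the product of the shape entries

element = t[1, 0, 2]                 # three indices select a single scalar
partial = t[1, 0]                    # two indices leave one axis unresolved

print(rank, element.ndim, partial.shape)
```

Supplying fewer indices than the rank returns a lower-rank tensor rather than a scalar, which is why `partial` still has one axis.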

## Rank, axes, and shape

Four related terms appear together often enough to deserve a side by side comparison.

| Term | What it is | Example for shape (32, 64, 128) |
|---|---|---|
| Rank | Number of axes (a single integer) | 3 |
| Axes | The individual dimensions, indexed 0, 1, 2, ... | axis 0, axis 1, axis 2 |
| Shape | Tuple giving the length of each axis | (32, 64, 128) |
| Size | Total scalar elements (product of shape) | 262,144 |

The rank is the *count* of dimensions; the shape is the *list of sizes*. A tensor of shape `(5,)` has rank 1, while a tensor of shape `(5, 1)` has rank 2. The two look similar on paper but they are different objects, and operations that depend on broadcasting or matrix multiplication treat them differently.
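A small NumPy sketch of that difference, using illustrative variable names: the same five values behave differently under addition and matrix multiplication depending on whether they are stored with rank 1 or rank 2.

```python
import numpy as np

v = np.arange(5)        # shape (5,), rank 1
c = v.reshape(5, 1)     # shape (5, 1), rank 2, same five values

same = v + v            # elementwise addition keeps shape (5,)
grid = v + c            # (5,) broadcast against (5, 1) gives (5, 5)

inner = v @ v           # rank 1 @ rank 1 is a rank 0 inner product
as_matrix = c.T @ c     # (1, 5) @ (5, 1) is a rank 2 result of shape (1, 1)

print(same.shape, grid.shape, np.ndim(inner), as_matrix.shape)
```

The values of `inner` and `as_matrix` agree, but their ranks do not, so code that expects a scalar can fail or silently broadcast when handed the `(1, 1)` result.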

## What ranks are common in machine learning?

Most real workloads use tensors with rank between 0 and 5. Higher ranks appear in research but are less common in everyday training code.

| Rank | Common name | Typical example | Example shape |
|---|---|---|---|
| 0 | Scalar | A loss value, learning rate, prediction probability | `()` |
| 1 | Vector | A feature vector, embedding, or 1D signal | `(768,)` |
| 2 | Matrix | Weight matrix, grayscale image, batch of feature vectors | `(32, 768)` |
| 3 | 3-tensor | An RGB image, or a batch of token embeddings | `(224, 224, 3)` or `(32, 128, 768)` |
| 4 | 4-tensor | A batch of RGB images, a convolutional feature map | `(32, 224, 224, 3)` |
| 5 | 5-tensor | A batch of videos, volumetric medical data | `(8, 30, 224, 224, 3)` |
| 6 | 6-tensor | Batches of video clips with multiple camera views | `(N, V, T, H, W, C)` |

### Rank 0 through rank 3

A scalar is still a tensor object in modern frameworks, just one with an empty shape: `tf.constant(4)` produces a tensor with shape `()` and rank 0. A vector with `n` entries has shape `(n,)`, which is how word embeddings in [language models](/wiki/language_model), bias terms, and the output of a single softmax classifier are stored. Dense layer weights are rank 2 tensors, stored as `(in_features, out_features)` in Keras and as `(out_features, in_features)` in PyTorch's `nn.Linear`. Grayscale images and tabular datasets are also rank 2. A single RGB image jumps to rank 3 with shape `(H, W, 3)` (channels last, the TensorFlow default) or `(3, H, W)` (channels first, the PyTorch default). Sequence models such as [Transformers](/wiki/transformer) take rank 3 inputs of shape `(batch, sequence_length, embedding_dim)`.

### Rank 4 and higher

For image classification, the workhorse input is a rank 4 tensor of shape `(N, H, W, C)` or `(N, C, H, W)`. A 2D convolutional layer expects this shape, and its weights are also rank 4: `(out_channels, in_channels, kernel_h, kernel_w)` in PyTorch, or `(kernel_h, kernel_w, in_channels, out_channels)` in Keras. Video models add a temporal axis to reach rank 5 with shape `(N, T, H, W, C)`. 3D convolutions and volumetric medical imaging models that process MRI or CT volumes also operate on rank 5 tensors. Attention-based architectures briefly produce higher-rank intermediates: reshaping into multi-head form typically yields rank 4, with shape `(batch, heads, sequence_length, head_dim)`, and additionally splitting a sequence axis into blocks pushes this to rank 5 or rank 6.
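As a rough illustration of how attention code raises rank by splitting axes, here is a NumPy sketch. All sizes and the `block` length are hypothetical values chosen for the example, and real frameworks differ in axis order.

```python
import numpy as np

# Illustrative sizes only (assumptions, not from any particular model).
batch, seq_len, embed_dim, num_heads = 2, 8, 16, 4
head_dim = embed_dim // num_heads

x = np.zeros((batch, seq_len, embed_dim))                # rank 3 activations

# Split the embedding axis into (heads, head_dim): rank 3 -> rank 4.
heads = x.reshape(batch, seq_len, num_heads, head_dim)
heads = heads.transpose(0, 2, 1, 3)                      # (batch, heads, seq, head_dim)

# Split the sequence axis into blocks of a hypothetical length: rank 4 -> rank 5.
block = 4
blocks = heads.reshape(batch, num_heads, seq_len // block, block, head_dim)

print(x.ndim, heads.ndim, blocks.ndim)
```

Each split replaces one axis with two, so the rank grows by one per split while the total number of elements stays the same.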

## How do you get a tensor's rank in code?

Deep learning libraries all provide a quick way to query the rank of a tensor, though names vary. NumPy reports it as `ndim`, documented simply as the "Number of array dimensions." [3]

| Library | Get rank as integer | Get rank as tensor | Get shape |
|---|---|---|---|
| NumPy | `a.ndim` | not applicable | `a.shape` |
| PyTorch | `t.ndim`, `t.dim()`, `t.ndimension()` | not applicable | `t.shape`, `t.size()` |
| TensorFlow | `t.ndim` | `tf.rank(t)` | `t.shape`, `tf.shape(t)` |
| JAX | `a.ndim` | not applicable | `a.shape` |

A short example for a tensor with shape `(3, 2, 4, 5)`:

```python
import numpy as np
import tensorflow as tf
import torch

np.zeros((3, 2, 4, 5)).ndim      # 4
torch.zeros(3, 2, 4, 5).ndim     # 4 (also .dim())
tf.zeros([3, 2, 4, 5]).ndim      # 4
tf.rank(tf.zeros([3, 2, 4, 5]))  # scalar Tensor with value 4
```

There is one small TensorFlow distinction worth noting. `t.ndim` returns a Python integer, while `tf.rank(t)` returns a scalar `Tensor` (specifically a 0-D int32 `Tensor`). [2] The former is fine for control flow in eager mode; the latter is what you want inside a `tf.function` graph when the rank might not be known until runtime. The TensorFlow guide makes the same point: the `Tensor.ndim` and `Tensor.shape` attributes "don't return `Tensor` objects. If you need a `Tensor` use the `tf.rank` or `tf.shape` function." [1]

## How is tensor rank different from matrix rank?

In linear algebra, the **rank of a matrix** is the dimension of the vector space spanned by its rows (or equivalently its columns), bounded above by the smaller of the number of rows and columns. A 100 by 100 matrix has matrix rank at most 100, and might have rank 7 if only seven rows are linearly independent.

The tensor rank on this page is a completely different quantity, and TensorFlow flags the clash explicitly: "The rank of a tensor is not the same as the rank of a matrix." [2] A matrix has tensor rank 2 simply because it is a 2D array. It can simultaneously have matrix rank 7, 99, or 100 depending on its entries. The matrix `numpy.zeros((100, 100))` has tensor rank 2 and matrix rank 0. The identity `numpy.eye(100)` has tensor rank 2 and matrix rank 100. When a linear algebra textbook says "the rank of `A`," it almost always means linearly independent rows or columns. When a TensorFlow or PyTorch tutorial says "the rank of `A`," it almost always means `ndim`.
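The contrast is easy to reproduce with NumPy, where `ndim` gives the number of axes and `numpy.linalg.matrix_rank` gives the linear algebra rank. The third matrix, `low`, is an extra illustration that is not in the text above.

```python
import numpy as np

zeros = np.zeros((100, 100))
identity = np.eye(100)
# Extra illustration: every row is a multiple of the same all-ones vector.
low = np.outer(np.arange(1, 101), np.ones(100))

axes = [m.ndim for m in (zeros, identity, low)]
matrix_ranks = [int(np.linalg.matrix_rank(m)) for m in (zeros, identity, low)]

print(axes)          # number of axes for each matrix
print(matrix_ranks)  # linear algebra rank for each matrix
```

All three arrays have the same number of axes, while their matrix ranks span the whole range from 0 to 100.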

## How does it differ from tensor decomposition rank (CP rank)?

Multilinear algebra extends matrix rank to tensors in a way that has nothing to do with `ndim`. The **CP rank** (canonical polyadic rank, also called tensor rank in the older mathematical literature) of a tensor is the minimum number of rank-1 tensors that sum to it. A rank-1 tensor is one that can be written as an outer product of vectors, for example u ⊗ v ⊗ w for a tensor with three axes, whose entries are `u[i] * v[j] * w[k]`. [9]

A related idea is **Tucker rank** (also called n-rank or multilinear rank), which is a tuple rather than a single integer. The n-th entry of the Tucker rank is the matrix rank of the n-th mode unfolding of the tensor (flattening all axes except the n-th into a long matrix). Tucker decomposition gives per axis control over approximation quality, while CP uses a single scalar. [9]
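The sketch below builds a rank-1 tensor as an outer product and computes the multilinear rank from mode unfoldings with NumPy. The helper `multilinear_rank` is a hypothetical name introduced for this example, and its unfolding uses NumPy's default column ordering, which differs from the convention in [9] but does not affect the matrix rank.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rank_one(shape):
    """Outer product of one random vector per axis: T[i, j, k] = u[i] * v[j] * w[k]."""
    u, v, w = (rng.normal(size=n) for n in shape)
    return np.einsum("i,j,k->ijk", u, v, w)

def multilinear_rank(t):
    """Matrix rank of each mode-n unfolding (hypothetical helper for illustration)."""
    ranks = []
    for mode in range(t.ndim):
        # Move the chosen axis to the front and flatten all the others.
        unfolding = np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)
        ranks.append(int(np.linalg.matrix_rank(unfolding)))
    return tuple(ranks)

rank_one = random_rank_one((3, 4, 5))
two_terms = rank_one + random_rank_one((3, 4, 5))   # CP rank at most 2 by construction

print(rank_one.ndim, multilinear_rank(rank_one))
print(two_terms.ndim, multilinear_rank(two_terms))
```

Both tensors have three axes, so `ndim` cannot tell them apart, while the multilinear rank does. Computing the exact CP rank is a different and much harder problem that this sketch does not attempt.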

None of these is the same idea as `ndim`. Computing the CP rank of a general tensor is **NP-hard**: Johan Håstad proved that tensor rank is NP-hard over the rationals and NP-complete over finite fields, and it remains NP-hard over both the reals and the complex numbers. [4] By contrast, `ndim` is just the length of the shape tuple, an O(1) lookup. The CP rank also depends on the underlying field: a real tensor can have a strictly larger CP rank over the reals than over the complex numbers, a phenomenon that never happens with matrix rank. [4] For matrices, CP rank coincides with ordinary matrix rank. For tensors of order 3 or higher, the CP rank can exceed the border rank (the smallest rank attained by tensors arbitrarily close to the given one), a gap that has no matrix analogue.

In practice, CP and Tucker ranks show up in [tensor decomposition](/wiki/tensor_decomposition) methods used for model compression, recommendation systems, neural network weight factorization, and signal processing. When a paper says it uses a "low rank tensor approximation," it almost always means low CP or Tucker rank, not a low number of axes.

## What operations change a tensor's rank?

Many standard tensor operations change the rank by adding or removing axes.

| Operation | Effect on rank | Example |
|---|---|---|
| `reshape` | Rank can change to any compatible value | `(24,) -> (2, 3, 4)` goes from rank 1 to rank 3 |
| `squeeze` | Removes axes of length 1 | `(1, 3, 1, 5) -> (3, 5)` goes from rank 4 to rank 2 |
| `unsqueeze` or `expand_dims` | Adds an axis of length 1 | `(3, 5) -> (1, 3, 5)` goes from rank 2 to rank 3 |
| `flatten` | Collapses several axes into one | `(2, 3, 4) -> (24,)` goes from rank 3 to rank 1 |
| Reduction like `sum(axis=k)` | Removes one axis | `(2, 3, 4).sum(axis=1) -> (2, 4)` goes from rank 3 to rank 2 |
| Stacking | Adds one axis | `stack([(3,), (3,), (3,)]) -> (3, 3)` goes from rank 1 to rank 2 |

Adding a missing batch dimension with `unsqueeze(0)` or `tf.expand_dims(x, axis=0)` is so common that it has become a reflex for many practitioners.
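The rows of the table can be reproduced in NumPy as follows. This is a sketch using NumPy names only: PyTorch's `unsqueeze` corresponds to `np.expand_dims` here, and `keepdims=True` is shown as an aside because it keeps the reduced axis with length 1.

```python
import numpy as np

t = np.zeros((2, 3, 4))

shapes = {
    "reshape": np.zeros(24).reshape(2, 3, 4).shape,          # rank 1 -> rank 3
    "squeeze": np.squeeze(np.zeros((1, 3, 1, 5))).shape,     # rank 4 -> rank 2
    "expand_dims": np.expand_dims(np.zeros((3, 5)), axis=0).shape,  # rank 2 -> rank 3
    "flatten": t.reshape(-1).shape,                          # rank 3 -> rank 1
    "sum": t.sum(axis=1).shape,                              # rank 3 -> rank 2
    "sum_keepdims": t.sum(axis=1, keepdims=True).shape,      # rank stays 3
    "stack": np.stack([np.zeros(3)] * 3).shape,              # rank 1 -> rank 2
}

for name, shape in shapes.items():
    print(f"{name}: shape {shape}, rank {len(shape)}")
```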

## Why are there so many words for the same thing?

Several words are used for what this article calls rank, and all of them appear in active code and papers: *rank* (common in TensorFlow), *order* (more common in physics and older mathematics literature), *degree* (occasionally seen in tensor calculus), *number of dimensions* or *ndim* (common in NumPy and PyTorch), and *number of axes* (used in the official TensorFlow tensor guide). TensorFlow's API reference confirms the overlap directly, noting that rank "is also known as 'order', 'degree', or 'ndims.'" [2] When a tensor has shape `(2, 3, 4)`, someone might describe it as "rank 3," "three dimensional," "a 3-tensor," "order 3," or simply as having "three axes." These are all the same statement. To avoid confusion with matrix rank, some authors prefer the word *order* when they want to be unambiguous, especially in papers that also discuss CP or Tucker rank.

## How do you debug a rank mismatch?

A mismatch between expected and actual rank is one of the most common causes of beginner errors in deep learning. A few habits help. Print the rank and shape of every input tensor when a layer raises a shape mismatch error, since most framework error messages mention the expected rank in their text. Add explicit `assert tensor.ndim == k` checks in custom training loops, especially around the boundary between dataloader output and model input. Consider `einops`-style notation such as `rearrange(x, 'b h w c -> b c h w')` for transposes and reshapes, since the pattern strings document the rank and meaning of each axis in code. Remember that `(N, 1)` and `(N,)` hold the same values and look similar at a glance but have different ranks, and operations like binary cross entropy and softmax are sensitive to this difference.
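One way to apply the explicit-check habit is a small helper that fails loudly with the offending shape. The sketch below uses NumPy, and `check_rank` is a hypothetical helper name, not part of any framework. It also shows the `(N, 1)` versus `(N,)` trap, where subtraction broadcasts instead of raising an error.

```python
import numpy as np

def check_rank(tensor, expected_rank, name="tensor"):
    """Return `tensor` unchanged, or raise if it does not have `expected_rank` axes."""
    if tensor.ndim != expected_rank:
        raise ValueError(
            f"{name}: expected rank {expected_rank}, "
            f"got rank {tensor.ndim} with shape {tensor.shape}"
        )
    return tensor

predictions = np.zeros((32, 1))   # model output with a trailing axis of length 1
targets = np.ones((32,))          # labels without that axis

# Without a check, the subtraction silently broadcasts to a (32, 32) array.
silent = predictions - targets

try:
    check_rank(predictions, 1, name="predictions")
    message = ""
except ValueError as err:
    message = str(err)

# Dropping the extra axis makes the ranks agree, and the result is elementwise.
fixed = check_rank(predictions.squeeze(axis=1), 1, name="predictions") - targets

print(silent.shape, fixed.shape)
print(message)
```

Raising a `ValueError` rather than using a bare `assert` keeps the check active when Python runs with assertions disabled, at the cost of a few extra lines.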

## Explain like I'm 5 (ELI5)

Imagine boxes you need to organize. One box on its own is like a single number. A row of boxes is like a list. Stack rows into a wall and you get a table. Stack walls together and you have a cube, and you need three pieces of information (row, column, and layer) to point at any one box inside it. The rank of a tensor is just the number of directions you have to follow to find a particular box. In machine learning, tensors help organize huge amounts of information in a way that computers can process quickly during training and inference.

## References

1. [Introduction to Tensors, TensorFlow Core Guide](https://www.tensorflow.org/guide/tensor)
2. [tf.rank, TensorFlow API Documentation](https://www.tensorflow.org/api_docs/python/tf/rank)
3. [numpy.ndarray.ndim, NumPy Documentation](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.ndim.html)
4. [Tensor rank decomposition, Wikipedia](https://en.wikipedia.org/wiki/Tensor_rank_decomposition)
5. [TensorFlow Tensor Ranks, Shapes, and Types (r0.7 docs)](https://chromium.googlesource.com/external/github.com/tensorflow/tensorflow/+/r0.7/tensorflow/g3doc/resources/dims_types.md)
6. [Tensor Rank VS Matrix Rank, Lei Mao's Log Book](https://leimao.github.io/blog/Tensor-Rank-VS-Matrix-Rank/)
7. [Rank, Axes, and Shape Explained, deeplizard](https://deeplizard.com/learn/video/AiyK0idr4uM)
8. Goodfellow, Bengio, and Courville. *Deep Learning*. MIT Press, 2016. [Online edition](https://www.deeplearningbook.org/)
9. Kolda, T. G., and Bader, B. W. (2009). "Tensor Decompositions and Applications." *SIAM Review* 51(3): 455-500. [Online](https://epubs.siam.org/doi/10.1137/07070111X)