Rank (Tensor)
Last reviewed
May 11, 2026
Sources
9 citations
Review status
Source-backed
Revision
v2 · 2,178 words
See also: Machine learning terms
In machine learning and deep learning, the rank of a tensor is the number of dimensions (also called axes) the tensor has. It is the count of indices you need to specify in order to pick out a single scalar element. A scalar is rank 0, a vector is rank 1, a matrix is rank 2, and so on. This sense of rank is sometimes called the order or degree of the tensor, and in practical code it is what NumPy exposes as ndarray.ndim, what PyTorch exposes as Tensor.ndim or Tensor.dim(), and what TensorFlow returns from tf.rank.
This definition is the one used almost universally in neural network frameworks and tutorials. It is, however, distinct from a different and older notion of tensor rank from multilinear algebra (the minimum number of rank-1 terms needed to write the tensor as a sum, also known as the CP rank). The two ideas share a name but mean different things, and confusion between them is one of the more common terminology traps in the field.
A tensor is a generalization of scalars, vectors, and matrices. In deep learning libraries, the word tensor is essentially a synonym for a multidimensional array of numerical values that lives in CPU or GPU memory, usually as 32-bit or 16-bit floats, sometimes as integers, booleans, or complex numbers.
The rank of a tensor in this dimensional sense is simply the length of its shape tuple. If a tensor has shape (2, 3, 4), then its rank is 3, because the shape has three entries. Each entry in the shape gives the length of one axis, and the product of those lengths is the total number of scalar elements stored.
A useful way to think about it: rank answers the question "how many indices do I need to fully specify one element?" For a matrix M you write M[i, j], so the rank is 2. For a 3D tensor T you write T[i, j, k], so the rank is 3. For a scalar s you need no indices, so the rank is 0.
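The index-counting view can be checked directly. The following is a minimal NumPy sketch; the array contents and variable names are illustrative assumptions, not part of any particular model.

```python
import numpy as np

# A tensor of shape (2, 3, 4): three axes, so rank 3.
t = np.arange(24).reshape(2, 3, 4)

rank = t.ndim                 # number of axes
assert rank == len(t.shape)   # rank is the length of the shape tuple

# Three indices are needed to reach a single scalar element.
element = t[1, 2, 3]          # 1*12 + 2*4 + 3 = 23, the last element

# A scalar has an empty shape and needs no indices at all.
s = np.array(7.0)
print(rank, element, s.shape, s.ndim)
```

The product of the shape entries (2 * 3 * 4 = 24) matches `t.size`, the total number of stored elements.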
Four related terms appear together often enough to deserve a side by side comparison.
| Term | What it is | Example for shape (32, 64, 128) |
|---|---|---|
| Rank | Number of axes (a single integer) | 3 |
| Axes | The individual dimensions, indexed 0, 1, 2, ... | axis 0, axis 1, axis 2 |
| Shape | Tuple giving the length of each axis | (32, 64, 128) |
| Size | Total scalar elements (product of shape) | 262,144 |
The rank is the count of dimensions; the shape is the list of sizes. A tensor of shape (5,) has rank 1, while a tensor of shape (5, 1) has rank 2. The two look similar on paper but they are different objects, and operations that depend on broadcasting or matrix multiplication treat them differently.
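As a rough illustration of why the distinction between (5,) and (5, 1) matters, the NumPy sketch below compares the two under broadcasting and matrix multiplication. The variable names are made up for this example.

```python
import numpy as np

a = np.ones(5)         # shape (5,), rank 1
b = np.ones((5, 1))    # shape (5, 1), rank 2

# Same number of elements, different rank.
assert a.size == b.size == 5
assert (a.ndim, b.ndim) == (1, 2)

# Broadcasting treats them differently: (5, 1) + (5,) expands to (5, 5).
summed = b + a

# Matrix multiplication also differs: (5,) @ (5, 1) contracts to shape (1,).
product = a @ b
print(summed.shape, product.shape)
```

Adding two rank 1 arrays of shape (5,) would instead give shape (5,), which is usually what was intended.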
Most real workloads use tensors with rank between 0 and 5. Higher ranks appear in research but are less common in everyday training code.
| Rank | Common name | Typical example | Example shape |
|---|---|---|---|
| 0 | Scalar | A loss value, learning rate, prediction probability | () |
| 1 | Vector | A feature vector, embedding, or 1D signal | (768,) |
| 2 | Matrix | Weight matrix, grayscale image, batch of feature vectors | (32, 768) |
| 3 | 3-tensor | An RGB image, or a batch of token embeddings | (224, 224, 3) or (32, 128, 768) |
| 4 | 4-tensor | A batch of RGB images, a convolutional feature map | (32, 224, 224, 3) |
| 5 | 5-tensor | A batch of videos, volumetric medical data | (8, 30, 224, 224, 3) |
| 6 | 6-tensor | Batches of video clips with multiple camera views | (N, V, T, H, W, C) |
A scalar is still a tensor object in modern frameworks, just one with an empty shape: tf.constant(4) produces a tensor with shape () and rank 0. A vector with n entries has shape (n,), which is how word embeddings in language models, bias terms, and the output of a single softmax classifier are stored. Dense layer weights are rank 2 tensors, stored as (in_features, out_features) in Keras and as (out_features, in_features) in PyTorch's nn.Linear. Grayscale images and tabular datasets are also rank 2. A single RGB image jumps to rank 3 with shape (H, W, 3) (channels last, the TensorFlow default) or (3, H, W) (channels first, the PyTorch default). Sequence models such as Transformers take rank 3 inputs of shape (batch, sequence_length, embedding_dim).
For image classification, the workhorse input is a rank 4 tensor of shape (N, H, W, C) or (N, C, H, W). A 2D convolutional layer expects this shape, and its weights are also rank 4 tensors: PyTorch stores them as (out_channels, in_channels, kernel_h, kernel_w), while TensorFlow uses (kernel_h, kernel_w, in_channels, out_channels). Video models add a temporal axis to reach rank 5 with shape (N, T, H, W, C). 3D convolutions and volumetric medical imaging models that process MRI or CT volumes also operate on rank 5 tensors. Attention-based architectures briefly produce higher rank intermediates: reshaping into multi-head form gives rank 4 tensors of shape (batch, heads, sequence_length, head_dim), and splitting a sequence axis into blocks on top of that can push intermediates to rank 5 or rank 6.
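The layouts described above can be sketched in NumPy without any deep learning framework. All sizes below are hypothetical values chosen only to keep the example small, and the head-splitting step is one common way to arrange the axes rather than the layout of any specific library.

```python
import numpy as np

# Hypothetical sizes chosen only for illustration.
batch, height, width, channels = 2, 8, 8, 3

# Rank 4 batch of images in channels-last layout (N, H, W, C).
nhwc = np.zeros((batch, height, width, channels), dtype=np.float32)

# Reordering axes to channels-first (N, C, H, W) keeps the rank at 4.
nchw = np.transpose(nhwc, (0, 3, 1, 2))

# Rank 3 sequence input (batch, sequence_length, embedding_dim).
seq_len, embed_dim, num_heads = 5, 16, 4
x = np.zeros((batch, seq_len, embed_dim), dtype=np.float32)

# Splitting the embedding axis into heads adds one axis: rank 3 -> rank 4.
heads = x.reshape(batch, seq_len, num_heads, embed_dim // num_heads)
heads = heads.transpose(0, 2, 1, 3)   # (batch, heads, seq_len, head_dim)
print(nchw.shape, heads.shape)
```

Transposing never changes the rank; only the reshape that splits one axis into two does.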
Deep learning libraries all provide a quick way to query the rank of a tensor, though names vary.
| Library | Get rank as integer | Get rank as tensor | Get shape |
|---|---|---|---|
| NumPy | a.ndim | not applicable | a.shape |
| PyTorch | t.ndim, t.dim(), t.ndimension() | not applicable | t.shape, t.size() |
| TensorFlow | t.ndim | tf.rank(t) | t.shape, tf.shape(t) |
| JAX | a.ndim | not applicable | a.shape |
A short example for a tensor with shape (3, 2, 4, 5):
```python
import numpy as np
import torch
import tensorflow as tf

np.zeros((3, 2, 4, 5)).ndim        # 4
torch.zeros(3, 2, 4, 5).ndim       # 4 (also .dim())
tf.zeros([3, 2, 4, 5]).ndim        # 4
tf.rank(tf.zeros([3, 2, 4, 5]))    # scalar Tensor with value 4
```
There is one small TensorFlow distinction worth noting. t.ndim returns a Python integer, while tf.rank(t) returns a scalar Tensor. The former is fine for control flow in eager mode; the latter is what you want inside a tf.function graph when the rank might not be known until runtime.
In linear algebra, the rank of a matrix is the dimension of the vector space spanned by its rows (or equivalently its columns), bounded above by the smaller of the number of rows and columns. A 100 by 100 matrix has matrix rank at most 100, and might have rank 7 if only seven rows are linearly independent.
The tensor rank on this page is a completely different quantity. A matrix has tensor rank 2 simply because it is a 2D array. It can simultaneously have matrix rank 7, 99, or 100 depending on its entries. The matrix numpy.zeros((100, 100)) has tensor rank 2 and matrix rank 0. The identity numpy.eye(100) has tensor rank 2 and matrix rank 100. When a linear algebra textbook says "the rank of A," it almost always means linearly independent rows or columns. When a TensorFlow or PyTorch tutorial says "the rank of A," it almost always means ndim.
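The contrast can be made concrete with NumPy, where `ndim` gives the tensor rank and `np.linalg.matrix_rank` gives a numerical estimate of the linear algebra rank. The low-rank matrix below is a hypothetical example built from random factors, so its matrix rank of 7 holds with probability 1 rather than by construction.

```python
import numpy as np

zeros = np.zeros((100, 100))
identity = np.eye(100)

# Build a 100 x 100 matrix whose matrix rank is at most 7
# by multiplying a (100, 7) factor with a (7, 100) factor.
rng = np.random.default_rng(0)
low = rng.standard_normal((100, 7)) @ rng.standard_normal((7, 100))

for name, m in [("zeros", zeros), ("identity", identity), ("low", low)]:
    # ndim is the tensor rank; matrix_rank is the linear algebra rank.
    print(name, m.ndim, np.linalg.matrix_rank(m))
```

All three arrays report `ndim == 2`, while their matrix ranks differ.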
Multilinear algebra extends matrix rank to tensors in a way that has nothing to do with ndim. The CP rank (canonical polyadic rank, also called tensor rank in the older mathematical literature) of a tensor is the minimum number of rank-1 tensors that sum to it. A rank-1 tensor is one that can be written as an outer product of vectors, for example $u \otimes v \otimes w$ for a tensor with three axes. The CP rank is the smallest number of such terms needed.
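A rank-1 tensor in this sense can be built with an outer product. The sketch below uses `np.einsum` with arbitrary example vectors; it only constructs tensors with a known upper bound on CP rank and does not attempt to compute the CP rank itself.

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([1.0, 0.0, -1.0])
w = np.array([2.0, 1.0, 0.0, 1.0])

# Outer product of u, v, w: entry [i, j, k] equals u[i] * v[j] * w[k].
rank_one = np.einsum("i,j,k->ijk", u, v, w)

# A sum of two rank-1 terms has CP rank at most 2,
# yet its number of axes is still 3.
u2, v2, w2 = np.ones(2), np.ones(3), np.ones(4)
two_terms = rank_one + np.einsum("i,j,k->ijk", u2, v2, w2)

print(rank_one.shape, rank_one.ndim, two_terms.ndim)
```

Both tensors have three axes, so the number of outer-product terms and the number of axes are independent quantities.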
A related idea is Tucker rank (also called n-rank or multilinear rank), which is a tuple rather than a single integer. The n-th entry of the Tucker rank is the matrix rank of the n-th mode unfolding of the tensor (flattening all axes except the n-th into a long matrix). Tucker decomposition gives per axis control over approximation quality, while CP uses a single scalar.
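The definition through mode unfoldings translates into a few lines of NumPy. The helper names `unfold` and `multilinear_rank` are hypothetical, and the result is a numerical estimate that relies on the default tolerance of `np.linalg.matrix_rank`.

```python
import numpy as np

def unfold(t, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def multilinear_rank(t):
    """Tuple of matrix ranks of every mode unfolding (numerical estimate)."""
    return tuple(int(np.linalg.matrix_rank(unfold(t, n))) for n in range(t.ndim))

# A single outer product has multilinear rank (1, 1, 1).
u, v, w = np.arange(1.0, 3.0), np.arange(1.0, 4.0), np.arange(1.0, 5.0)
rank_one = np.einsum("i,j,k->ijk", u, v, w)
print(multilinear_rank(rank_one))

# A random tensor of shape (2, 3, 4) typically has full ranks (2, 3, 4).
rng = np.random.default_rng(0)
generic = rng.standard_normal((2, 3, 4))
print(multilinear_rank(generic))
```

The column ordering of this unfolding may differ from the convention used in a given paper, but reordering columns does not change the matrix rank.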
These are not the same idea as ndim. Computing the CP rank of a general tensor is NP-hard, while ndim is just the length of the shape tuple. The CP rank also depends on the underlying field: a real tensor can have a strictly larger CP rank over the reals than over the complex numbers, which never happens with matrix rank. For matrices, CP rank coincides with ordinary matrix rank, but for tensors of order 3 or higher the CP rank and the border rank can differ, a phenomenon with no matrix analogue.
In practice, CP and Tucker ranks show up in tensor decomposition methods used for model compression, recommendation systems, neural network weight factorization, and signal processing. When a paper says it uses a "low rank tensor approximation," it almost always means low CP or Tucker rank, not a low number of axes.
Many standard tensor operations change the rank by adding or removing axes.
| Operation | Effect on rank | Example |
|---|---|---|
| reshape | Rank can change to any compatible value | (24,) -> (2, 3, 4) goes from rank 1 to rank 3 |
| squeeze | Removes axes of length 1 | (1, 3, 1, 5) -> (3, 5) goes from rank 4 to rank 2 |
| unsqueeze or expand_dims | Adds an axis of length 1 | (3, 5) -> (1, 3, 5) goes from rank 2 to rank 3 |
| flatten | Collapses several axes into one | (2, 3, 4) -> (24,) goes from rank 3 to rank 1 |
| Reduction like sum(axis=k) | Removes one axis | (2, 3, 4).sum(axis=1) -> (2, 4) |
| Stacking | Adds one axis | stack([(3,), (3,), (3,)]) -> (3, 3) |
Adding a missing batch dimension with unsqueeze(0) or tf.expand_dims(x, axis=0) is so common that it has become a reflex for many practitioners.
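The rank changes listed in the table can be reproduced with NumPy equivalents of each operation. This is a small sketch using zero-filled arrays; PyTorch and TensorFlow offer similarly named functions, but only the NumPy calls are shown here.

```python
import numpy as np

x = np.zeros(24)
assert x.reshape(2, 3, 4).ndim == 3                  # reshape: rank 1 -> 3

y = np.zeros((1, 3, 1, 5))
assert np.squeeze(y).shape == (3, 5)                 # squeeze: rank 4 -> 2

z = np.zeros((3, 5))
batched = np.expand_dims(z, axis=0)                  # add a batch axis
assert batched.shape == (1, 3, 5)                    # rank 2 -> 3

t = np.zeros((2, 3, 4))
assert t.reshape(-1).shape == (24,)                  # flatten: rank 3 -> 1
assert t.sum(axis=1).shape == (2, 4)                 # reduction drops axis 1
assert t.sum(axis=1, keepdims=True).shape == (2, 1, 4)  # keepdims keeps rank 3

rows = [np.zeros(3), np.zeros(3), np.zeros(3)]
stacked = np.stack(rows)                             # stack: rank 1 -> 2
assert stacked.shape == (3, 3)
```

Passing `keepdims=True` to a reduction is a common way to keep the rank unchanged so that the result still broadcasts against the original tensor.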
Several words are used for what this article calls rank, and all of them appear in active code and papers: rank (common in TensorFlow), order (more common in physics and older mathematics literature), degree (occasionally seen in tensor calculus), number of dimensions or ndim (common in NumPy and PyTorch), and number of axes (used in the official TensorFlow tensor guide). When a tensor has shape (2, 3, 4), someone might describe it as "rank 3," "three dimensional," "a 3-tensor," "order 3," or simply as having "three axes." These are all the same statement. To avoid confusion with matrix rank, some authors prefer the word order when they want to be unambiguous, especially in papers that also discuss CP or Tucker rank.
A mismatch between expected and actual rank is one of the most common causes of beginner errors in deep learning. A few habits help. Print the rank and shape of every input tensor when a layer raises a shape mismatch error, since most framework error messages mention the expected rank in their text. Add explicit assert tensor.ndim == k checks in custom training loops, especially around the boundary between dataloader output and model input. Consider einops-style notation such as rearrange(x, 'b h w c -> b c h w') for transposes and reshapes, since the pattern strings document the rank and meaning of each axis in code. Remember that (N, 1) and (N,) hold the same values and are easy to confuse, but they have different ranks, and operations like binary cross entropy and softmax are sensitive to this difference.
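The habits above might look like the following in practice. `check_rank` is a hypothetical helper written for this sketch, not a function from any library, and the arrays stand in for model outputs and labels.

```python
import numpy as np

def check_rank(x, expected, name="tensor"):
    """Hypothetical helper: fail early with a readable message on rank mismatch."""
    if x.ndim != expected:
        raise ValueError(
            f"{name}: expected rank {expected}, got rank {x.ndim} with shape {x.shape}"
        )
    return x

preds = np.zeros((4, 1))   # stand-in for model output of shape (N, 1)
labels = np.ones(4)        # stand-in for labels of shape (N,)

# Silent pitfall: (4, 1) - (4,) broadcasts to (4, 4) instead of raising.
bad_diff = preds - labels

# Making the ranks agree first gives the intended elementwise result.
good_diff = preds.squeeze(-1) - check_rank(labels, 1, "labels")
print(bad_diff.shape, good_diff.shape)
```

Because the broadcasted version raises no error, a loss computed from `bad_diff` would silently average over N * N values instead of N, which is why an explicit rank check at the data boundary is worth the extra line.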
Imagine boxes you need to organize. One box on its own is like a single number. A row of boxes is like a list. Stack rows into a wall and you get a table. Stack walls together and you have a cube, and you need three pieces of information (row, column, and layer) to point at any one box inside it. The rank of a tensor is just the number of directions you have to follow to find a particular box. In machine learning, tensors help organize huge amounts of information in a way that computers can process quickly during training and inference.