TensorFlow.js

TensorFlow.js (often abbreviated TFJS) is an open-source JavaScript library developed by Google for training and running machine-learning models directly in the web browser, in Node.js, and on mobile through React Native. It is a member of the broader TensorFlow ecosystem and provides JavaScript-native APIs that mirror the Python TensorFlow and Keras APIs. Because computation runs on the user's device, TensorFlow.js eliminates server round-trips, keeps user data local, and lowers inference latency for interactive applications such as webcam pose tracking, browser games, and on-device personalisation.

The library was officially announced on March 30, 2018 at the TensorFlow Developer Summit, where Google demonstrated a webcam-controlled PAC-MAN trained entirely in the browser. It is the successor to deeplearn.js, an earlier WebGL-accelerated JavaScript library released in August 2017 by the Google PAIR (People + AI Research) team. TensorFlow.js is distributed under the Apache 2.0 license and is published as the npm package @tensorflow/tfjs, with separate packages for individual backends, the model converter, the layers API, and platform adapters.

Definition and goals

TensorFlow.js is a library for defining tensors, building neural networks, training them with automatic differentiation, and running inference in JavaScript runtimes. Its design goals, set out in the foundational MLSys 2019 paper by Smilkov et al., are: full GPU acceleration via the web stack, parity with the Python TensorFlow API where reasonable, the ability to import models trained in Python or other tools, and a compact bundle suitable for shipping over the web.

The library targets four main runtimes:

The web browser, where models can run on the user's GPU through WebGL or WebGPU, on the CPU through plain JavaScript, or through compiled WebAssembly (WASM) kernels.
Node.js servers, where bindings to the native TensorFlow C library give performance comparable to Python TensorFlow on CPU and CUDA GPUs.
React Native applications, where an expo-gl adapter exposes the device GPU.
Other JavaScript hosts such as Electron desktops and edge runtimes that ship a browser engine.

In each case the developer writes the same TensorFlow.js code; the library selects the fastest available backend at startup.
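As a rough illustration of that fallback behaviour, the sketch below tries a list of backend names in order and keeps the first one that initialises. It assumes tf is the namespace exported by @tensorflow/tfjs and is passed in by the caller. The helper name pickBackend and the priority order are hypothetical, and the 'webgpu' and 'wasm' names only resolve if the corresponding backend packages have been imported.

```javascript
// Hypothetical helper: try backends in priority order, keep the first that works.
// `tf` is the TensorFlow.js namespace, e.g. `import * as tf from '@tensorflow/tfjs'`.
async function pickBackend(tf, candidates = ['webgpu', 'webgl', 'wasm', 'cpu']) {
  for (const name of candidates) {
    try {
      // setBackend resolves to false when the backend fails to initialise
      // and rejects when the name was never registered.
      if (await tf.setBackend(name)) {
        await tf.ready();
        return tf.getBackend();
      }
    } catch (err) {
      // Backend unavailable in this runtime; move on to the next candidate.
    }
  }
  throw new Error('No TensorFlow.js backend could be initialised');
}
```

In a browser this might be called as const backend = await pickBackend(tf) before any model is loaded. Because the library already chooses a default backend, an explicit call like this is only needed when an application wants to control the order.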

History

Deeplearn.js (2017)

The direct predecessor of TensorFlow.js was deeplearn.js, a WebGL-accelerated JavaScript library announced by Google researchers Daniel Smilkov and Nikhil Thorat on August 11, 2017. Smilkov and Thorat were members of Google's PAIR (People + AI Research) initiative, which had previously produced the TensorFlow Playground, an interactive in-browser visualisation of small neural networks. Deeplearn.js generalised the techniques used in the Playground into a full library: tensors stored as WebGL textures, fragment shaders implementing element-wise and matrix operations, and a NumPy-style API with eager execution and automatic differentiation.

Deeplearn.js was aimed at visual demos and education. Early projects built with deeplearn.js included style transfer, image-to-image translation, and an in-browser implementation of SqueezeNet. Although the library was popular, it lacked official ties to TensorFlow's Python ecosystem, and its API diverged from the Keras conventions familiar to most ML practitioners.

Launch as TensorFlow.js (2018)

On March 30, 2018, Google relaunched deeplearn.js as TensorFlow.js at the TensorFlow Developer Summit. The relaunch added a Keras-compatible Layers API on top of the existing low-level Core API, plus a converter that imports TensorFlow SavedModel and Keras HDF5 files, and renamed deeplearn.js itself to TensorFlow.js Core. The summit talk by Smilkov and Thorat featured a live demo of training a PAC-MAN controller from a webcam feed, showing that the browser could now do real backpropagation, not just inference.

The launch also introduced the Emoji Scavenger Hunt mobile demo, which used the device camera to detect real-world objects matching emoji prompts. Both demos ran entirely in the browser, with no server-side ML.

Continued development (2018 to 2024)

After the launch, TensorFlow.js development tracked the broader TensorFlow release cadence and added several major capabilities:

Node.js bindings (2018): the @tensorflow/tfjs-node and @tensorflow/tfjs-node-gpu packages exposed native TensorFlow C++ kernels to JavaScript on servers, with optional CUDA acceleration on NVIDIA GPUs.
React Native adapter (February 2020): the @tensorflow/tfjs-react-native package brought GPU-accelerated TFJS to mobile apps via Expo's expo-gl and expo-gl-cpp.
WebAssembly backend (March 2020): a WASM backend compiled the XNNPACK kernel library to WebAssembly, giving CPU inference a large speedup over plain JavaScript. With SIMD and multithreading added in version 2.3.0 (September 2020), the WASM backend became up to 10x faster than the plain JS backend.
WebGPU preview (2023): Google Chrome shipped WebGPU by default in Chrome 113 on May 2, 2023, and TensorFlow.js released a preview WebGPU backend that uses compute shaders rather than the older fragment-shader trick used by the WebGL backend.
WebGPU stabilisation (2024): the WebGPU backend matured through 2024 alongside Chromium's WebGPU API, with continued kernel coverage and benchmark improvements.
Version 4.22.0 (October 21, 2024): the most recent release of the main @tensorflow/tfjs umbrella package as of this writing. TypeScript is the dominant source language of the project.

In parallel, Google's broader on-device ML stack evolved. The team rebranded TensorFlow Lite as LiteRT in 2024 and introduced LiteRT.js, a Web AI inference path that targets browser deployment of LiteRT (.tflite) models. LiteRT.js can be used alongside TensorFlow.js or as an alternative for browsers that benefit from TFLite's quantisation tooling.

Architecture

TensorFlow.js is structured as a small core library plus optional backend modules and higher-level APIs. The pieces fit together in three layers.

Core API

The Core API (@tensorflow/tfjs-core) provides low-level tensor operations, similar to NumPy with automatic differentiation added on top. Tensors are immutable and typed; ops include tf.matMul, tf.conv2d, tf.add, tf.softmax, and several hundred others. Eager execution is the default, meaning each op runs immediately when called. Automatic differentiation is provided through tf.grad and tf.variableGrads, allowing custom training loops and gradient-based optimisation outside the high-level API.

The Core API also exposes memory management primitives. Because WebGL textures live outside the JavaScript garbage collector, developers wrap operations in tf.tidy() to release intermediate tensors automatically, or call dispose() manually on long-lived tensors.
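The two clean-up styles can be sketched together in a few lines. The example assumes tf is the @tensorflow/tfjs namespace passed in by the caller and that the chained tensor methods from the umbrella package are available; the function name sumOfSquares is hypothetical.

```javascript
// Hypothetical helper showing tf.tidy() and dispose() side by side.
// `tf` is the TensorFlow.js namespace supplied by the caller.
function sumOfSquares(tf, values) {
  // tidy() frees every tensor created inside the callback (the input tensor
  // and the squared tensor) except the one that the callback returns.
  const total = tf.tidy(() => tf.tensor1d(values).square().sum());

  // Copy the scalar back into a plain JavaScript number.
  const result = total.dataSync()[0];

  // The tensor returned from tidy() outlives the scope, so free it by hand.
  total.dispose();
  return result;
}
```

Comparing tf.memory().numTensors before and after a call is one way to check that nothing was left behind.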

Layers API

The Layers API (@tensorflow/tfjs-layers) is a Keras-compatible high-level API for building, training, and saving neural networks. It supports tf.sequential() for stacked-layer models and tf.model() for arbitrary directed graphs, plus model.compile, model.fit, model.evaluate, and model.predict matching Python Keras conventions. Layers include dense, convolutional, recurrent (LSTM, GRU), normalisation, dropout, and embedding variants. Models written with the Layers API can be saved as a model.json topology file plus binary weight shards, then loaded in any TFJS environment.
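A minimal sketch of that workflow is shown below. It assumes tf is the @tensorflow/tfjs namespace passed in by the caller; the function name buildClassifier, the hidden-layer width of 32, and the choice of optimiser and loss are illustrative assumptions rather than recommendations.

```javascript
// Hypothetical helper: a two-layer classifier built with the Layers API.
// `tf` is the TensorFlow.js namespace supplied by the caller.
function buildClassifier(tf, inputSize, numClasses) {
  const model = tf.sequential();

  // The first layer must declare the input shape (without the batch dimension).
  model.add(tf.layers.dense({ inputShape: [inputSize], units: 32, activation: 'relu' }));

  // Softmax output: one probability per class.
  model.add(tf.layers.dense({ units: numClasses, activation: 'softmax' }));

  // Same compile step as in Python Keras: optimiser, loss, and metrics.
  model.compile({
    optimizer: 'adam',
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy'],
  });
  return model;
}
```

Training would then follow the usual pattern, for example await model.fit(xs, ys, { epochs: 10, batchSize: 32 }) with xs of shape [n, inputSize] and one-hot ys of shape [n, numClasses].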

Backends

A backend is a TensorFlow.js plugin that implements the actual numeric kernels. The library can be initialised with a specific backend or it will pick the fastest available one. The table below summarises the main backends and where each is used.

Backend	Runtime	Acceleration	Typical use
WebGL	Browser	GPU via WebGL 2.0 fragment shaders	Default browser backend; best for medium and large CNNs
WebGPU	Browser	GPU via WebGPU compute shaders	Newer browsers; targets best-of-class performance
WASM	Browser, Node.js	CPU via compiled XNNPACK kernels with SIMD and threads	Small or quantised models; fallback when GPU is unavailable
CPU (Plain JS)	Browser, Node.js	Pure JavaScript	Universal fallback; slow but always available
Node.js (CPU)	Node.js	Native TensorFlow C++ kernels	Server-side training and inference
Node.js (GPU)	Node.js	Native TensorFlow C++ + CUDA	Server-side training on NVIDIA GPUs
React Native	Mobile	GPU via `expo-gl` (WebGL)	iOS and Android apps

WebGL backend

The WebGL backend is the original GPU path inherited from deeplearn.js. Tensors are stored as WebGL textures and ops are implemented as fragment shaders that read input textures and write output textures. The TensorFlow.js team has reported the WebGL backend running up to roughly 100x faster than the plain JS CPU backend on common CNNs, with most of the time spent in texture fetches rather than floating-point math. Shader programs are compiled lazily on first use and cached, so warm-up cost is paid once per shape.
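The one-off warm-up cost is easy to observe by timing the first call separately from later ones. The sketch below is library-agnostic: runOnce is a caller-supplied async function (a hypothetical name), and the only assumption is a runtime with a global performance.now(), as in browsers and recent Node.js versions.

```javascript
// Hypothetical helper: separate first-call (warm-up) time from steady-state time.
async function timeWarmupAndSteady(runOnce, iterations = 10) {
  // First call: on WebGL this typically includes shader compilation.
  const t0 = performance.now();
  await runOnce();
  const warmupMs = performance.now() - t0;

  // Later calls reuse the cached shader programs.
  const t1 = performance.now();
  for (let i = 0; i < iterations; i++) {
    await runOnce();
  }
  const steadyMs = (performance.now() - t1) / iterations;

  return { warmupMs, steadyMs };
}
```

With TensorFlow.js, runOnce would normally end by reading the result back, for example const y = model.predict(x); await y.data(); y.dispose();, because GPU work is queued asynchronously and is only guaranteed to have finished once the data has been downloaded.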

WebGPU backend

The WebGPU backend uses the newer WebGPU API, which exposes compute shaders and explicit GPU memory management instead of repurposing the graphics pipeline. The package @tensorflow/tfjs-backend-webgpu requires a browser with WebGPU enabled. Reports from the TensorFlow.js team indicate WebGPU can deliver around 3x faster inference than WebGL for typical ML workloads, with sub-30 ms response times on small and mid-sized models.

WASM backend

The WASM backend (@tensorflow/tfjs-backend-wasm) compiles Google's XNNPACK kernel library to WebAssembly. As of TensorFlow.js 2.3.0, the WASM backend gained SIMD vector instructions and multithreading support, making it roughly 10x faster than the plain JS backend on common models. WASM is on par with or faster than WebGL for small models such as MediaPipe BlazeFace and FaceMesh because it avoids the fixed overhead of compiling shaders. For medium and large CNNs like MobileNet and BodyPix, WebGL is generally 2x to 4x faster than WASM.

CPU backend

The CPU backend is a pure JavaScript implementation, included as a universal fallback when none of WebGL, WebGPU, or WASM is available. It is slow but works in any JavaScript host.

Node.js backend

The Node.js backends bind directly to TensorFlow's C++ runtime through Node native add-ons. The CPU package is @tensorflow/tfjs-node; the GPU package is @tensorflow/tfjs-node-gpu and requires a CUDA-capable NVIDIA GPU and the matching CUDA toolkit. On Node, performance is comparable to Python TensorFlow because both frontends drive the same C++ kernels.

Capabilities

Training in the browser

TensorFlow.js supports full backpropagation in the browser, which is unusual for a JavaScript ML library. This enables several patterns that would otherwise require a backend server:

Transfer learning on a frozen backbone such as MobileNet, with a small classifier head trained on user-supplied images.
Federated learning, where each client trains locally and only sends gradient updates to a server.
On-device personalisation of recommendation models, gesture detectors, or text predictors using the user's own data without uploading it.

The Layers API exposes the standard model.fit workflow, and the Core API exposes tf.grad and tf.variableGrads for custom training loops.
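A custom loop built on tf.variableGrads might look like the sketch below, which fits a straight line y = w*x + b by gradient descent. It assumes tf is the @tensorflow/tfjs namespace passed in by the caller; the function name fitLine and the default step count and learning rate are illustrative and would need tuning for other data.

```javascript
// Hypothetical helper: least-squares line fit with a hand-written training loop.
// `tf` is the TensorFlow.js namespace supplied by the caller.
function fitLine(tf, xs, ys, steps = 200, learningRate = 0.1) {
  // Trainable parameters, both starting at zero.
  const w = tf.variable(tf.scalar(0));
  const b = tf.variable(tf.scalar(0));
  const x = tf.tensor1d(xs);
  const y = tf.tensor1d(ys);
  const optimizer = tf.train.sgd(learningRate);

  for (let i = 0; i < steps; i++) {
    // tidy() releases the loss value and gradient tensors after each step.
    tf.tidy(() => {
      // Mean squared error, differentiated with respect to w and b.
      const { grads } = tf.variableGrads(
        () => x.mul(w).add(b).sub(y).square().mean(),
        [w, b],
      );
      optimizer.applyGradients(grads);
    });
  }

  const result = { w: w.dataSync()[0], b: b.dataSync()[0] };
  tf.dispose([w, b, x, y]);
  optimizer.dispose();
  return result;
}
```

For points that lie on y = 2x + 1, such as fitLine(tf, [0, 1, 2, 3], [1, 3, 5, 7]), the returned values should approach w ≈ 2 and b ≈ 1. The shorter optimizer.minimize(lossFn) call performs the same gradient-and-update step in one go; the explicit form is shown here because it exposes the gradients.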

Inference

Most real applications load a pretrained model and run inference. TensorFlow.js loads models from a model.json file (containing the topology and weight manifest) plus one or more .bin shards (containing the weights). The tf.loadGraphModel function loads converted TensorFlow SavedModels, and tf.loadLayersModel loads converted Keras models or models saved from the TFJS Layers API.
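To make the file layout concrete, the sketch below inspects an already-parsed model.json object and reports which loader applies and which shard files would be fetched. The field names format and weightsManifest[].paths are assumed from the converter's usual output, and older files may omit format; the function name summariseModelJson and the example URL are hypothetical.

```javascript
// Hypothetical helper: describe a parsed model.json without loading any weights.
function summariseModelJson(modelJson, modelUrl) {
  // Assumed values of the `format` field written by the converter.
  const loaders = {
    'graph-model': 'tf.loadGraphModel',
    'layers-model': 'tf.loadLayersModel',
  };

  // Shard paths are stored relative to the location of model.json.
  const shardUrls = (modelJson.weightsManifest || [])
    .flatMap((group) => group.paths)
    .map((path) => new URL(path, modelUrl).href);

  return { loader: loaders[modelJson.format] || 'unknown', shardUrls };
}

const summary = summariseModelJson(
  {
    format: 'graph-model',
    weightsManifest: [{ paths: ['group1-shard1of2.bin', 'group1-shard2of2.bin'], weights: [] }],
  },
  'https://example.com/models/demo/model.json',
);
console.log(summary.loader);       // tf.loadGraphModel
console.log(summary.shardUrls[0]); // https://example.com/models/demo/group1-shard1of2.bin
```

In practice the loaders do this resolution themselves, so application code only passes the model.json URL to tf.loadGraphModel or tf.loadLayersModel.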

Conversion

The tfjs-converter is a Python command-line tool, installed via pip install tensorflowjs, that converts existing models to the TensorFlow.js web format. Supported source formats include:

Source format	TFJS loader
TensorFlow SavedModel	`tf.loadGraphModel`
Keras HDF5 (`.h5`)	`tf.loadLayersModel`
TensorFlow Hub modules	`tf.loadGraphModel`
TFLite (`.tflite`), via TensorFlow Lite (LiteRT) tooling and tf2onnx	varies
PyTorch (via ONNX export plus tf2onnx)	varies

The converter writes a directory containing model.json and sharded weight files, with shards typically capped at 4 MB each so they cache well in browsers.
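The shard count for a given model follows from simple arithmetic. The sketch below assumes a single weight group and the 4 MB cap mentioned above, taken as 4 × 1024 × 1024 bytes; the function name shardCount and the 17,000,000-byte example are hypothetical.

```javascript
// Assumed default shard cap: 4 MB expressed in bytes.
const DEFAULT_SHARD_BYTES = 4 * 1024 * 1024;

// Hypothetical helper: how many shard files a weight blob of this size needs.
function shardCount(totalWeightBytes, shardBytes = DEFAULT_SHARD_BYTES) {
  if (totalWeightBytes <= 0) return 0;
  // Every shard is full except possibly the last one.
  return Math.ceil(totalWeightBytes / shardBytes);
}

console.log(shardCount(17000000)); // 5 (four full 4 MB shards plus a partial one)
```

Smaller shards let the browser fetch and cache files in parallel, at the cost of more HTTP requests on first load.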

Pre-trained models

Google maintains a separate repository, tensorflow/tfjs-models, of pre-trained models published as npm packages under the @tensorflow-models/* scope. The most commonly used models are listed below.

Model	Task	Notes
MoveNet	Single-person pose estimation	17 keypoints; runs at 50+ fps on modern phones; Lightning and Thunder variants
PoseNet	Multi-person pose estimation	Older single- and multi-pose model; superseded by MoveNet for single-pose use
BodyPix	Person and body-part segmentation	24-part segmentation; used for backgrounds and AR effects
HandPose	Hand landmark detection	21 keypoints per hand; predicts handedness
Face Landmarks Detection	Face mesh and attention	Builds on MediaPipe Face Mesh
Face Detection	Face bounding-box detection	Lightweight detector for camera streams
MobileNet	Image classification	ImageNet labels; small enough to fit in a few megabytes
COCO-SSD	Object detection	Single Shot MultiBox Detector trained on COCO
Universal Sentence Encoder	Text embeddings	512-dimensional sentence embeddings for similarity and classification
Speech Commands	Keyword spotting	Recognises a small set of voice commands from microphone audio
KNN Classifier	Nearest-neighbour classifier	Used as a head for transfer learning on top of MobileNet features
Toxicity	Toxic comment classification	Multi-label classifier for offensive language

In addition, Magenta.js (@magenta/music) is a JavaScript port of Google's Magenta research library for music and art generation. It uses TensorFlow.js for GPU-accelerated inference and ships browser-compatible implementations of MusicVAE (variational autoencoder for melodies, drums, and trios), MusicRNN (LSTM-based MelodyRNN, DrumsRNN, ImprovRNN, PerformanceRNN), and GANSynth (GAN-based audio synthesis). Most Magenta.js models can run inside a Web Worker; GANSynth and Onsets and Frames need direct access to the page's AudioContext to synthesise audio.

Common use cases

TensorFlow.js is used for tasks that benefit from running models close to the user. The most common categories are:

Interactive in-browser ML demos, such as Google's Teachable Machine, which lets non-programmers train an image, audio, or pose classifier from webcam samples and export it for use in TensorFlow.js, TensorFlow Lite, or full TensorFlow.
Client-side image and video processing for video calls and creative tools, including background blur, virtual backgrounds, body-part masks, and AR filters layered on top of camera streams.
Edge inference for privacy, where sensitive data such as health images, personal text, or biometric signals never leaves the device. This is closely related to broader edge computing trends in machine learning.
Educational tools like the TensorFlow Playground and Distill articles, which use small TFJS models to illustrate gradient descent, activation functions, and other ML concepts in the browser.
Real-time pose, face, and gesture recognition for AR overlays, fitness apps, and accessibility input. MoveNet is widely used here because it hits 50+ fps on commodity hardware.
Web-based games and creative tools, including procedural music generators and webcam-controlled experiences such as the original PAC-MAN demo.
Hybrid offline-first apps that run inference locally when the network is unavailable and synchronise results once it returns.

Performance characteristics

The practical performance of TensorFlow.js depends heavily on the chosen backend, the model architecture, and the device. Some rough numbers, drawn from the TensorFlow.js team's own benchmarks and blog posts:

WebGL versus CPU: the WebGL backend has been measured at up to roughly 100x faster than the plain JS CPU backend on CNNs in the best cases. For everyday models such as MobileNet, the speedup is more typically about 5x to 15x.
WebGL versus WASM: for medium and large CNNs (MobileNet, BodyPix, PoseNet), WebGL is generally 2x to 4x faster than WASM. For very small models such as MediaPipe BlazeFace and FaceMesh, WASM can match or beat WebGL because it avoids shader-compilation overhead.
WASM versus plain JS: with SIMD and multithreading, the WASM backend is roughly 10x faster than plain JavaScript on common workloads. Benchmarks in the WASM launch post showed MobileNet V2 inference about 3x to 11.5x faster than plain JS in Chrome 79.
WebGPU versus WebGL: the WebGPU backend has been reported to deliver around 3x faster inference than WebGL on machine-learning workloads, with sub-30 ms response times on small and mid-sized models.
Mobile browsers: performance varies widely by device GPU. iOS Safari historically had stricter WebGL limits than desktop Chrome, which sometimes forces fallback to WASM.
Node.js with native bindings: because tfjs-node and tfjs-node-gpu call the same C++ kernels as Python TensorFlow, performance is comparable to native Python TensorFlow on the same hardware.

Memory bandwidth is often the bottleneck for the WebGL backend rather than floating-point throughput, because each op fetches input textures and writes output textures rather than fusing kernels the way a native CUDA implementation would.

Comparison with other browser ML libraries

A number of JavaScript libraries target client-side ML, with different trade-offs in vendor support, model formats, and focus.

Library	Vendor	Year	Backends	Model formats	Focus
TensorFlow.js	Google	2018	WebGL, WebGPU, WASM, CPU, Node native	TFJS, SavedModel via converter, Keras `.h5`	General-purpose ML in browser, Node, and React Native
ONNX Runtime Web	Microsoft	2021	WebGL, WebGPU, WebNN, WASM	ONNX	Cross-framework inference; pairs with PyTorch via ONNX export
Transformers.js	Hugging Face	2023	Built on ONNX Runtime Web (WebGPU, WASM)	ONNX-converted Hugging Face models	NLP and vision pipelines mirroring Python `transformers`
ml5.js	NYU ITP / community	2018	Built on TensorFlow.js	TFJS-compatible models	Friendly API for artists and educators
brain.js	Community	2015	CPU, WebGL via gpu.js	Internal	Simple feed-forward and recurrent networks
ConvNetJS	Andrej Karpathy	2014	CPU only	Internal	Educational; pre-deeplearn.js era
Synaptic	Community	2014	CPU only	Internal	Architecture-free recurrent networks
LiteRT.js	Google	2024	WebGPU, WASM	LiteRT (`.tflite`)	Browser path for the TensorFlow Lite (LiteRT) ecosystem

Transformers.js and ONNX Runtime Web have taken much of the LLM and transformer-inference traffic that might otherwise have gone to TensorFlow.js, in part because most large open-weight models are released first as PyTorch checkpoints and exported to ONNX rather than to the TFJS native format. TensorFlow.js remains the more natural choice for projects that already live in the TensorFlow or Keras ecosystem, for in-browser training, and for the rich tfjs-models pretrained library.

Notable applications

Teachable Machine (teachablemachine.withgoogle.com): Google's no-code in-browser tool for training image, sound, and pose classifiers. The current version supports three input types (images and videos, sounds, body positions). All training happens locally; the tool exports models for use in TensorFlow.js, TensorFlow Lite, or full TensorFlow.
PoseNet and MoveNet demos: Google has released several real-time pose tracking demos using TensorFlow.js, including dance, fitness, and accessibility applications.
Magenta.js music tools: browser-based music sketches such as Latent Loops, Beat Blender, and Tone Transfer use Magenta.js and TFJS to run generative music models without a server.
AR filters and TikTok-style web filters: many web-based filter pipelines run BodyPix or Face Landmarks Detection through TFJS to compose effects on top of camera streams.
In-browser reinforcement learning demos: TensorFlow.js has been used to run small Atari-style RL agents and policy-gradient demos that train in front of the user.
Body language and ergonomics analysis tools: posture-tracking webapps use MoveNet and BodyPix to give feedback during exercise or office work without sending video to a server.

Limitations

TensorFlow.js is constrained by the browser sandbox more than by anything intrinsic to its design.

Memory limits. Browsers cap the amount of memory a tab can use, often around 2 GB on desktop and much less on mobile. WebGL further caps the maximum texture size, which limits the largest activations a model can hold.
Single JavaScript thread. Without Web Workers or the WASM backend's threadpool, all computation runs on the page's main thread, which makes the UI sluggish during heavy training.
GPU access is constrained. WebGL and WebGPU expose less of the underlying GPU than native CUDA does. Features such as fused kernels, persistent kernel launches, and tensor cores are not directly accessible.
First-load model size. Models in the 10 MB to 50 MB range are practical for fast first-load over the network, with weight shards cached by the browser. Multi-hundred-megabyte models load slowly on first visit.
LLM-scale models. TensorFlow.js is not the typical choice for in-browser large language models. Most browser LLM demos use Transformers.js on top of ONNX Runtime Web, with quantised weights and WebGPU compute shaders.
Cross-browser quirks. WebGL behaviour varies between Chrome, Firefox, and Safari; WebGPU support is still uneven across mobile browsers, especially on iOS.
Native TensorFlow gaps. The kernel coverage of the WebGL and WebGPU backends lags behind TensorFlow's native C++ kernels, so some operations from imported SavedModels fall back to the CPU.
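For the memory limits in particular, a back-of-the-envelope estimate of tensor size helps when judging whether a model's activations will fit. The sketch below multiplies the element count by the logical size of each dtype; the function name tensorBytes is hypothetical, and the figure is a lower bound because GPU backends may pad or pack textures differently.

```javascript
// Logical bytes per element for common TensorFlow.js dtypes.
const BYTES_PER_ELEMENT = { float32: 4, int32: 4, bool: 1 };

// Hypothetical helper: logical size in bytes of a dense tensor.
function tensorBytes(shape, dtype = 'float32') {
  if (!(dtype in BYTES_PER_ELEMENT)) {
    throw new Error('Unsupported dtype: ' + dtype);
  }
  // Element count is the product of all dimensions (1 for a scalar).
  const elements = shape.reduce((n, d) => n * d, 1);
  return elements * BYTES_PER_ELEMENT[dtype];
}

// One 224 x 224 RGB image as float32.
console.log(tensorBytes([1, 224, 224, 3])); // 602112
```

Summing this over the largest intermediate activations of a network gives a first approximation of peak memory, which can then be compared against the per-tab limits above.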

Recent developments (2023 to 2026)

WebGPU backend maturation. Chrome 113 shipped WebGPU by default in May 2023, and the TensorFlow.js WebGPU backend has progressed from preview to a serious production option through 2024 and 2025, with kernel coverage and performance closing in on WebGL across common CNN and transformer workloads.
Transformer support. TensorFlow.js has improved support for transformer architectures, but Hugging Face's Transformers.js has emerged as a popular alternative for browser LLM and transformer inference, often built on ONNX Runtime Web instead of TFJS.
LiteRT.js. Google rebranded TensorFlow Lite as LiteRT and introduced LiteRT.js, a Web AI inference path for .tflite models that complements TensorFlow.js for browser deployment.
Continued community work. The tensorflow/tfjs repository has been steadily maintained, with version 4.22.0 released in October 2024 and ongoing kernel and converter improvements.

Integration

TensorFlow.js is framework-agnostic and works with React, Vue, Angular, Svelte, and vanilla JavaScript. Two installation paths are common:

NPM: npm install @tensorflow/tfjs adds the umbrella package, which includes the Core API, Layers API, default browser backends (WebGL, CPU), and the Data and Vis utilities. Specific backends can be installed separately, for example @tensorflow/tfjs-backend-webgpu or @tensorflow/tfjs-backend-wasm.
CDN: loading <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script> from a CDN gives a window.tf global without a build step, which is convenient for prototyping.

The library is written in TypeScript and ships its own type definitions, so editors can autocomplete tensor shapes and op names. Tree-shaking is supported when importing only the kernels and ops a project uses.

License and governance

TensorFlow.js is released under the Apache 2.0 license, the same license as the rest of the TensorFlow ecosystem. The project is governed by Google with significant external contributions; the source lives at github.com/tensorflow/tfjs. Pre-trained models live in the parallel tensorflow/tfjs-models repository, also under Apache 2.0.
