# Keras

> Source: https://aiwiki.ai/wiki/keras
> Updated: 2026-06-21
> Categories: AI Tools & Products, Deep Learning, Machine Learning
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

Keras is an open-source, high-level [neural network](/wiki/neural_network) API written in [Python](/wiki/python) that lets developers build, train, and deploy [deep learning](/wiki/deep_model) models with minimal code. Created by Francois Chollet and first released on March 27, 2015, Keras is licensed under the Apache 2.0 license and has grown into one of the most widely adopted deep learning libraries in the world: the Keras team reports it has been "chosen by over 2.5M developers" and that it powers production systems including the Waymo self-driving fleet and the YouTube recommendation engine.[^1][^2][^3] Rather than implementing low-level math itself, Keras acts as a frontend for other computational frameworks.

In its current iteration, Keras 3, the library supports [TensorFlow](/wiki/tensorflow), [JAX](/wiki/jax), [PyTorch](/wiki/pytorch), and [OpenVINO](/wiki/openvino) as interchangeable backends, allowing developers to write code once and run it across multiple frameworks without modification.[^3][^4] Keras 3 was released on November 28, 2023, as a full rewrite that restored the library's original multi-backend design.[^18][^20] As of May 2026, the current stable release is version 3.14.1, published May 7, 2026.[^2][^5]

## Infobox

| Field | Value |
|---|---|
| Original author | [Francois Chollet](/wiki/francois_chollet) |
| Initial release | March 27, 2015 |
| Current stable release | 3.14.1 (May 7, 2026)[^5] |
| Repository | github.com/keras-team/keras |
| Written in | [Python](/wiki/python) |
| License | Apache License 2.0 |
| Supported backends (3.x) | [TensorFlow](/wiki/tensorflow), [JAX](/wiki/jax), [PyTorch](/wiki/pytorch), [OpenVINO](/wiki/openvino) |
| Reported users | 2.5 million+ developers (2023)[^3] |
| Minimum Python | 3.11 (from Keras 3.13)[^6] |

## What is Keras used for?

Keras is used to define, train, evaluate, and deploy [neural networks](/wiki/neural_network) for tasks across [computer vision](/wiki/computer_vision), natural language processing, generative AI, speech, time series, and reinforcement learning. It provides ready-made building blocks (layers, optimizers, [loss functions](/wiki/loss_function), metrics, and callbacks) so practitioners can assemble a working model in a few lines instead of implementing forward and backward passes by hand. Beginners can train a classifier in fewer than ten lines using the high-level `Sequential`, `compile`, and `fit` workflow, while researchers can drop down to custom training loops or native backend code for full control.[^4][^32] Because Keras 3 runs on [JAX](/wiki/jax), [TensorFlow](/wiki/tensorflow), [PyTorch](/wiki/pytorch), and [OpenVINO](/wiki/openvino), the same model can be trained on one framework and deployed on another, which is a common pattern for moving from research to production or to edge devices.[^3][^4]

## History and evolution

The development of Keras is closely tied to the career of its creator, Francois Chollet, a French software engineer and AI researcher.

### When was Keras created?

Chollet began work on what became Keras while researching recurrent neural networks ([RNNs](/wiki/rnn)). At the time he found no good reusable open-source implementation of RNNs and [LSTMs](/wiki/lstm) for the Python data-science ecosystem. The available options had clear limitations: Caffe was popular in computer vision but only worked for narrow use cases and was not very extensible, while Torch 7 required coding in Lua, which lacked the advantages of the Python ecosystem around NumPy, SciPy, and [scikit-learn](/wiki/scikit_learn).[^7][^8] Chollet decided to build his own library, and that effort became Keras.

According to Chollet and the project history recorded on Wikipedia, Keras was developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System), a research initiative on which Chollet was working in early 2015.[^2][^9] The first commit was published on [GitHub](https://github.com/keras-team/keras) on March 27, 2015.[^2][^7] At launch, one of Keras's main differentiators was that it was the first Python deep learning library to support both recurrent and convolutional networks in a single API.[^7]

The name "Keras" comes from the Ancient Greek word *keras* (κέρας, meaning "horn"), a reference to the literary image of the "Gate of Horn" from Homer's *Odyssey*, through which true visions pass to mortals.[^2] Chollet released the first version in March 2015 and joined Google shortly afterward.[^10]

### Keras 1 and 2: the multi-backend era (2015 to 2019)

In its early versions, Keras supported multiple backends. Keras was originally built on top of Theano, the University of Montreal symbolic-math library, and added [TensorFlow](/wiki/tensorflow) as a backend in 2016. Microsoft's Cognitive Toolkit (CNTK) and Intel's PlaidML (which targeted non-NVIDIA GPUs via OpenCL) were also supported in various 2.x releases.[^11][^12] This backend-agnostic design was one of Keras's defining features: users could write model code once and switch between backends by changing a single configuration setting in `~/.keras/keras.json`.[^13]
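
As a rough illustration of that single-setting switch, the sketch below edits a `keras.json`-style file with the standard library. The four keys follow the commonly documented layout of that file; the temporary directory stands in for `~/.keras` so that no real configuration is touched, and the helper name `set_backend` is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def set_backend(config_path: Path, backend: str) -> dict:
    """Rewrite only the "backend" entry of a keras.json-style file."""
    config = json.loads(config_path.read_text())
    config["backend"] = backend
    config_path.write_text(json.dumps(config, indent=4))
    return config


# A temporary directory stands in for ~/.keras (an assumption, for safety).
config_path = Path(tempfile.mkdtemp()) / "keras.json"

# Typical contents of the file in the multi-backend releases.
config_path.write_text(json.dumps({
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano",
}, indent=4))

updated = set_backend(config_path, "tensorflow")
print(updated["backend"])
```

Only the `backend` value changes; model code that sticks to the Keras API does not need to be edited.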

Keras 2, released in March 2017, stabilized the API and brought improvements to the [layer](/wiki/layer) system, model saving, and preprocessing utilities. Through this period, Keras's user base grew rapidly, and it became one of the most popular deep learning tools on [Kaggle](/wiki/kaggle), in industry, and in university courses.[^7] Keras 2.3.0, released in September 2019, was the first version of multi-backend Keras with full TensorFlow 2 support and was announced as the last major release of the multi-backend line.[^14]

### tf.keras: integration into TensorFlow (2019 to 2023)

When TensorFlow 2.0 launched in September 2019, Keras was integrated as TensorFlow's official high-level API under the `tf.keras` namespace.[^15][^16] Earlier, in December 2018, the [TensorFlow](/wiki/tensorflow) team had already announced that they were standardizing on Keras and would deprecate or remove competing high-level APIs (such as `tf.estimator` and `tf.slim`).[^17] This integration gave Keras access to TensorFlow's full ecosystem, including [TensorFlow Serving](/wiki/tensorflow_serving), [TensorFlow Lite](/wiki/tensorflow_lite), and [TensorFlow.js](/wiki/tensorflow_js) for deployment across servers, mobile devices, and web browsers.[^15][^16]

During this period (Keras 2.4 through 2.15), TensorFlow was the only supported backend. The standalone multi-backend Keras package was no longer maintained in favor of `tf.keras`, and the team explicitly recommended that users switch.[^11][^14] The integration also added TensorFlow-specific features such as eager execution by default, [TPU](/wiki/tpu) training, native support for distributed training via `tf.distribute.Strategy`, and the SavedModel format.[^15][^16]

### Keras 3: return to multi-backend (2023 to present)

In 2023, Chollet announced Keras 3, a full rewrite of the library that restored multi-backend support. The project was developed under the codename "Keras Core" during its initial development phase (April to July 2023) and a public beta test (July to September 2023). In September 2023, the project repository at `keras-team/keras-core` was renamed back to `keras-team/keras`, and the official Keras 3.0 release shipped on November 28, 2023.[^18][^19][^20] Announcing the release, Chollet wrote: "Keras 3 is a full rewrite of Keras that enables you to run your Keras workflows on top of either JAX, TensorFlow, PyTorch, or OpenVINO (for inference-only)."[^4]

Keras 3 supports four backends:

| Backend | Use case | Notes |
|---|---|---|
| [JAX](/wiki/jax) | High-performance training and inference | Typically delivers the best performance on GPU, TPU, and CPU for many architectures[^21] |
| [TensorFlow](/wiki/tensorflow) | Production deployment, mobile/web | Access to TF Serving, TF Lite, TF.js ecosystem and `tf.distribute`[^4] |
| [PyTorch](/wiki/pytorch) | Research, integration with PyTorch ecosystem | Keras layers function as native PyTorch Modules; full support including `DistributedDataParallel`[^4] |
| [OpenVINO](/wiki/openvino) | Inference-only optimization | Added in Keras 3.8 (January 2025) for accelerated CPU/iGPU/NPU inference[^22] |

The OpenVINO backend, contributed in collaboration with Intel, supports inference but not training, because OpenVINO does not implement gradient computation; users typically train with JAX, TensorFlow, or PyTorch and switch to OpenVINO for deployment.[^22]

As of May 2026, the latest stable version is Keras 3.14.1, published May 7, 2026, following 3.14.0 on April 3, 2026.[^5] Starting with version 3.13.0, Keras requires Python 3.11 or higher.[^6] TensorFlow 2.16 and later versions ship with Keras 3 as the default Keras implementation, while the legacy Keras 2 remains available through the `tf_keras` maintenance package (installed via `pip install tf_keras` and selected by setting the environment variable `TF_USE_LEGACY_KERAS=1` before importing TensorFlow).[^23]

## Who created Keras? (Francois Chollet)

[Francois Chollet](/wiki/francois_chollet) earned a Master of Engineering degree from ENSTA Paris (part of the Polytechnic Institute of Paris) in 2012. He created Keras in 2015 and joined [Google](/wiki/google) the same year, where he served as a Senior Staff Engineer for over nine years before departing in November 2024.[^10][^24] Announcing his departure, Chollet wrote: "I'm leaving Google to go start a new company with a friend ... I will stay deeply involved with the Keras project from the outside."[^24]

Beyond Keras, Chollet has made several contributions to the AI field:

- He authored *Deep Learning with Python* (Manning Publications, 2017; second edition 2021; third edition co-authored with Matthew Watson, 2025), which the publisher reports has sold more than 100,000 copies, and co-authored *Deep Learning with R*.[^25][^26]
- He published the Xception architecture paper (*Xception: Deep Learning with Depthwise Separable Convolutions*) at CVPR 2017, which ranks among the most cited CVPR papers, with more than 18,000 citations on Google Scholar as of 2024.[^27]
- In 2019, he created the Abstraction and Reasoning Corpus ([ARC-AGI](/wiki/arc_agi)), a benchmark designed to measure AI systems' ability to solve novel reasoning problems, introduced in his paper *On the Measure of Intelligence*.[^28]
- In June 2024, he co-launched the ARC Prize, a $1 million+ competition to solve the ARC-AGI benchmark, alongside Mike Knoop (co-founder of Zapier).[^29][^30]
- After leaving Google in November 2024, Chollet co-founded Ndea, a research lab focused on program synthesis for artificial general intelligence, with Mike Knoop.[^24][^31]

Although Chollet left Google, Google continued sponsoring Keras development, and Chollet stated he would stay involved with the project from outside Google.[^24]

## Design philosophy

Keras was designed around a small number of guiding principles that Chollet has articulated repeatedly in interviews and in the project documentation: user-friendliness, modularity, ease of extensibility, and what he calls "progressive disclosure of complexity."[^32][^33]

### What is progressive disclosure of complexity?

The Keras team describes this idea as the core of the API: "Progressive disclosure of complexity is the design principle at the heart of the Keras API."[^4] In practice it means simple workflows should be quick and easy, while arbitrarily advanced workflows should remain possible via a clear path that builds on what users already know. Beginners can train a model in fewer than ten lines using high-level APIs (`Sequential`, `compile`, `fit`), while advanced users can override `train_step`, write fully custom training loops, or even drop down to native backend code, all using the same components.[^4][^32]

### Modularity and composability

A Keras model is a graph of standalone, configurable modules. Layers, [loss functions](/wiki/loss_function), metrics, optimizers, weight initializers, regularizers, and callbacks are independent objects that can be combined, swapped, or subclassed.[^4] The Functional API in particular treats layers as callable objects on tensors, which makes shared layers, multi-input/multi-output models, and skip connections straightforward.[^34]

### Backend agnosticism

Keras 3 returns to the original multi-backend philosophy and goes further than the 2.x version by exposing a unified operations namespace (`keras.ops`) that lets users write custom components once and run them on any backend (see [Cross-backend code with keras.ops](#cross-backend-code-with-keras-ops) below).[^4]

## Model-building APIs

Keras offers three distinct APIs for building [neural network](/wiki/neural_network) models, each suited to different levels of complexity and customization.[^34]

### Sequential API

The Sequential API is the simplest way to build a model in Keras. It allows users to create models by stacking layers in a linear sequence, one after another. This API is ideal for straightforward architectures where data flows through each [layer](/wiki/layer) in order without branching or merging.[^34]

```python
import keras
from keras import layers

model = keras.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])
```

The Sequential API is best suited for beginners or for building simple models like basic classifiers and regressors. Its limitation is that it only supports single-input, single-output stacks of layers.[^34]

### Functional API

The Functional API provides greater flexibility by allowing users to define models as directed acyclic graphs of layers. This API supports multiple inputs and outputs, shared layers, and non-linear topologies such as skip connections and residual blocks.[^34][^35]

```python
import keras
from keras import layers

inputs = keras.Input(shape=(784,))
x = layers.Dense(128, activation='relu')(inputs)
x = layers.Dropout(0.3)(x)
x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)

model = keras.Model(inputs=inputs, outputs=outputs)
```

The Functional API strikes a balance between ease of use and flexibility. It is the recommended approach for most use cases, including architectures with branching (such as Inception-style networks) and models that require multiple input or output tensors.[^34][^35]

### Model subclassing API

The Subclassing API gives users full control over the model by defining a custom class that inherits from `keras.Model`. Users implement the `__init__` method to define layers and the `call` method to specify the forward pass logic.[^34]

```python
import keras
from keras import layers


class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = layers.Dense(128, activation='relu')
        self.dropout = layers.Dropout(0.3)
        self.dense2 = layers.Dense(64, activation='relu')
        self.out = layers.Dense(10, activation='softmax')

    def call(self, inputs):
        x = self.dense1(inputs)
        x = self.dropout(x)
        x = self.dense2(x)
        return self.out(x)
```

This approach is suited for advanced research or highly customized models that require conditional logic, loops, or other dynamic behaviors during the forward pass. It offers maximum flexibility but requires a deeper understanding of the framework.[^34][^35]

### API comparison

| Feature | Sequential | Functional | Subclassing |
|---|---|---|---|
| Ease of use | Very easy | Moderate | Advanced |
| Multiple inputs/outputs | No | Yes | Yes |
| Shared layers | No | Yes | Yes |
| Non-linear topology | No | Yes | Yes |
| Dynamic forward pass | No | No | Yes |
| Model visualization (`plot_model`) | Yes | Yes | Limited |
| Best for | Beginners, simple models | Most use cases | Research, custom architectures |

The three APIs are not mutually exclusive: a Functional model can include a subclassed `Layer`, and a subclassed `Model` can use the Functional API internally. Mixing and matching is encouraged when it improves clarity.[^34]

## Key features of Keras 3

### Cross-backend code with keras.ops

Keras 3 introduced the `keras.ops` namespace, which provides a unified set of operations that work identically across all backends. This includes a [NumPy](https://numpy.org/)-compatible API (for example, `ops.matmul`, `ops.sum`, `ops.stack`, `ops.einsum`) and neural-network-specific functions (for example, `ops.softmax`, `ops.binary_crossentropy`, `ops.conv`).[^4][^36] Any custom layer, loss, metric, or optimizer written with `keras.ops` runs on JAX, TensorFlow, PyTorch, and (for inference) OpenVINO. Internally, calls to `keras.ops` dispatch to the equivalent operation in the active backend, while preserving the same input/output semantics. According to the Keras 3 announcement, numerical results match to within `1e-7` precision in `float32` across backends.[^4]

### Stateless functional API

All stateful objects in Keras 3 (layers, models, optimizers, metrics) expose a parallel stateless API for use in pure-functional contexts, particularly JAX. Layers and models have a `stateless_call()` method that mirrors `__call__`, optimizers have `stateless_apply()` mirroring `apply()`, and metrics have `stateless_update_state()` and `stateless_result()`.[^4][^37] This makes Keras components usable inside `jax.grad`, `jax.jit`, and `jax.pmap` without further wrapping.[^37]

### Cross-framework data pipelines

`model.fit()`, `model.evaluate()`, and `model.predict()` accept input data in many formats regardless of the active backend, including NumPy arrays, [pandas](https://pandas.pydata.org/) DataFrames, `tf.data.Dataset`, `torch.utils.data.DataLoader`, and `keras.utils.PyDataset` (a parallelizable Python generator).[^4][^38] A model running on the JAX backend can iterate over a PyTorch `DataLoader`, and a model on the PyTorch backend can consume a `tf.data.Dataset`.[^4]

### Distribution API

Keras 3 includes a distribution API (`keras.distribution`) that simplifies data parallelism and model parallelism.[^4][^39] Initially implemented on the JAX backend (with TensorFlow and PyTorch implementations rolling out across the 3.x line), it lets users distribute training across many GPUs or [TPUs](/wiki/tpu) with a few lines of code. The core abstractions are `DeviceMesh` (a logical grid of accelerators, analogous to `jax.sharding.Mesh`) and `TensorLayout` (which describes how a tensor is sharded across the mesh).[^39]

For pure data parallelism, two lines suffice:

```python
import keras

distribution = keras.distribution.DataParallel(
    devices=keras.distribution.list_devices()
)
keras.distribution.set_distribution(distribution)
```

For model parallelism, users define a `LayoutMap` that matches variable names with regular expressions and assigns each match a `TensorLayout`. The underlying framework distributes the program and tensors according to the sharding directives through single-program multiple-data (SPMD) expansion.[^39] The API keeps model definition, training logic, and sharding configuration separate, so a model can be scaled up by editing only the layout map.
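
The pattern-to-layout idea can be pictured with a plain dictionary. The sketch below is not Keras code and does not call `keras.distribution`; the helper `resolve_layout`, the variable paths, and the mesh axis name `"model"` are all hypothetical and only illustrate how a regular-expression table might be resolved for each variable.

```python
import re

# Pattern -> sharding spec with one entry per tensor dimension.
# None means "replicate along this dimension"; "model" names a mesh axis.
layout_rules = {
    r"dense.*/kernel": (None, "model"),
    r"dense.*/bias": ("model",),
    r"embedding/embeddings": ("model", None),
}


def resolve_layout(variable_path, rules):
    """Return the layout of the first matching pattern, or None if nothing matches."""
    for pattern, layout in rules.items():
        if re.search(pattern, variable_path):
            return layout
    return None  # treated here as "fully replicated"


print(resolve_layout("dense_1/kernel", layout_rules))
print(resolve_layout("layer_norm/gamma", layout_rules))
```

Keeping the rules in one table is what lets the sharding strategy change without touching the model definition.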

### Pre-trained model gallery (Keras Applications)

Keras provides access to more than 40 pre-trained model architectures through Keras Applications, including [ResNet](/wiki/resnet) and ResNetV2, VGG16/VGG19, [Inception](/wiki/inception)V3 and InceptionResNetV2, Xception, MobileNet (v1/v2/v3), EfficientNet (B0 to B7), EfficientNetV2, NASNet, DenseNet, and ConvNeXt.[^40] These models come with pre-trained weights (typically trained on [ImageNet](/wiki/imagenet) with 1,000 classes) and can be used for prediction, feature extraction, fine-tuning, and [transfer learning](/wiki/transfer_learning).[^40]

### XLA compilation

For the JAX and TensorFlow backends, models support XLA ([Accelerated Linear Algebra](/wiki/xla)) compilation. `model.compile(..., jit_compile="auto")` is the default and enables XLA where possible. XLA fuses operations into optimized kernels for the target hardware (CPU/GPU/TPU) and is one of the main reasons JAX often achieves the best training throughput on Keras 3 benchmarks.[^4][^21]

## Common layers and modules

Keras organizes its [layer](/wiki/layer) library into 16 categories. The table below lists the most commonly used layers.[^41]

| Layer | Category | Description |
|---|---|---|
| `Dense` | Core | Fully connected layer; each neuron connects to every neuron in the previous layer |
| `Conv2D` | Convolution | 2D convolution layer for processing image data |
| `LSTM` | Recurrent | Long Short-Term Memory layer for sequential data; its gating mitigates the vanishing gradient problem |
| `GRU` | Recurrent | Gated Recurrent Unit; a simpler alternative to LSTM, often with comparable performance |
| `Embedding` | Core | Maps integer indices (e.g., word IDs) to dense vectors; used in NLP models |
| `Dropout` | Regularization | Randomly sets a fraction of input units to zero during [training](/wiki/training) to prevent overfitting |
| `BatchNormalization` | Normalization | Normalizes layer inputs to have zero mean and unit variance, stabilizing training |
| `LayerNormalization` | Normalization | Normalizes across features rather than the batch dimension; common in transformers |
| `MultiHeadAttention` | Attention | Implements the multi-head [attention](/wiki/attention) mechanism used in [transformer](/wiki/transformer) architectures |
| `Flatten` | Reshaping | Flattens a multi-dimensional input into a 1D vector |
| `MaxPooling2D` | Pooling | Downsamples spatial dimensions by taking the maximum value in each pooling window |
| `Concatenate` | Merging | Concatenates a list of inputs along a specified axis |

In addition to these, Keras provides preprocessing layers for text, image, and audio data; activation layers ([ReLU](https://keras.io/api/layers/activations/), [Softmax](/wiki/softmax), GELU, Swish); weight initializers (GlorotNormal, HeNormal); weight regularizers (L1, L2); and backend-specific layers (`TorchModuleWrapper`, `TFSMLayer`, `JaxLayer`, `FlaxLayer`) for interoperability with native PyTorch Modules, TensorFlow SavedModels, and JAX/Flax layers.[^4][^41]

## Training workflow

Keras provides a streamlined workflow for [training](/wiki/training) and evaluating models, centered on three steps: `compile`, `fit`, and `evaluate` or `predict`.[^42]

### Step 1: compile the model

Before training, the model must be compiled with an optimizer, a [loss function](/wiki/loss_function), and (optionally) metrics to monitor.[^42]

```python
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
```

Optimizers include SGD, RMSprop, [Adam](/wiki/adam_optimizer), [AdamW](/wiki/adamw), Adagrad, Adadelta, Adamax, Nadam, Ftrl, and Lion. Losses cover regression (MeanSquaredError, MeanAbsoluteError, Huber), classification (BinaryCrossentropy, CategoricalCrossentropy, SparseCategoricalCrossentropy, Focal variants), and probabilistic objectives (KLDivergence, Poisson).[^4]

### Step 2: train with model.fit()

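The `model.fit()` method is the primary training function. In each epoch it splits the data into batches (step 1), repeats steps 2 through 6 for every batch, and then optionally runs step 7 once at the end of the epoch:[^42]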

1. Splits the data into batches.
2. Performs a forward pass to generate predictions.
3. Calculates the loss between predictions and true labels.
4. Computes gradients of the loss with respect to the model's trainable weights.
5. Updates the weights using the optimizer.
6. Calculates the specified metrics.
7. Optionally evaluates on a separate validation dataset at the end of each epoch.

```python
history = model.fit(
    x_train, y_train,
    epochs=20,
    batch_size=32,
    validation_split=0.2,
    callbacks=[...]
)
```
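
The seven steps can be made concrete without Keras at all. The sketch below is not how `fit()` is implemented; it is a hand-written NumPy loop for a logistic-regression model on synthetic data, with plain SGD standing in for the optimizer, and every name in it is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification data: the label is the sign of a linear score.
x = rng.normal(size=(200, 3))
y = (x @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)
x_train, y_train, x_val, y_val = x[:160], y[:160], x[160:], y[160:]


def forward(inputs, w, b):
    """Logistic-regression forward pass: sigmoid(inputs @ w + b)."""
    return 1.0 / (1.0 + np.exp(-(inputs @ w + b)))


def bce(p, t):
    """Mean binary cross-entropy, clipped for numerical safety."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return float(-np.mean(t * np.log(p) + (1 - t) * np.log(1 - p)))


w, b, lr, batch_size = np.zeros(3), 0.0, 0.5, 32
history = []
for epoch in range(20):
    order = rng.permutation(len(x_train))
    for start in range(0, len(order), batch_size):   # 1. split into batches
        idx = order[start:start + batch_size]
        xb, yb = x_train[idx], y_train[idx]
        pb = forward(xb, w, b)                        # 2. forward pass
        loss = bce(pb, yb)                            # 3. loss
        grad_w = xb.T @ (pb - yb) / len(idx)          # 4. gradients
        grad_b = float(np.mean(pb - yb))
        w, b = w - lr * grad_w, b - lr * grad_b       # 5. weight update (SGD)
        acc = float(np.mean((pb > 0.5) == yb))        # 6. batch metric
    val_p = forward(x_val, w, b)                      # 7. end-of-epoch validation
    history.append({
        "val_loss": bce(val_p, y_val),
        "val_accuracy": float(np.mean((val_p > 0.5) == y_val)),
    })

print(history[-1])
```

`fit()` wraps the same loop structure and adds callbacks, metric aggregation, and backend-specific compilation.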

Keras 3 supports multiple data pipeline formats, including NumPy arrays, `tf.data.Dataset`, `torch.utils.data.DataLoader`, Pandas DataFrames, and `keras.utils.PyDataset`, regardless of which backend is active.[^4][^38]

### Step 3: evaluate and predict

`model.evaluate()` calculates the loss and metrics on a test dataset, providing a measure of the model's performance on unseen data.[^42]

```python
test_loss, test_accuracy = model.evaluate(x_test, y_test)
```

`model.predict()` generates output predictions for new input data without computing loss or metrics.

```python
predictions = model.predict(new_data)
```
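
For a softmax classifier, the array returned by `predict` has one row of class probabilities per input. A short sketch of turning that into labels follows; the `predictions` array is a made-up stand-in for real model output.

```python
import numpy as np

# Stand-in for the array returned by model.predict on three inputs (made-up values).
predictions = np.array([
    [0.05, 0.90, 0.05],
    [0.70, 0.20, 0.10],
    [0.10, 0.15, 0.75],
])

predicted_classes = predictions.argmax(axis=-1)  # index of the most probable class
confidence = predictions.max(axis=-1)            # probability of that class
print(predicted_classes.tolist())
```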

### Custom training loops

For more advanced use cases, users can override the `train_step()` method to customize the training logic while still using `model.fit()`.[^4] This is the recommended path for [GAN](/wiki/gan) training, self-supervised learning, contrastive losses, and similar setups. Alternatively, Keras components can be used inside fully custom training loops written in native JAX (using `jax.grad`, `jax.jit`, and the stateless API), TensorFlow (using `tf.GradientTape`), or PyTorch (using `torch.autograd` and `optimizer.step()`).[^4]

## Callbacks system

Callbacks are objects passed to `model.fit()` that can perform actions at various stages of training, such as at the start or end of an epoch, or before or after processing a batch. Keras includes several built-in callbacks for common tasks.[^43]

| Callback | Purpose |
|---|---|
| `EarlyStopping` | Stops training when a monitored metric (e.g., validation loss) has stopped improving for a specified number of epochs (`patience`). Can restore weights from the best epoch via `restore_best_weights`.[^44] |
| `ModelCheckpoint` | Saves the model or its weights periodically or whenever performance on a monitored metric improves; supports both per-epoch and per-batch save frequencies.[^45] |
| `ReduceLROnPlateau` | Reduces the learning rate when a monitored metric has stopped improving, helping the model escape plateaus. |
| `TensorBoard` | Logs training metrics, model graphs, histograms, and embeddings for visualization in TensorBoard.[^46] |
| `LearningRateScheduler` | Adjusts the learning rate according to a user-defined schedule function at each epoch. |
| `CSVLogger` | Streams epoch results (loss, metrics) to a CSV file. |
| `ProgbarLogger` | Displays a progress bar during training. |
| `BackupAndRestore` | Periodically saves training state so a job can resume after preemption. |
| `RemoteMonitor` | Streams events to a remote HTTP endpoint. |

Users can also create custom callbacks by subclassing `keras.callbacks.Callback` and overriding methods like `on_epoch_end`, `on_batch_begin`, `on_train_end`, `on_test_batch_end`, and `on_predict_begin`.[^43]
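
A minimal custom callback might look like the sketch below, which records the training loss after each epoch and is used alongside the built-in `EarlyStopping`. The class name `LossHistory`, the toy regression data, and the chosen `patience` are illustrative only.

```python
import numpy as np
import keras
from keras import layers


class LossHistory(keras.callbacks.Callback):
    """Hypothetical callback that records the training loss after every epoch."""

    def on_train_begin(self, logs=None):
        self.losses = []

    def on_epoch_end(self, epoch, logs=None):
        # logs holds the metrics computed for the epoch that just finished.
        self.losses.append((logs or {}).get("loss"))


rng = np.random.default_rng(0)
x = rng.normal(size=(64, 4)).astype("float32")
y = x.sum(axis=1, keepdims=True)

model = keras.Sequential([layers.Input(shape=(4,)), layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")

recorder = LossHistory()
early_stop = keras.callbacks.EarlyStopping(monitor="loss", patience=2)
model.fit(x, y, epochs=3, batch_size=16, verbose=0, callbacks=[recorder, early_stop])
print(len(recorder.losses))
```

Callbacks run in the order they are listed, and each receives the same `logs` dictionary for the epoch.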

## Keras ecosystem

### KerasHub (formerly KerasCV and KerasNLP)

Originally, the Keras ecosystem included two separate domain-specific libraries: KerasCV for computer vision and KerasNLP for natural language processing. As AI models increasingly became multimodal (for example, chat-based large language models with image inputs, or vision tasks that leverage text encoders), maintaining separate domain libraries became impractical.[^47]

In October 2024, KerasCV and KerasNLP were consolidated into a single unified library called **KerasHub**.[^47][^48] KerasHub is a pretrained-modeling library that provides Keras 3 implementations of popular model architectures paired with pretrained checkpoints available on [Kaggle](/wiki/kaggle) Models and the Hugging Face Hub. Models work across the JAX, TensorFlow, and PyTorch backends for both training and inference.[^47][^49]

KerasHub launched with 37 pretrained models, including:[^47]

- **Language models:** [Llama 3](/wiki/llama_3), [Gemma](/wiki/gemma), [BERT](/wiki/bert), T5, GPT-2, OPT, Mistral, Mixtral, BART, BLOOM, DeBERTa, DistilBERT, RoBERTa, XLM-RoBERTa, ALBERT.
- **Vision models:** [Stable Diffusion](/wiki/stable_diffusion), Segment Anything (SAM), [YOLOv8](/wiki/yolo), ResNet, EfficientNet.
- **Audio models:** [Whisper](/wiki/whisper).

Features include [LoRA](/wiki/lora) and QLoRA fine-tuning for resource-efficient model adaptation, weight quantization (int8 and int4), model publishing to Kaggle and Hugging Face, and large-scale model-parallel retraining.[^47][^49]

Existing code using `keras_nlp` imports continues to work; migration only requires updating import statements from `keras_nlp` to `keras_hub`. The `keras-nlp` GitHub repository was renamed `keras-hub` while preserving backward compatibility.[^50]

### KerasTuner

KerasTuner is a hyperparameter optimization framework for Keras that automates the search for optimal hyperparameter configurations. It exposes a define-by-run search space (using `hp.Int`, `hp.Float`, `hp.Choice`, `hp.Boolean`) and includes three built-in search algorithms: random search, Bayesian optimization, and Hyperband.[^51] Researchers can also implement custom tuners by subclassing `keras_tuner.engine.tuner.Tuner`.[^51]

### AutoKeras

AutoKeras is an automated machine learning ([AutoML](/wiki/automl)) library built on Keras. Developed by the DATA Lab at Texas A&M University and first released in November 2017, AutoKeras automatically searches for the best model architecture and hyperparameters for a given dataset, with task APIs for image classification, image regression, text classification, structured data tasks, and time-series forecasting.[^52][^53] AutoKeras was published as a journal paper in JMLR in 2023.[^52]

### Distribution API

The distribution API (`keras.distribution`), described above under [Key features of Keras 3](#key-features-of-keras-3), is part of the core Keras package and provides data and model parallelism with minimal code changes. It was first available on the JAX backend in the Keras 3.0 release and is being extended to other backends in subsequent 3.x versions.[^4][^39]

## Performance benchmarks

### How fast is Keras 3 across backends?

The Keras team publishes a benchmark page comparing Keras 3 across backends and against Keras 2 (`tf.keras`). On an NVIDIA A100 (40 GB) GPU, the team reported the following representative results in late 2023:[^21]

| Model and task | Keras 3 + TensorFlow | Keras 3 + JAX | Keras 3 + PyTorch |
|---|---|---|---|
| SegmentAnything inference (ms/step) | 438.50 | **376.34** | 1720.96 |
| Stable Diffusion fit (ms/step) | 392.24 | **391.21** | 823.44 |
| BERT fit (ms/step) | **214.49** | 222.37 | 808.68 |
| BERT predict (ms/step) | 466.01 | **418.72** | 1865.98 |
| Gemma fit (ms/step) | **232.52** | 273.67 | 525.15 |
| Mistral fit (ms/step) | **185.92** | 213.22 | 452.12 |

Lower values are faster; the best result in each row is bold. The Keras team noted that no single backend is best for every workload: JAX often wins on inference (especially on encoder/decoder vision models) and TensorFlow tends to win on large LLM fine-tuning where its compiler is well tuned, while PyTorch through Keras was slower in this set largely because the team had not yet enabled `torch.compile` integration at the time of testing.[^21]
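
To read the table as relative speeds rather than raw timings, each step time can be divided by the fastest one in its row. The sketch below does this for three of the rows, using only the figures reported above.

```python
# ms/step figures copied from three rows of the table above (lower is faster).
ms_per_step = {
    "SegmentAnything inference": {"TensorFlow": 438.50, "JAX": 376.34, "PyTorch": 1720.96},
    "BERT fit": {"TensorFlow": 214.49, "JAX": 222.37, "PyTorch": 808.68},
    "Mistral fit": {"TensorFlow": 185.92, "JAX": 213.22, "PyTorch": 452.12},
}

summary = {}
for task, timings in ms_per_step.items():
    fastest = min(timings, key=timings.get)
    # Step time of each backend as a multiple of the fastest backend's step time.
    relative = {name: round(ms / timings[fastest], 2) for name, ms in timings.items()}
    summary[task] = {"fastest": fastest, "relative": relative}
    print(task, summary[task])
```

On these rows the TensorFlow and JAX backends stay within about 17% of each other, while the PyTorch backend's step time is roughly 2.4 to 4.6 times that of the fastest backend.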

### How much faster is Keras 3 than Keras 2?

The Keras benchmarks page also reports substantial improvements over Keras 2 (`tf.keras`) on the same hardware: SegmentAnything inference roughly 380% faster, Stable Diffusion training throughput more than 150% higher, and BERT training throughput more than 100% higher. The team notes that simply upgrading to Keras 3 while continuing to use the TensorFlow backend still yields a performance boost.[^4][^21]

## Applications

### Who uses Keras?

Keras is used across a wide range of [deep learning](/wiki/deep_model) applications:

| Domain | Examples | Common layers / models |
|---|---|---|
| Computer vision | Image classification, object detection, face recognition, medical imaging | `Conv2D`, [ResNet](/wiki/resnet), EfficientNet |
| Natural language processing | Text classification, sentiment analysis, machine translation | `Embedding`, [LSTM](/wiki/lstm), [Transformer](/wiki/transformer), [BERT](/wiki/bert) |
| Generative AI | Image synthesis, text generation, data augmentation | [GANs](/wiki/gan), VAEs, [Stable Diffusion](/wiki/stable_diffusion) |
| Speech and audio | Speech recognition, audio classification | `Conv1D`, [Whisper](/wiki/whisper) |
| Time series | Forecasting, anomaly detection | LSTM, GRU, `Conv1D` |
| Reinforcement learning | Game playing, robot control | `Dense`, custom training loops |

The Keras 3 announcement and documentation list several large production systems that use Keras: the team states the framework "powers some of the most sophisticated, largest-scale ML systems in the world, such as the Waymo self-driving fleet and the YouTube recommendation engine."[^4] In published case studies and conference talks, companies such as Netflix, Uber, Square, Yelp, Instacart, and Zocdoc have also described using Keras for various deep-learning workloads.[^54] Keras is widely used on [Kaggle](/wiki/kaggle), where it is a default deep-learning framework option in many notebooks and competitions.[^7]

## Limitations and criticisms

While Keras is consistently praised for usability and rapid prototyping, several documented criticisms have followed the library through its history.

### Abstraction overhead and debugging

The same high-level abstractions that make Keras easy to use can obscure low-level computational details, which complicates debugging when something fails deep in the computation graph. Tracebacks from inside compiled or graph-mode code can be harder to interpret than equivalent PyTorch errors, and printing arbitrary intermediate tensors is less ergonomic than in native NumPy or PyTorch code.[^55][^56] Empirical research has also identified gaps in bug localization: a 2021 study of deep neural network faults found that built-in Keras debugging utilities detected most failures but localized very few of them at the layer level, compared with specialized fault-localization tools.[^57]

### Customization limits at the abstraction boundary

For training procedures that require dynamic gradient updates, multiple optimizers, custom backpropagation, or non-standard data flow (for example, advanced [GAN](/wiki/gan) training schedules or some reinforcement-learning algorithms), users must drop down to overriding `train_step` or writing fully custom training loops. While these escape hatches exist, critics argue the cliff between the easy path and the custom path is steep, and that the API can feel constraining for cutting-edge research compared with PyTorch's "everything is just Python" model.[^55][^58] This has reinforced the perception, particularly in academic research circles, that Keras is best suited for applied work rather than novel methods.

### Research adoption lag

Framework-usage analyses have shown that PyTorch overtook TensorFlow/Keras as the preferred framework in academic research around 2018 to 2019, especially among papers at major conferences such as NeurIPS, ICML, and CVPR.[^58] Keras 3 partially addresses this gap by allowing users to keep the Keras API while using JAX or PyTorch underneath, but the research community's familiarity with native PyTorch idioms has remained a structural advantage for PyTorch.

### Discontinued backends

Several backends supported in earlier Keras versions have been discontinued or fallen into disuse:[^11][^12]
- Theano was retired in 2017 after the University of Montreal ended maintenance.
- Microsoft Cognitive Toolkit (CNTK) is no longer actively developed.
- PlaidML, briefly used as a Keras backend for AMD/Intel GPUs, is no longer maintained as a Keras integration.
- The classic standalone multi-backend Keras 2.x line stopped receiving updates after 2020 once development moved to `tf.keras`.

## How does Keras differ from PyTorch?

Keras and [PyTorch](/wiki/pytorch) are two of the most widely used frameworks for [deep learning](/wiki/deep_model), but they take different approaches.[^58][^59]

| Aspect | Keras | PyTorch |
|---|---|---|
| Abstraction level | High-level API | Lower-level framework |
| Ease of use | Beginner-friendly; minimal boilerplate | Requires more code but feels Pythonic |
| Debugging | Relies on backend tools; can be less transparent | Standard Python debugging; eager by default |
| Training loop | Built-in `model.fit()` handles most cases | Manual training loops offer full control |
| Research adoption | Common in applied ML and industry | Dominant in academic research[^58] |
| Cutting-edge models | Available through KerasHub | Most new state-of-the-art models appear first in PyTorch |
| Deployment | Strong TensorFlow ecosystem (TF Serving, TF Lite, TF.js) and ONNX export | TorchServe, TorchScript, ONNX export |
| Backend flexibility | Multi-backend (JAX, TF, PyTorch, OpenVINO) | PyTorch only |
| Performance | Can leverage JAX or XLA for best GPU/TPU performance | Optimized for GPU; mature CUDA support |

**When to use Keras:** Keras is well suited to rapid prototyping, educational use, and small to mid-scale production projects. It is a strong choice when backend flexibility is important or when deploying to mobile and edge devices through TensorFlow Lite or OpenVINO. Teams that want JAX's performance benefits without learning JAX's functional programming model can use Keras as a familiar interface.[^4][^21]

**When to use PyTorch:** PyTorch is preferred for cutting-edge research, when fine-grained control over training dynamics is needed, or when working with models that are primarily published in the PyTorch ecosystem. It is also the standard in most academic labs.[^58]

**Hybrid approach:** Many teams use both frameworks. Keras 3's multi-backend support means that a model written in Keras can run on PyTorch as its backend, and Keras layers can be embedded inside native `torch.nn.Module` definitions, bridging the gap between the two ecosystems.[^4]

## Comparison with other high-level APIs

Keras is not the only high-level wrapper for deep learning. The two most often compared alternatives are PyTorch Lightning and fastai.

| Aspect | Keras | [PyTorch Lightning](/wiki/pytorch_lightning) | fastai |
|---|---|---|---|
| Underlying framework | JAX, TensorFlow, PyTorch, OpenVINO | PyTorch | PyTorch |
| Initial release | 2015 | 2019 | 2018 |
| Created by | Francois Chollet | William Falcon (PyTorch Lightning Inc., now [Lightning AI](/wiki/lightning_ai)) | Jeremy Howard and Rachel Thomas |
| Primary design goal | Progressive disclosure, multi-backend portability | Code organization and scalability for research | Maximally fast results with sensible defaults |
| Training abstraction | `model.fit()` + callbacks | `LightningModule` + `Trainer` | `Learner` + callbacks |
| Hyperparameter tuning | KerasTuner | Native sweeps, optional Optuna/Ray | Built-in learning-rate finder |
| Pretrained model library | KerasHub | TorchVision/TorchHub/torchaudio | fastai pretrained models, plus PyTorch ecosystem |
| Strength | Multi-backend portability, simple `fit` API | Reduces boilerplate while staying close to PyTorch | High-level "best practice by default" for tabular/CV/NLP |
| Weakness | Customization can require dropping to native backend | PyTorch only; smaller pretrained library than Keras | Opinionated abstractions can be hard to escape |

PyTorch Lightning is often described as "Keras for PyTorch": a framework that wraps PyTorch with a structured training loop and callback system, similar in spirit to `model.fit`, but without backend flexibility.[^60] fastai, developed by Jeremy Howard and Rachel Thomas at fast.ai, is a layered API on top of PyTorch that emphasizes "best-practice defaults" and is widely used in the fast.ai deep-learning courses; it is more opinionated than Keras and can require learning fastai-specific abstractions like `DataBunch`/`DataBlock`.[^61]

## Explain like I'm 5 (ELI5)

Imagine you want to build something out of LEGO blocks. You could try to make every single tiny brick yourself from scratch, which would take forever. Or you could use a LEGO kit that already has all the special pieces sorted and labeled, with instructions showing you how to snap them together.

Keras is like that LEGO kit, but for building smart computer programs. Deep learning programs are made of building blocks called "layers" that each do a small job, like looking at pictures or reading words. Keras gives you all these building blocks pre-made, so you just pick the ones you need and snap them together in the right order.

Once you put your blocks together, you "train" your creation by showing it lots of examples (like thousands of pictures of cats and dogs) so it learns to tell them apart. Keras handles all the complicated math behind the scenes. You just say "learn from these examples" and it does the rest.

The best part is that Keras works with several different "engines" underneath (called backends), so it is like having one set of LEGO instructions that works with different brands of building blocks.

## See also

- [TensorFlow](/wiki/tensorflow)
- [PyTorch](/wiki/pytorch)
- [JAX](/wiki/jax)
- [OpenVINO](/wiki/openvino)
- [PyTorch Lightning](/wiki/pytorch_lightning)
- [scikit-learn](/wiki/scikit_learn)
- [Neural network](/wiki/neural_network)
- [Convolutional neural network](/wiki/convolutional_neural_network)
- [Recurrent neural network](/wiki/recurrent_neural_network)
- [LSTM](/wiki/lstm)
- [Transformer](/wiki/transformer)
- [BERT](/wiki/bert)
- [Stable Diffusion](/wiki/stable_diffusion)
- [Transfer learning](/wiki/transfer_learning)
- [AutoML](/wiki/automl)
- [XLA](/wiki/xla)
- [TPU](/wiki/tpu)
- [ImageNet](/wiki/imagenet)
- [Adam optimizer](/wiki/adam_optimizer)
- [Francois Chollet](/wiki/francois_chollet)
- [ARC-AGI](/wiki/arc_agi)
- [Deep learning](/wiki/deep_model)
- [TensorFlow Serving](/wiki/tensorflow_serving)
- [TensorFlow Lite](/wiki/tensorflow_lite)
- [TensorFlow.js](/wiki/tensorflow_js)

## References

[^1]: Chollet, Francois. "Keras: Deep Learning for humans." keras.io. https://keras.io/. Accessed 2026-05-24.
[^2]: "Keras." Wikipedia. https://en.wikipedia.org/wiki/Keras. Accessed 2026-05-24.
[^3]: Chollet, Francois. "Introducing Keras 3.0." keras.io, 2023-11-28. https://keras.io/keras_3/. Accessed 2026-05-24.
[^4]: Keras Team. "Introducing Keras 3.0 (announcement page)." keras.io, 2023-11-28. https://keras.io/keras_3/. Accessed 2026-05-24.
[^5]: Keras Team. "Releases (v3.14.1, May 7, 2026; v3.14.0, April 3, 2026)." GitHub, keras-team/keras. https://github.com/keras-team/keras/releases. Accessed 2026-05-24.
[^6]: Keras Team. "Keras 3.13.0 release notes (Python 3.11+ requirement)." GitHub, 2025. https://github.com/keras-team/keras/releases. Accessed 2026-05-24.
[^7]: Chollet, Francois. "An interview with Francois Chollet." PyImageSearch (Adrian Rosebrock), 2018-07-02. https://pyimagesearch.com/2018/07/02/an-interview-with-francois-chollet/. Accessed 2026-05-24.
[^8]: Bhutani, Sanyam. "Interview with the Creator of Keras, AI Researcher: Francois Chollet." Hacker Noon / DSNet, 2018. https://hackernoon.com/interview-with-the-creator-of-keras-ai-researcher-fran%C3%A7ois-chollet-823cf1099b7c. Accessed 2026-05-24.
[^9]: Chollet, Francois. "About Keras." Keras documentation. https://keras.io/about/. Accessed 2026-05-24.
[^10]: Anandkumar, Anima (for Google). "Farewell and thank you for the continued partnership, Francois Chollet." Google Developers Blog, 2024-11. https://developers.googleblog.com/en/farewell-and-thank-you-for-the-continued-partnership-francois-chollet/. Accessed 2026-05-24.
[^11]: Anderson, Tim. "Another one bites the dust? Keras team steps away from multi-backends, refocuses on tf.keras." DevClass, 2019-09-18. https://devclass.com/2019/09/18/another-one-bites-the-dust-keras-team-steps-away-from-multi-backends-refocuses-on-tf-keras/. Accessed 2026-05-24.
[^12]: Srivastava, Vatsal. "Keras without NVIDIA GPUs with PlaidML (and AMD GPU)." Medium, 2018-09. https://medium.com/@Vatsal410/keras-without-nvidia-gpus-with-plaidml-and-amd-gpu-4ba6f60025ce. Accessed 2026-05-24.
[^13]: Keras Team. "Keras backend configuration (~/.keras/keras.json)." Keras documentation. https://keras.io/getting_started/. Accessed 2026-05-24.
[^14]: Chollet, Francois. "Keras 2.3.0 release: first release of multi-backend Keras with TF 2.0 support; last major release of multi-backend Keras." X (Twitter), 2019-09-18. https://x.com/fchollet/status/1174018651449544704. Accessed 2026-05-24.
[^15]: TensorFlow Team. "TensorFlow 2.0 is now available!" The TensorFlow Blog, 2019-09-30. https://blog.tensorflow.org/2019/09/tensorflow-20-is-now-available.html. Accessed 2026-05-24.
[^16]: Google. "Keras: The high-level API for TensorFlow." TensorFlow Core Guide. https://www.tensorflow.org/guide/keras. Accessed 2026-05-24.
[^17]: Sanderson, Martin. "Standardizing on Keras: Guidance on High-level APIs in TensorFlow 2.0." The TensorFlow Blog, 2018-12-19. https://blog.tensorflow.org/2018/12/standardizing-on-keras-guidance.html. Accessed 2026-05-24.
[^18]: Mehreen, Aisha. "Keras 3.0: Everything You Need To Know." KDnuggets, 2023-07. https://www.kdnuggets.com/2023/07/keras-30-everything-need-know.html. Accessed 2026-05-24.
[^19]: Chollet, Francois. "Announcement: The development of Keras 3 has moved to keras-team/keras and the project has been renamed Keras (from Keras Core)." X (Twitter), 2023-09-21. https://x.com/fchollet/status/1705304039985168461. Accessed 2026-05-24.
[^20]: Chollet, Francois. "Big news: we just released Keras 3.0! Run Keras on top of JAX, TensorFlow, and PyTorch." X (Twitter), 2023-11-28. https://x.com/fchollet/status/1729512791894012011. Accessed 2026-05-24.
[^21]: Keras Team. "Keras 3 benchmarks." keras.io. https://keras.io/getting_started/benchmarks/. Accessed 2026-05-24.
[^22]: Keras Team / Intel. "v3.8.0 release notes: OpenVINO backend available as inference-only Keras backend." GitHub, keras-team/keras, 2025-01. https://github.com/keras-team/keras/releases/tag/v3.8.0. Accessed 2026-05-24.
[^23]: TensorFlow Team. "What's new in TensorFlow 2.16 (Keras 3 default; tf_keras compatibility)." The TensorFlow Blog, 2024-03. https://blog.tensorflow.org/2024/03/whats-new-in-tensorflow-216.html. Accessed 2026-05-24.
[^24]: Chollet, Francois. "Some personal news: I'm leaving Google to go start a new company with a friend." X (Twitter), 2024-11-14. https://x.com/fchollet/status/1857012265024696494. Accessed 2026-05-24.
[^25]: Chollet, Francois. *Deep Learning with Python*, Second Edition. Manning Publications, 2021. ISBN 9781617296864. https://www.manning.com/books/deep-learning-with-python-second-edition. Accessed 2026-05-24.
[^26]: Chollet, Francois, and Matthew Watson. *Deep Learning with Python*, Third Edition. Manning Publications, 2025. ISBN 9781633436589. https://www.manning.com/books/deep-learning-with-python-third-edition. Accessed 2026-05-24.
[^27]: Chollet, Francois. "Xception: Deep Learning with Depthwise Separable Convolutions." Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1800 to 1807, Honolulu, 2017. arXiv:1610.02357. https://arxiv.org/abs/1610.02357. Accessed 2026-05-24.
[^28]: Chollet, Francois. "On the Measure of Intelligence." arXiv:1911.01547, 2019-11-05. https://arxiv.org/abs/1911.01547. Accessed 2026-05-24.
[^29]: ARC Prize Foundation. "ARC Prize." arcprize.org, launched 2024-06. https://arcprize.org/. Accessed 2026-05-24.
[^30]: Patel, Dwarkesh. "Francois Chollet, Mike Knoop: LLMs won't lead to AGI; $1,000,000 Prize to find true solution." Dwarkesh Patel Podcast, 2024-06-12. https://www.dwarkesh.com/p/francois-chollet. Accessed 2026-05-24.
[^31]: Knoop, Mike, and Francois Chollet. "About Ndea." ndea.com, 2025. https://ndea.com/. Accessed 2026-05-24.
[^32]: Bowne-Anderson, Hugo. "An Interview with Francois Chollet." DataCamp, 2017-12. https://www.datacamp.com/blog/an-interview-with-francois-chollet. Accessed 2026-05-24.
[^33]: Keras Team. "About Keras 3 (design principles)." keras.io. https://keras.io/about/. Accessed 2026-05-24.
[^34]: Keras Team. "The Sequential model, The Functional API, and Making new layers and models via subclassing." Keras developer guides. https://keras.io/guides/sequential_model/, https://keras.io/guides/functional_api/, https://keras.io/guides/making_new_layers_and_models_via_subclassing/. Accessed 2026-05-24.
[^35]: Manral, Aishwarya. "Guide: When to use which method: Sequential model, Functional API or Model Subclassing (Keras)." Medium, 2021. https://manralai.medium.com/guide-when-to-use-which-method-sequential-model-functional-api-model-or-model-subclassing-keras-dd1447ab403d. Accessed 2026-05-24.
[^36]: Keras Team. "Keras ops API reference (keras.ops)." keras.io. https://keras.io/api/ops/. Accessed 2026-05-24.
[^37]: Keras Team. "Writing a training loop from scratch in JAX." Keras developer guides. https://keras.io/guides/writing_a_custom_training_loop_in_jax/. Accessed 2026-05-24.
[^38]: Keras Team. "tf.keras.utils.PyDataset / keras.utils.PyDataset API reference." Keras documentation. https://keras.io/api/utils/python_utils/. Accessed 2026-05-24.
[^39]: Keras Team. "Distributed training with Keras 3 (keras.distribution)." Keras developer guides. https://keras.io/guides/distribution/. Accessed 2026-05-24.
[^40]: Keras Team. "Keras Applications." keras.io. https://keras.io/api/applications/. Accessed 2026-05-24.
[^41]: Keras Team. "Keras Layers API." keras.io. https://keras.io/api/layers/. Accessed 2026-05-24.
[^42]: Keras Team. "Training and evaluation with the built-in methods." Keras developer guides. https://keras.io/guides/training_with_built_in_methods/. Accessed 2026-05-24.
[^43]: Keras Team. "Callbacks API." keras.io. https://keras.io/api/callbacks/. Accessed 2026-05-24.
[^44]: Keras Team. "EarlyStopping callback documentation." keras.io. https://keras.io/api/callbacks/early_stopping/. Accessed 2026-05-24.
[^45]: Keras Team. "ModelCheckpoint callback documentation." keras.io. https://keras.io/api/callbacks/model_checkpoint/. Accessed 2026-05-24.
[^46]: Keras Team. "TensorBoard callback documentation." keras.io. https://keras.io/api/callbacks/tensorboard/. Accessed 2026-05-24.
[^47]: Watson, Matthew, et al. "Introducing Keras Hub: Your one-stop shop for pretrained models." Google Developers Blog, 2024-10. https://developers.googleblog.com/en/introducing-keras-hub-for-pretrained-models/. Accessed 2026-05-24.
[^48]: Chollet, Francois. "We're launching KerasHub, a consolidation of KerasNLP and KerasCV into a single unified package." X (Twitter), 2024-10-22. https://x.com/fchollet/status/1848800260115906716. Accessed 2026-05-24.
[^49]: Keras Team. "KerasHub documentation." keras.io. https://keras.io/keras_hub/. Accessed 2026-05-24.
[^50]: Keras Team. "KerasNLP is now KerasHub (issue #1831 on keras-team/keras-hub)." GitHub, 2024-10. https://github.com/keras-team/keras-hub/issues/1831. Accessed 2026-05-24.
[^51]: Keras Team. "KerasTuner documentation." keras.io. https://keras.io/keras_tuner/. Accessed 2026-05-24.
[^52]: Jin, Haifeng, Qingquan Song, and Xia Hu. "AutoKeras: An AutoML Library for Deep Learning." *Journal of Machine Learning Research* 24 (2023), pp. 1 to 6. https://www.jmlr.org/papers/volume24/20-1355/20-1355.pdf. Accessed 2026-05-24.
[^53]: AutoKeras Team. "AutoKeras: AutoML library for deep learning." autokeras.com (first released November 2017). https://autokeras.com/. Accessed 2026-05-24.
[^54]: Lewis, Hayden. "Why Keras is the Leading Deep Learning API." Built In / Towards Data Science, 2020. https://builtin.com/artificial-intelligence/why-keras-leading-deep-learning-api. Accessed 2026-05-24.
[^55]: Verhulsdonck, AltexSoft. "The Good and Bad of Keras Deep Learning API." AltexSoft, 2022. https://www.altexsoft.com/blog/keras-pros-and-cons/. Accessed 2026-05-24.
[^56]: Goswami, Subrata. "The Shortcomings of Keras." Medium, 2019. https://whatdhack.medium.com/the-shortcomings-of-keras-98f8a7eca8ae. Accessed 2026-05-24.
[^57]: Wardat, Mohammad, Wei Le, and Hridesh Rajan. "DeepLocalize: Fault Localization for Deep Neural Networks." Proceedings of the 43rd International Conference on Software Engineering (ICSE), 2021. arXiv:2103.03376. https://arxiv.org/abs/2103.03376. Accessed 2026-05-24.
[^58]: He, Horace. "The State of Machine Learning Frameworks in 2019." The Gradient, 2019. https://thegradient.pub/state-of-ml-frameworks-2019-pytorch-dominates-research-tensorflow-dominates-industry/. Accessed 2026-05-24.
[^59]: DistantJob. "Keras vs PyTorch in 2025: The Comparison." DistantJob, 2025. https://distantjob.com/blog/keras-vs-pytorch/. Accessed 2026-05-24.
[^60]: PyTorch Lightning Team. "PyTorch Lightning documentation (LightningModule and Trainer)." Lightning AI. https://lightning.ai/docs/pytorch/stable/. Accessed 2026-05-24.
[^61]: Howard, Jeremy, and Sylvain Gugger. "fastai: A Layered API for Deep Learning." *Information* 11(2):108, 2020. arXiv:2002.04688. https://arxiv.org/abs/2002.04688. Accessed 2026-05-24.

