Machine learning terms
- See also: Terms and Machine learning
Fundamentals
- See also: Machine learning terms/Fundamentals
- accuracy
- activation function
- artificial intelligence
- AUC (Area under the ROC curve)
- backpropagation
- batch
- batch size
- bias (math) or bias term
- bias (ethics/fairness)
- binary classification
- bucketing
- categorical data
- class
- classification model
- classification threshold
- class-imbalanced dataset
- clipping
- confusion matrix
- continuous feature
- convergence
- DataFrame
- dataset
- deep model
- dense feature
- depth
- discrete feature
- dynamic
- dynamic model
- early stopping
- embedding layer
- epoch
- example
- false negative (FN)
- false positive (FP)
- false positive rate (FPR)
- feature
- feature cross
- feature engineering
- feature set
- feature vector
- feedback loop
- generalization
- generalization curve
- gradient descent
- ground truth
- hidden layer
- hyperparameter
- independently and identically distributed (i.i.d.)
- inference
- input layer
- interpretability
- iteration
- L0 regularization
- L1 loss
- L1 regularization
- L2 loss
- L2 regularization
- label
- labeled example
- lambda
- layer
- learning rate
- linear model
- linear
- linear regression
- logistic regression
- Log Loss
- log-odds
- loss
- loss curve
- loss function
- machine learning
- majority class
- mini-batch
- minority class
- model
- multi-class classification
- negative class
- neural network
- neuron
- node (neural network)
- nonlinear
- nonstationarity
- normalization
- numerical data
- offline
- offline inference
- one-hot encoding
- one-vs.-all
- online inference
- online learning
- output layer
- overfitting
- pandas
- parameter
- positive class
- post-processing
- prediction
- proxy labels
- rater
- Rectified Linear Unit (ReLU)
- regression model
- regularization
- regularization rate
- ReLU
- ROC (receiver operating characteristic) Curve
- Root Mean Squared Error (RMSE)
- sigmoid function
- softmax
- sparse feature
- sparse representation
- sparse vector
- squared loss
- stability
- static
- static inference
- stationarity
- stochastic gradient descent (SGD)
- supervised machine learning
- synthetic feature
- test loss
- training
- training loss
- training-serving skew
- training set
- true negative (TN)
- true positive (TP)
- true positive rate (TPR)
- underfitting
- unlabeled example
- unsupervised machine learning
- validation
- validation loss
- validation set
- weight
- weighted sum
- Z-score normalization
All
- See also: Machine learning terms/All
- A/B testing
- accuracy
- action
- activation function
- active learning
- AdaGrad
- agent
- agglomerative clustering
- anomaly detection
- AR
- area under the PR curve
- area under the ROC curve
- artificial general intelligence
- artificial intelligence
- attention
- attribute
- attribute sampling
- AUC (Area under the ROC curve)
- augmented reality
- automation bias
- average precision
- axis-aligned condition
- backpropagation
- bagging
- bag of words
- baseline
- batch
- batch normalization
- batch size
- Bayesian neural network
- Bayesian optimization
- Bellman equation
- BERT (Bidirectional Encoder Representations from Transformers)
- bias (ethics/fairness)
- bias (math) or bias term
- bigram
- bidirectional
- bidirectional language model
- binary classification
- binary condition
- binning
- BLEU (Bilingual Evaluation Understudy)
- boosting
- bounding box
- broadcasting
- bucketing
- calibration layer
- candidate generation
- candidate sampling
- categorical data
- causal language model
- centroid
- centroid-based clustering
- checkpoint
- class
- classification model
- classification threshold
- class-imbalanced dataset
- clipping
- Cloud TPU
- clustering
- co-adaptation
- collaborative filtering
- condition
- confirmation bias
- confusion matrix
- continuous feature
- convenience sampling
- convergence
- convex function
- convex optimization
- convex set
- convolution
- convolutional filter
- convolutional layer
- convolutional neural network
- convolutional operation
- cost
- co-training
- counterfactual fairness
- coverage bias
- crash blossom
- critic
- cross-entropy
- cross-validation
- data analysis
- data augmentation
- DataFrame
- data parallelism
- data set or dataset
- Dataset API (tf.data)
- decision boundary
- decision forest
- decision threshold
- decision tree
- deep model
- decoder
- deep neural network
- Deep Q-Network (DQN)
- demographic parity
- denoising
- dense feature
- dense layer
- depth
- depthwise separable convolutional neural network (sepCNN)
- derived label
- device
- dimension reduction
- dimensions
- discrete feature
- discriminative model
- discriminator
- disparate impact
- disparate treatment
- divisive clustering
- downsampling
- DQN
- dropout regularization
- dynamic
- dynamic model
- eager execution
- early stopping
- earth mover's distance (EMD)
- embedding layer
- embedding space
- embedding vector
- empirical risk minimization (ERM)
- encoder
- ensemble
- entropy
- environment
- episode
- epoch
- epsilon greedy policy
- equality of opportunity
- equalized odds
- Estimator
- example
- experience replay
- experimenter's bias
- exploding gradient problem
- fairness constraint
- fairness metric
- false negative (FN)
- false negative rate
- false positive (FP)
- false positive rate (FPR)
- feature
- feature cross
- feature engineering
- feature extraction
- feature importances
- feature set
- feature spec
- feature vector
- federated learning
- feedback loop
- feedforward neural network (FFN)
- few-shot learning
- fine tuning
- forget gate
- full softmax
- fully connected layer
- GAN
- generalization
- generalization curve
- generalized linear model
- generative adversarial network (GAN)
- generative model
- generator
- Gini impurity
- GPT (Generative Pre-trained Transformer)
- gradient
- gradient boosting
- gradient boosted (decision) trees (GBT)
- gradient clipping
- gradient descent
- graph
- graph execution
- greedy policy
- ground truth
- group attribution bias
- hallucination
- hashing
- heuristic
- hidden layer
- hierarchical clustering
- hinge loss
- holdout data
- hyperparameter
- hyperplane
- i.i.d.
- image recognition
- imbalanced dataset
- implicit bias
- incompatibility of fairness metrics
- independently and identically distributed (i.i.d.)
- individual fairness
- inference
- inference path
- information gain
- in-group bias
- input layer
- in-set condition
- instance
- interpretability
- inter-rater agreement
- intersection over union (IoU)
- IoU
- item matrix
- items
- iteration
- Keras
- keypoints
- Kernel Support Vector Machines (KSVMs)
- k-means
- k-median
- L0 regularization
- L1 loss
- L1 regularization
- L2 loss
- L2 regularization
- label
- labeled example
- LaMDA (Language Model for Dialogue Applications)
- lambda
- landmarks
- language model
- large language model
- layer
- Layers API (tf.layers)
- leaf
- learning rate
- least squares regression
- linear model
- linear
- linear regression
- logistic regression
- logits
- Log Loss
- log-odds
- Long Short-Term Memory (LSTM)
- loss
- loss curve
- loss function
- loss surface
- LSTM
- machine learning
- majority class
- Markov decision process (MDP)
- Markov property
- masked language model
- matplotlib
- matrix factorization
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- metric
- meta-learning
- Metrics API (tf.metrics)
- mini-batch
- mini-batch stochastic gradient descent
- minimax loss
- minority class
- ML
- MNIST
- modality
- model
- model capacity
- model parallelism
- model training
- Momentum
- multi-class classification
- multi-class logistic regression
- multi-head self-attention
- multimodal model
- multinomial classification
- multinomial regression
- NaN trap
- natural language understanding
- negative class
- neural network
- neuron
- N-gram
- NLU
- node (neural network)
- node (TensorFlow graph)
- node (decision tree)
- noise
- non-binary condition
- nonlinear
- non-response bias
- nonstationarity
- normalization
- novelty detection
- numerical data
- NumPy
- objective
- objective function
- oblique condition
- offline
- offline inference
- one-hot encoding
- one-shot learning
- one-vs.-all
- online
- online inference
- operation (op)
- out-of-bag evaluation (OOB evaluation)
- optimizer
- out-group homogeneity bias
- outlier detection
- outliers
- output layer
- overfitting
- oversampling
- pandas
- parameter
- Parameter Server (PS)
- parameter update
- partial derivative
- participation bias
- partitioning strategy
- perceptron
- performance
- permutation variable importances
- perplexity
- pipeline
- pipelining
- policy
- pooling
- positive class
- post-processing
- PR AUC (area under the PR curve)
- precision
- precision-recall curve
- prediction
- prediction bias
- predictive parity
- predictive rate parity
- preprocessing
- pre-trained model
- prior belief
- probabilistic regression model
- proxy (sensitive attributes)
- proxy labels
- Q-function
- Q-learning
- quantile
- quantile bucketing
- quantization
- queue
- random forest
- random policy
- ranking
- rank (ordinality)
- rank (Tensor)
- rater
- recall
- recommendation system
- Rectified Linear Unit (ReLU)
- recurrent neural network
- regression model
- regularization
- regularization rate
- reinforcement learning (RL)
- ReLU
- replay buffer
- reporting bias
- representation
- re-ranking
- return
- reward
- ridge regularization
- RNN
- ROC (receiver operating characteristic) Curve
- root
- root directory
- Root Mean Squared Error (RMSE)
- rotational invariance
- sampling bias
- sampling with replacement
- SavedModel
- Saver
- scalar
- scaling
- scikit-learn
- scoring
- selection bias
- self-attention (also called self-attention layer)
- self-supervised learning
- self-training
- semi-supervised learning
- sensitive attribute
- sentiment analysis
- sequence model
- sequence-to-sequence task
- serving
- shape (Tensor)
- shrinkage
- sigmoid function
- similarity measure
- size invariance
- sketching
- softmax
- sparse feature
- sparse representation
- sparse vector
- sparsity
- spatial pooling
- split
- splitter
- squared hinge loss
- squared loss
- stability
- staged training
- state
- state-action value function
- static
- static inference
- stationarity
- step
- step size
- stochastic gradient descent (SGD)
- stride
- structural risk minimization (SRM)
- subsampling
- summary
- supervised machine learning
- synthetic feature
- tabular Q-learning
- target
- target network
- temporal data
- Tensor
- TensorBoard
- TensorFlow
- TensorFlow Playground
- TensorFlow Serving
- Tensor Processing Unit (TPU)
- Tensor rank
- Tensor shape
- Tensor size
- termination condition
- test
- test loss
- test set
- tf.Example
- tf.keras
- threshold (for decision trees)
- time series analysis
- timestep
- token
- tower
- TPU
- TPU chip
- TPU device
- TPU master
- TPU node
- TPU Pod
- TPU resource
- TPU slice
- TPU type
- TPU worker
- training
- training loss
- training-serving skew
- training set
- trajectory
- transfer learning
- Transformer
- translational invariance
- trigram
- true negative (TN)
- true positive (TP)
- true positive rate (TPR)
- unawareness (to a sensitive attribute)
- underfitting
- undersampling
- unidirectional
- unidirectional language model
- unlabeled example
- unsupervised machine learning
- uplift modeling
- upweighting
- user matrix
- validation
- validation loss
- validation set
- vanishing gradient problem
- variable importances
- Wasserstein loss
- weight
- Weighted Alternating Least Squares (WALS)
- weighted sum
- wide model
- width
- wisdom of the crowd
- word embedding
- Z-score normalization