{{see also|Machine learning terms}}
==Introduction==
In [[machine learning]], a [[deep model]] is an artificial [[neural network]] composed of multiple [[layers]], including more than one [[hidden layer]]. These networks are designed to learn representations of [[data]] that become increasingly abstract and complex as the data passes through each layer. Deep models have achieved state-of-the-art performance on various tasks such as [[image recognition|image]] and [[speech recognition]], [[natural language processing]], and game playing.
*A deep model is also known as a [[deep neural network]].
==Background==
[[Artificial neural network]]s (ANNs) are [[machine learning models]] inspired by the structure and function of the human brain. They consist of interconnected nodes, known as [[neuron]]s, organized into layers. Each neuron takes [[input]] from neurons in the previous layer, applies a mathematical [[function]] to it, and produces an [[output]] that is passed on to the next layer.
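
As a rough illustration of what a single neuron computes, the sketch below takes a weighted sum of its inputs, adds a bias, and applies a sigmoid. The function name <code>neuron_output</code>, the choice of sigmoid, and the numbers are assumptions made for this example only; real networks use many different functions.

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, squashed by a sigmoid.

    The sigmoid is only one possible choice of function (an assumption here).
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Two inputs arriving from the previous layer, with illustrative weights.
print(neuron_output([0.5, -1.0], [0.8, 0.2], 0.1))
```

The value printed here would, in a full network, become one of the inputs to every neuron in the next layer.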
Early neural networks, such as the [[perceptron]] and multilayer perceptron, were composed of only one or two layers of neurons. As a result, these models were limited in their capacity to learn complex relationships between inputs and outputs.
Deep models, on the other hand, are distinguished by their depth, that is, how many layers they contain. These models typically have many more layers than early neural networks and may contain tens, hundreds or even thousands of layers. Popular examples of deep models include [[Convolutional Neural Network]]s (CNNs), [[Recurrent Neural Network]]s (RNNs) and [[Deep Belief Network]]s (DBNs).
==Architecture==
Deep models typically consist of an [[input layer]], multiple [[hidden layer]]s and an [[output layer]]. Each layer contains neurons that perform computations on the data they receive. The neurons in the input layer receive raw data, such as images or sequences of words, and pass it to the neurons in the first hidden layer. Each hidden layer then computes on the outputs of the layer before it, so that progressively higher-level [[features]] emerge. Finally, the output layer produces predictions or [[classification]]s based on what the hidden layers have learned.
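
The flow from input layer through hidden layers to output layer can be sketched as a chain of fully connected layers. This is a minimal sketch, not a reference implementation: the helper names (<code>dense</code>, <code>forward</code>), the ReLU activation, the layer sizes and all weight values are assumptions chosen for illustration.

```python
def relu(z):
    """Rectified linear unit: one common activation (an assumption here)."""
    return max(0.0, z)

def identity(z):
    return z

def dense(inputs, weights, biases, activation):
    """One fully connected layer; each row of `weights` belongs to one neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """Pass x through every (weights, biases, activation) layer in order."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

# Illustrative network: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1], relu),
    ([[0.2, 0.5, 0.3], [0.7, 0.1, -0.4]], [0.05, 0.0], relu),
    ([[1.0, -1.0]], [0.0], identity),
]

print(forward([1.0, 2.0], layers))
```

Adding depth in this sketch only means appending more entries to <code>layers</code>; the <code>forward</code> loop itself does not change.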
==Training==
Deep models are typically trained with [[backpropagation]], most commonly in a [[supervised learning]] setting. During [[training]], the [[model]] is presented with [[labeled example]]s (the [[training data]]) and adjusts its [[parameters]] to minimize the difference between its predicted outputs and the true [[label]]s. To do this, backpropagation computes the [[gradient]] of the error with respect to each parameter, and an [[optimization algorithm]] uses those gradients to update the parameters.
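
To make the gradient computation concrete, the sketch below applies the chain rule by hand to a deliberately tiny network with one hidden unit and two weights, then updates the weights with plain gradient descent. The network shape, the tanh activation, the squared-error loss, the learning rate and all names (<code>forward_chain</code>, <code>gradients</code>, <code>train</code>) are assumptions for this example; practical deep models rely on automatic differentiation rather than hand-written derivatives.

```python
import math

def forward_chain(x, w1, w2):
    h = math.tanh(w1 * x)   # hidden activation
    y = w2 * h              # network output
    return h, y

def loss(x, target, w1, w2):
    """Half the squared difference between prediction and label."""
    _, y = forward_chain(x, w1, w2)
    return 0.5 * (y - target) ** 2

def gradients(x, target, w1, w2):
    """Chain rule applied from the output back to the first weight."""
    h, y = forward_chain(x, w1, w2)
    d_y = y - target                 # dL/dy
    d_w2 = d_y * h                   # dL/dw2
    d_h = d_y * w2                   # dL/dh
    d_w1 = d_h * (1.0 - h * h) * x   # dL/dw1, using tanh'(z) = 1 - tanh(z)^2
    return d_w1, d_w2

def train(x, target, w1=0.5, w2=0.5, learning_rate=0.1, steps=200):
    """Plain gradient descent on a single labeled example."""
    for _ in range(steps):
        d_w1, d_w2 = gradients(x, target, w1, w2)
        w1 -= learning_rate * d_w1
        w2 -= learning_rate * d_w2
    return w1, w2

w1, w2 = train(x=1.0, target=0.8)
print("loss before:", loss(1.0, 0.8, 0.5, 0.5))
print("loss after: ", loss(1.0, 0.8, w1, w2))
```

The backward pass reuses the intermediate value <code>h</code> from the forward pass, which is the main reason backpropagation is cheaper than differentiating each parameter independently.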
Training a deep model can be computationally intensive, particularly for large datasets and complex architectures. One popular technique to speed up training is mini-batch [[stochastic gradient descent]], which uses a small, randomly selected subset of the training data for each update to the model's parameters.
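
Mini-batch stochastic gradient descent can be sketched on a much simpler model than a deep network. The example below fits a straight line to noiseless synthetic data, shuffling the examples each epoch and updating the parameters once per mini-batch. The function name <code>minibatch_sgd</code>, the linear model, the batch size, the learning rate and the data are all assumptions for illustration; shuffling and slicing is one common way to draw random subsets, not the only one.

```python
import random

def minibatch_sgd(xs, ys, batch_size=4, learning_rate=0.5, epochs=200, seed=0):
    """Fit y ~ w * x + b with mini-batch stochastic gradient descent."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)           # new random order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                error = (w * xs[i] + b) - ys[i]
                grad_w += error * xs[i]   # gradient of 0.5 * error^2 w.r.t. w
                grad_b += error           # gradient of 0.5 * error^2 w.r.t. b
            # One parameter update per mini-batch, using the average gradient.
            w -= learning_rate * grad_w / len(batch)
            b -= learning_rate * grad_b / len(batch)
    return w, b

# Synthetic data generated from y = 2x + 1 (illustrative values only).
xs = [i / 20 for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
w, b = minibatch_sgd(xs, ys)
print(w, b)
```

Each update here looks at only 4 of the 20 examples, so the cost of a step does not grow with the size of the dataset, which is the property that makes the technique attractive for large training sets.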
==Applications==
Deep models have been widely applied to machine learning tasks such as image and speech recognition, natural language processing, and game playing. They have achieved state-of-the-art performance on several benchmark datasets, such as [[ImageNet]] for [[image classification]], [[MS COCO]] for [[object detection]], and [[LibriSpeech]] for speech recognition. Deep models are also used in natural language processing tasks such as [[machine translation]], [[sentiment analysis]], and [[question and answering|question answering]].
==Explain Like I'm 5 (ELI5)==
A deep model is a type of computer program that learns from [[examples]]. For instance, if you wanted your robot to know the difference between dogs and cats, showing it lots of pictures would help it work out what makes each one different. A deep model works like a brain with many layers: it looks at the pictures over and over again and changes the connections between its neurons so that it gets better at telling dogs and cats apart.
[[Category:Terms]] [[Category:Machine learning terms]]