==Introduction==
[[File:Ann.png|thumb|Figure 1. Input, hidden, and output layers of an ANN. Source: Nahar (2012).]]
[[Neural network]]s are [[machine learning]] [[algorithm]]s [[model]]ed after the structure and function of the human brain, designed to recognize patterns and make decisions based on [[input data]]. An artificial neural network (ANN), or simply neural network (NN), is a massively parallel distributed processor made up of simple, interconnected processing units. It is an information processing paradigm, a computing system inspired by biological nervous systems (e.g. the brain) and the way they process information, in which a large number of highly interconnected processing units work in unison to solve specific problems. An artificial neural network is much smaller in scale than its biological counterpart: a large ANN might have hundreds or thousands of processing units, while a biological nervous system (e.g. a mammalian brain) has billions of neurons <ref name="”1”">Zaytsev, O. (2016). A Concise Introduction to Machine Learning with Artificial Neural Networks. Retrieved from http://www.academia.edu/25708860/A_Concise_Introduction_to_Machine_Learning_with_Artificial_Neural_Networks</ref> <ref name="”2”">Nahar, K. (2012). Artificial Neural Network. COMPUSOFT, 1(2): 25-27</ref> <ref name="”3”">A basic introduction to neural networks. Retrieved from http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html</ref>. Neural networks, as a set of algorithms designed to recognize patterns, interpret data by labeling or clustering raw input. They recognize numerical patterns contained in vectors, into which all real-world data, such as images, sound, text, or time series, must be translated <ref name="”4”">Deeplearning4j. Introduction to deep neural networks. Retrieved from https://deeplearning4j.org/neuralnet-overview.html#introduction-to-deep-neural-networks</ref>.
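For illustration only (the tiny 2x2 "image" below is an assumption, not something taken from the cited sources), the following Python sketch shows how a small grayscale image might be translated into the kind of numerical vector a network accepts as input:
<syntaxhighlight lang="python">
import numpy as np

# A hypothetical 2x2 grayscale "image": each entry is a pixel intensity between 0 and 1.
image = np.array([[0.0, 0.8],
                  [0.5, 1.0]])

# Flattening turns the 2-D image into a 1-D feature vector of length 4,
# which is the kind of numerical input an ANN's input layer expects.
input_vector = image.flatten()
print(input_vector)  # a length-4 vector: 0.0, 0.8, 0.5, 1.0
</syntaxhighlight>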


The simple processing units of an ANN are, in loose terms, the artificial equivalent of their biological counterpart, the neurons. Biological neurons receive signals through synapses. When the signals are strong enough and surpass a certain threshold, the neuron is activated, emitting a signal through the axon that might be directed to another synapse <ref name="”4”" /> <ref name="”5”">Kriesel, D. (2007). A Brief Introduction to Neural Networks. Retrieved from http://www.dkriesel.com</ref> <ref name="”6”">Gershenson, C. (2003). Artificial neural networks for beginners. arXiv:cs/0308031v1 [cs.NE]</ref>. According to Gershenson (2003), the nodes (artificial neurons) “consist of inputs (like synapses), which are multiplied by weights (strength of the respective signals), and then computed by a mathematical function which determines the activation of the neuron. Another function (which may be the identity) computes the output of the artificial neuron (sometimes in dependence of a certain threshold). ANNs combine artificial neurons in order to process information.” <ref name="”6”" />
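In the notation of that description, and leaving the optional threshold term aside, the activation of a single node with inputs <math>x_1, \ldots, x_n</math> and weights <math>w_1, \ldots, w_n</math> can be written as <math>a = f\left(\sum_{i=1}^{n} w_i x_i\right)</math>, where <math>f</math> is the activation function (which may simply be the identity).
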
The nodes (a place where computation happens) are organized in layers. They combine input from the data with weights (or coefficients), as mentioned above, amplifying or dampening the input and thereby assigning significance to inputs for the task the algorithm is trying to learn. The input-weight products are then summed, and the result is passed through a node’s activation function to determine whether, and to what extent, the signal progresses further through the network. This pairing of adjustable weights with input features is how significance is assigned to those features with regard to how the network classifies and clusters input <ref name="”4”" />.
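A minimal sketch of this per-node computation is given below; the particular inputs, weights, and the choice of a sigmoid activation function are illustrative assumptions rather than anything prescribed by the cited sources:
<syntaxhighlight lang="python">
import numpy as np

def sigmoid(z):
    """One common activation function; it squashes the summed input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical inputs and adjustable weights for a single node.
inputs = np.array([0.2, 0.7, 0.1])
weights = np.array([0.9, -0.4, 0.3])

# Multiply each input by its weight, sum the products,
# and pass the result through the node's activation function.
weighted_sum = np.dot(inputs, weights)
activation = sigmoid(weighted_sum)
print(activation)  # a value between 0 and 1 that is passed on to the next layer
</syntaxhighlight>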


A node layer is a row of artificial neurons that turn on or off as the input passes through the net (figure 1). The output of each layer is the subsequent layer’s input. The process starts from an initial input layer that receives the data. The number of input and output nodes in an ANN depends on the problem to which the network is being applied. In contrast, there are no fixed rules as to how many nodes the [[hidden layer]] should have. If it has too few nodes, the network might have difficulty generalizing to problems it has never encountered before; if there are too many nodes, the network may take a long time to learn anything of value <ref name="”4”" />.
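The sketch below illustrates how each layer's output becomes the next layer's input in a small network with an input, a hidden, and an output layer; the layer sizes, random weights, and sigmoid activation are arbitrary choices made for illustration:
<syntaxhighlight lang="python">
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Layer sizes chosen arbitrarily: 3 input nodes, 4 hidden nodes, 2 output nodes.
W_hidden = rng.normal(size=(3, 4))   # weights from the input layer to the hidden layer
W_output = rng.normal(size=(4, 2))   # weights from the hidden layer to the output layer

x = np.array([0.5, -1.2, 0.3])       # the data received by the input layer

# Each layer's output is the subsequent layer's input.
hidden_activations = sigmoid(x @ W_hidden)
output_activations = sigmoid(hidden_activations @ W_output)
print(output_activations)            # two output values, one per output node
</syntaxhighlight>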


An efficient way to solve complex problems is to decompose the complex system into simpler elements in order to understand it; conversely, simple elements can be assembled to produce a complex system. The network structure is one way to achieve this. Even though there are a number of different types of networks, they can all be described in terms of the same components: a set of nodes and the connections between them. The nodes can be seen as computational units that receive inputs and process them to obtain an output. The complexity of this processing can vary: it can be as simple as summing the inputs, or more involved, as when a node itself contains another network. The interactions of the nodes through the connections between them lead to a global behavior of the network that cannot be observed in the single elements that form it. This is called emergent behavior, in which the abilities of the network as a whole supersede those of its constitutive elements <ref name="”6”" />.


==The backpropagation algorithm==
The [[backpropagation]] algorithm is used in layered feedforward ANNs and is one of the most popular training algorithms. The artificial neurons, organized in layers, send their signals “forward”, and the errors are propagated backwards. The algorithm uses supervised learning, in which examples of the inputs and of the outputs the network is intended to compute are provided. The error, which is the difference between actual and expected results, is then calculated. The goal of the backpropagation algorithm is to reduce this error until the ANN learns the training data. Training begins with random weights, and the objective is to adjust them until the error is minimal. In summary, the backpropagation algorithm can be broken down into four main steps: 1) feedforward computation; 2) backpropagation to the output layer; 3) backpropagation to the hidden layer; and 4) weight updates <ref name="”6”" /> <ref name="”8”" />.
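A compact sketch of these four steps is given below for a tiny feedforward network with one hidden layer; the layer sizes, learning rate, bias terms, and toy XOR data are illustrative assumptions and not taken from the cited sources:
<syntaxhighlight lang="python">
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_deriv(a):
    # Derivative of the sigmoid, expressed in terms of its output a.
    return a * (1.0 - a)

# Toy training data (XOR), purely for illustration.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)
W1 = rng.normal(size=(2, 4))   # input -> hidden weights, started at random values
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))   # hidden -> output weights, started at random values
b2 = np.zeros(1)
learning_rate = 0.5

for epoch in range(10000):
    # 1) Feedforward computation: signals travel "forward" through the layers.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Error: difference between expected and actual results.
    error = y - output

    # 2) Backpropagation to the output layer.
    delta_output = error * sigmoid_deriv(output)

    # 3) Backpropagation to the hidden layer.
    delta_hidden = (delta_output @ W2.T) * sigmoid_deriv(hidden)

    # 4) Weight updates (a gradient step scaled by the learning rate).
    W2 += learning_rate * hidden.T @ delta_output
    b2 += learning_rate * delta_output.sum(axis=0)
    W1 += learning_rate * X.T @ delta_hidden
    b1 += learning_rate * delta_hidden.sum(axis=0)

print(np.round(output, 2))  # should move toward [[0], [1], [1], [0]] as the error shrinks
</syntaxhighlight>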
==Explain Like I'm 5 (ELI5)==
Neural networks are computer programs designed to mimic the way the human brain works. Just as our brain contains billions of tiny [[neuron]]s connected by synapses, working together to produce thoughts and decisions, a neural network contains many small components called [[artificial neuron]]s that work together to solve problems.
 
When we learn something new, like how to ride a bike, our brain changes the connections between neurons in order to retain what was learned. In a similar fashion, an artificial neural network can alter its artificial neurons' connections in order to learn from data and make better decisions.
 
For instance, if we want the neural network to recognize photos of cats, we must show it an extensive library of cat images and tell it what a cat looks like. After viewing enough [[examples]], the neural network should be able to accurately recognize a picture of a cat as one, even if it has never seen that particular breed before.
 
Neural networks are like superhuman computer brains that learn and make decisions independently!


==References==
<references />


[[Category:Terms]] [[Category:Machine learning terms]] [[Category:not updated]]