A neural network is an artificial intelligence system modeled after the structure of the brain. It is made up of neurons that pass signals to each other, much like how the brain passes electrical and chemical signals. In biological neurons, these signals are called "action potentials" – spikes in electrical potential along the cell membrane of a neuron. An action potential either happens fully or not at all – there is no in-between – which is known as the "all-or-nothing principle". The connections between neurons have strengths, a property summarized by the phrase "neurons that fire together, wire together", attributed to the Canadian neuropsychologist Donald Hebb.
The long-term goal of building neural networks is to create an "artificial general intelligence" – a program that can learn anything you or I can learn. Currently, however, neural networks are only good at performing individual tasks, such as classifying images or recognizing speech.

Neurons send signals to each other through strong or weak connections, which can be thought of as "on" or "off". When two neurons are connected strongly, one neuron sending a signal (an action potential) will cause the other neuron to also send a signal, which can then be passed on to further neurons. If the connection between the two neurons is weak, however, the signal may not be strong enough to cause the other neuron to fire. This resembles binary classification in digital computers, where a problem is decided as yes / no, true / false, or 0 / 1. This type of classification is performed by the machine learning algorithm known as logistic regression.

The above image is a visual representation of the logistic regression model. Logistic regression is a machine learning algorithm that takes in a set of inputs, x1, x2, and x3, and produces an output signal, y. This output is a combination of the inputs, weighted by the strength of each input's connection to the output neuron. In this book, we will ignore the bias term, since it can easily be included in the given formula by adding an extra dimension x0 which is always equal to 1.
To calculate y from x, we use the following equation: y = sigmoid(w1*x1 + w2*x2 + w3*x3). The sigmoid function is defined as: sigmoid(x) = 1 / (1 + exp(-x)). When plotted, it looks like a curve with an 'S' shape. Its output lies between 0 and 1, depending on the input x. The output of the logistic regression model is the probability of a certain outcome or event occurring.
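The equation above can be sketched in a few lines of Python. The inputs and weights below are made-up values for illustration only; the bias term is omitted, as discussed above.

```python
import math

def sigmoid(x):
    # squashes any real number into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def logistic_output(x, w):
    # weighted sum of the inputs, passed through the sigmoid:
    # y = sigmoid(w1*x1 + w2*x2 + w3*x3)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

# hypothetical inputs and weights, chosen only for this example
x = [1.0, 2.0, -1.0]
w = [0.5, -0.25, 0.75]
y = logistic_output(x, w)  # a probability between 0 and 1
```

Whatever the weighted sum works out to, the sigmoid guarantees that y lands strictly between 0 and 1, which is what lets us read it as a probability.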

The output of a sigmoid is always between 0 and 1 and is characterized by two asymptotes: the output approaches 1 as the input goes to +infinity, and approaches 0 as the input goes to –infinity, without ever reaching either value. The output is exactly 0.5 when the input is 0. This output can be interpreted as a probability, specifically as P(Y=1 | X) – the probability that Y is equal to 1 given X. This value is commonly referred to as the "activation" of the neuron. To create a neural network, neurons are connected together in a feedforward fashion, where the output of one neuron is used as the input for the next. This connection allows the neurons to pass information between each other, allowing the neural network to process complex data.
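A quick numerical check makes the asymptotes concrete: the sigmoid of 0 is exactly 0.5, while large positive and negative inputs get very close to 1 and 0 without reaching them.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))    # 0.5 exactly
print(sigmoid(10.0))   # close to 1, but still less than 1
print(sigmoid(-10.0))  # close to 0, but still greater than 0
```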

A logistic unit is a basic building block of a neural network. It takes two inputs (x1 and x2) and produces one output (z1). There may be multiple logistic units in a neural network, which together form a "hidden layer". A deeper neural network has more than one hidden layer, and "deep learning" is the term used to describe neural networks with multiple hidden layers. In production systems, neural networks may have tens or hundreds of layers.
To calculate the output Y of a neural network, we use the logistic units as building blocks. We start by calculating the output of each logistic unit (z1, z2, etc.) from its inputs. We then use these outputs as the inputs for the next layer, repeating until we reach the final output of the network, Y.
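The two-step computation above can be sketched as follows, with one hidden layer of two logistic units (z1 and z2) feeding a single output unit. All the weights here are arbitrary values made up for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def logistic_unit(inputs, weights):
    # one logistic unit: weighted sum of inputs through a sigmoid
    return sigmoid(sum(w * xi for w, xi in zip(weights, inputs)))

# hypothetical weights: two inputs, two hidden units, one output
x = [0.5, -1.2]
w_z1 = [0.8, -0.4]   # weights into hidden unit z1
w_z2 = [-0.3, 0.9]   # weights into hidden unit z2
w_y = [1.5, -2.0]    # weights from the hidden layer into Y

# step 1: compute each hidden unit's output from the raw inputs
z1 = logistic_unit(x, w_z1)
z2 = logistic_unit(x, w_z2)

# step 2: feed the hidden outputs forward to get the final output
Y = logistic_unit([z1, z2], w_y)  # again a value in (0, 1)
```

Note that the output unit is just another logistic unit; the only difference is that its inputs are the hidden-layer outputs rather than the raw data.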