Traditionally, software has been programmed using deterministic logic—a mesh of “if else” type statements. The problem with this is that you have to program every piece of decision-making logic ahead of time.
AI uses data to make decisions and learns as the data shifts. It’s more scalable and accurate, if you have sufficient training data. You can think of AI as the math, algorithms, and models that make a machine exhibit intelligent or humanlike behavior.
Machine learning approaches
Let’s say you want to make an AI that tells farmers when to plant seeds.
You could use three different approaches in machine learning.
- Classic: Build a model that takes into account all the variables (soil type, crop, rainfall, and so on), weighs them accordingly, and makes a decision.
- Rule-based or expert systems: Ask a hundred of the best farmers to write down every planting rule they know. Arrange those rules so that someone can enter their relevant variables and the system makes a suggestion based on those rules.
- Artificial neural networks: Feed in all the data about when farmers planted seeds and what the yields were, and let the neural network come up with the rules that maximize yield on its own.
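To make the contrast concrete, here's a minimal sketch in Python. The soil-moisture and soil-temperature features, the thresholds, and the toy outcome data are all invented for illustration; the point is only that the expert system's rules are written by hand, while the learned model infers its rules from past outcomes.

```python
# Approach 2 vs. approach 3, in miniature (toy data, invented thresholds).
from sklearn.tree import DecisionTreeClassifier

# Rule-based / expert system: every rule written down by a human ahead of time.
def expert_system_should_plant(soil_moisture: float, soil_temp_c: float) -> bool:
    if soil_temp_c < 10:       # too cold for germination, says the expert
        return False
    if soil_moisture < 0.2:    # too dry, says the expert
        return False
    return True

# Learned model: the rules are inferred from past conditions and outcomes.
# Each row: [soil_moisture, soil_temp_c]; label 1 = good yield, 0 = poor yield.
past_conditions = [[0.35, 14], [0.10, 18], [0.40, 8], [0.30, 16], [0.15, 9], [0.45, 20]]
past_outcomes = [1, 0, 0, 1, 0, 1]
model = DecisionTreeClassifier().fit(past_conditions, past_outcomes)

today = [[0.33, 15]]
print(expert_system_should_plant(0.33, 15))  # decision from hand-written rules
print(bool(model.predict(today)[0]))         # decision from rules discovered in data
```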
Today, artificial neural networks are what we call “AI.” AI figures out the rules for itself via trial and error. It measures its success based on goals that you’ve specified. The goal could be a list of examples to imitate, a game score to increase, or anything else.
As AI connects the many data points to the desired outcome, it discovers deeply buried patterns and correlations that you didn’t even know existed.
The basic methods of AI are artificial neural networks, deep learning, and reinforcement learning:
1. Artificial neural networks
There are lots of ways to build artificial neural networks, each meant for a particular application. But they’re all loosely modeled after the way the human brain works. That’s why they are called artificial neural networks–their cousins, biological neural networks, are the original, far more complex models.
How does the human brain work?
The human brain is the most complex structure that evolution has produced. The human brain consists of at least 100 billion interconnected nerve cells—neurons—linked by 100 trillion connections.
On average, each neuron has more than a thousand connections to other neurons. These connections are called synapses. Neurons and synapses form an unimaginably complicated network that stores information and makes it retrievable using electrical impulses and biochemical messengers.
A neuron transmits information—or more precisely, an electrical impulse—to the next cell only when the signals from other cells “talking” to it reach a certain threshold within a certain period of time. Otherwise, the neuron stays silent and passes nothing on.
The human brain learns, in a literal sense, through associations and connections. The more often a connection is activated, the more it consolidates the knowledge it has learned, and it corrects what it thought it knew when new input shows that it has wired this information incorrectly. By linking many different connections, it can also increasingly form abstractions.
For example, for us to see, the neurons in our brains pick up the images on our retinas and give them shape and meaning. This process of giving meaning begins in the visual cortex, which identifies lines and edges. This basic sketch is then passed along, as on an assembly line in a factory, to the regions of the brain that add shape, spot shadows, and build up noses, eyes, and faces. The final image is put together using information from our memories and language, which helps us classify and categorize images, for example, into different types of dogs and cats.
The computational theory of the mind
It was Warren McCulloch and Walter Pitts, a pair of cyberneticists who pioneered artificial neural networks, who came up with the computational theory of mind. In the early 1940s, the pair argued that the human mind functioned, on the neural level, much like a Turing machine—an idealized model of a digital computer. Both manipulated symbols based on predetermined rules. Both utilized feedback loops. The all-or-nothing way in which a neuron either fired or did not fire could be conceived of as a kind of binary code that executed logical propositions. For example, if two neurons, A and B, must both fire in order for a third neuron, C, to fire, this could correspond to “If A and B are true, then C is true.”
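A McCulloch-Pitts style neuron is simple enough to sketch in a few lines of code: it “fires” only when the weighted sum of its inputs reaches a threshold. The weights and threshold below are chosen purely for illustration, so that neuron C fires only when both A and B fire, mirroring “If A and B are true, then C is true.”

```python
# A threshold neuron in the McCulloch-Pitts spirit: all-or-nothing output.
def mcculloch_pitts_neuron(inputs, weights, threshold):
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# With weights [1, 1] and threshold 2, C fires only when A and B both fire (logical AND).
for a in (0, 1):
    for b in (0, 1):
        c = mcculloch_pitts_neuron([a, b], weights=[1, 1], threshold=2)
        print(f"A={a}, B={b} -> C={c}")
```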
In their 1943 paper, McCulloch and Pitts proposed that mathematical operations could realize mental functions. McCulloch announced that brains “compute thought the way electronic computers calculate numbers.” His theory was vague when it came to the question of how computation gave rise to the phenomenon of inner experience—the ability to see, to feel, to have the sensation of self-awareness.
While machines can replicate many of the functional properties of cognition (prediction, pattern recognition, proving mathematical theorems), these processes are not accompanied by any first-person experience. The computer is merely manipulating symbols, blindly following instructions without understanding the content of those instructions or the concepts the symbols are meant to represent.
The mechanics of neural networks
Artificial neural networks are imitation brains. They’re built from simple chunks of software, each able to perform very simple math. These chunks are called cells or neurons, by analogy with the neurons that make up our own brains. The power of the neural network lies in how these cells are connected. Like the brain, artificial neural networks process different pieces of information in parallel.
An artificial neural network consists of three kinds of layers: an input layer, one or more middle or hidden layers, and an output layer.
A layer is a densely connected graph of simple computing units. Data is fed into the input layer, analyzed and weighted as it moves through the middle or hidden layers, and then forwarded to the output layer for the result.
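As a minimal sketch, here's a single forward pass through those three kinds of layers using Python and NumPy. The layer sizes are arbitrary and the weights are random; in a real network, training would set the weights based on data.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.random(4)                  # input layer: 4 raw feature values
W1 = rng.standard_normal((8, 4))   # weights connecting input -> hidden (8 hidden neurons)
W2 = rng.standard_normal((2, 8))   # weights connecting hidden -> output (2 output values)

hidden = np.maximum(0, W1 @ x)     # hidden layer: weighted sums, then a simple activation
output = W2 @ hidden               # output layer: the network's result
print(output)
```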
Artificial neural networks develop their own strategies based on the examples they are fed—a process called “training.”
Example: If you want to train a neural network to recognize a cat, you feed it tons and tons of random photos, each one tagged with positive or negative feedback: positive for cats, negative for non-cats. The neural network uses probabilistic techniques to make “guesses” about what it’s seeing in each photo, and these guesses, with the help of the feedback, gradually become more accurate. The network essentially evolves its own internal model of a cat and fine-tunes its performance as it goes.
Basic concepts in artificial neural networks:
- Backpropagation: The error in the network’s output is fed back through the prior layers over and over again to continually adjust the weights, nudging each connection in the direction that reduces the error as the analysis evolves.
- Supervised learning: The training data is first labeled by humans with the correct classification or output value. This pattern-finding process is easier when the data is labeled with that desired outcome—“cat” vs “no cat”; “click” vs “no click”; “won game” vs “lost game.”
- Unsupervised learning: The training data is not classified or labeled in any way before being fed into the system. The system analyzes the data without any prior guidance or specific goal. It finds hidden commonalities within broad sets of complex and varied data.
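The supervised case, with backpropagation adjusting the weights, can be sketched in a few lines of PyTorch. The “photos” below are just random toy feature vectors with made-up cat/not-cat labels; a real setup would use actual labeled images.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
features = torch.randn(100, 16)                  # 100 toy examples, 16 features each
labels = torch.randint(0, 2, (100, 1)).float()   # human-provided labels: 1 = cat, 0 = not cat

model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(50):
    optimizer.zero_grad()
    predictions = model(features)        # forward pass: the network's "guesses"
    loss = loss_fn(predictions, labels)  # compare guesses against the labels
    loss.backward()                      # backpropagation: push the error back through the layers
    optimizer.step()                     # nudge every weight to reduce the error
```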
2. Deep learning
Deep learning emerged from studying single-cell recordings of visual processing within mammalian brains using electrophysiological and optical methods. Apparently, after we are exposed to input, we process information at progressively higher, or “deeper,” levels as time goes on.
Deep neural networks often include many middle or hidden layers, sometimes numbering in the hundreds or even thousands.
Deep neural networks have more layers and so provide more opportunities to transform an initial vector encoding of the input into a series of incrementally more abstract, more concise, and more insightful features.
Example: Convolutional neural networks (convnets) are a specialized form of deep neural networks devoted mainly to vision. To train a convnet, you feed in millions of images from a database like ImageNet, which has more than 14 million images annotated by hand to specify their content. The convnet first picks out the lines and edges of the image. Each layer in succession picks out more and more detail. In the last layer, all of these features are combined to identify what the image shows. The machine does not see the image the way we do; it sees only a set of numbers.
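A minimal sketch of such a network in PyTorch might look like this. The layer sizes are arbitrary and the input is a single random “image”; training it on a dataset like ImageNet would use the same kind of supervised loop sketched earlier.

```python
import torch
import torch.nn as nn

convnet = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # early layer: responds to lines and edges
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # deeper layer: more detailed shapes
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 56 * 56, 1000),                # final layer: scores for 1,000 possible labels
)

image = torch.randn(1, 3, 224, 224)  # to the machine, an image is just a grid of numbers
scores = convnet(image)
print(scores.shape)                  # torch.Size([1, 1000])
```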
3. Reinforcement learning
Reinforcement learning is a method designed to determine which actions maximize future rewards. Reinforcement learning models identify and match cause and effect in a process: they learn which actions lead to which outcomes. This is how humans and animals learn to repeat actions that bring about reward.
Reinforcement learning is similar to unsupervised learning in that the training data isn’t labeled. But when the system draws a conclusion about the data, the outcome is graded or rewarded, and based on that grade the algorithm learns which actions are most valuable.
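As a minimal sketch, here is tabular Q-learning, one of the simplest reinforcement learning algorithms, on a toy five-step corridor where the only reward comes from reaching the right-hand end. The environment, the reward, and the parameters are invented for illustration.

```python
import random

n_states, actions = 5, [-1, +1]            # the agent can step left or right
q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1      # learning rate, discount factor, exploration rate

for episode in range(200):
    state = 0
    while state != n_states - 1:
        # Explore occasionally; otherwise pick the action valued highest so far (trial and error).
        if random.random() < epsilon:
            action = random.choice(actions)
        else:
            action = max(actions, key=lambda a: q[(state, a)])
        next_state = min(max(state + action, 0), n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0   # the graded outcome
        best_next = max(q[(next_state, a)] for a in actions)
        # Move the value of (state, action) toward reward plus discounted future value.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state

# After training, the learned policy is simply to keep moving right.
print({s: max(actions, key=lambda a: q[(s, a)]) for s in range(n_states - 1)})
```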
Latest developments in AI
Today, there is a convergence between neural networks, deep learning, and reinforcement learning. They are often used together, giving rise to fields such as “Deep Reinforcement Learning.”
Neural networks, deep learning, and reinforcement learning don’t capture the whole picture of how machines think and react in a human-like way. AI research has gone much further in pursuit of human-like thinking. These developments include:
- Attention mechanisms: Until recently, most AI models processed an entire image or video frame by assigning equal priority to all of its pixels. Humans process visual information differently. Instead of processing everything at the same time, visual attention moves intelligently around different locations and objects, and each region of an image is processed according to its relevance. By adopting efficient attention strategies for extracting information, AI better replicates the way our brains identify and recognize what we see (a minimal sketch of this weighting idea follows this list).
- Episodic memory: One of the established principles in neuroscience is that intelligent behavior relies on multiple memory systems. We employ different parts of the brain, or distinct memory systems, depending on the type of memory we’re required to access. Imitating the brain requires researchers to develop multiple memory systems, which are accessed and combined when needed. To do this, researchers developed an artificial agent called a deep Q-network (DQN). Their approach combines incremental learning about the value of certain events with instance-based learning of “one-off” events that would normally be stored in our episodic memory. In this way, the learning process is dual. Some experiences are stored in memory and used to slowly adjust the deep network’s optimal policy. Others, those the algorithm identifies as particularly significant, are learned and stored “at once” and make rapid changes in system behavior when a matching situation arises.
- Imagination: In humans, a part of the brain called the hippocampus binds together multiple objects, actions, and agents to create an imagined scenario that is coherent in time and space. Researchers have developed architectures that simulate the different outcomes of possible scenarios. With this, AI can achieve human-like performance in dynamic, adversarial environments.
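To give a flavor of the weighting idea behind attention, here is a minimal sketch in Python and NumPy. The “image regions,” their feature vectors, and the query are all invented for illustration; this shows the general mechanism of scoring regions by relevance and weighting them accordingly, not any specific published model.

```python
import numpy as np

rng = np.random.default_rng(0)
regions = rng.random((9, 32))   # 9 image regions, each summarized by 32 numbers
query = rng.random(32)          # what the model is currently "looking for"

scores = regions @ query / np.sqrt(32)           # relevance score for each region
weights = np.exp(scores) / np.exp(scores).sum()  # softmax: attention weights that sum to 1
focused = weights @ regions                      # a summary dominated by the most relevant regions

print(weights.round(2))  # higher weights mark the regions judged most relevant to the query
```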
When is AI usually successful?
When you have massive amounts of relevant data, a narrow domain, and a specific goal. If you’re short on any one of these, things fall apart.
At its most basic, all AI needs is a goal and a set of data to learn from, and it’s off to the races, whether the goal is to copy examples of loan decisions a human made, predict whether a customer will buy a certain sock, maximize the score in a video game, or maximize the distance a robot can travel. In every scenario, AI uses trial and error to invent rules that will help it reach its goal.
The difference between successful AI problem solving and failure usually has to do with the suitability of the task for an AI solution. And there are plenty of tasks for which AI solutions are more efficient than human solutions.
The narrower the task, the smarter the AI.
This is why AI researchers like to draw a distinction between:
- Artificial narrow intelligence (ANI): AI that takes data from one specific domain and applies it to optimize one specific outcome. It's the kind we have now. A Roomba vacuum cleaner, Siri, and a self-driving car are examples of narrow AI.
- Artificial general intelligence (AGI): AI that can do everything a human can. An AGI could beat you at chess, tell you a story, bake you a cake, describe a sheep, and name three things larger than a lobster. It's the kind you find in books and movies. It’s the stuff of science fiction, and most experts agree that AGI is many decades away from reality.
The big problem in AI
The suggestions that an AI gives you might work, but they’re usually indecipherable to a human. For example, an AI might say “Plant corn on March 12,” and you ask, “Why?” It's impossible to get an answer.
If AI could describe “how” or “why” it came to the conclusion it did, the description wouldn’t be much different from what we would say about our own assessments of information streaming into our brains. We would probably say something like “Well, I thought about it and just concluded what I did—based, I imagine, on all of my prior experiences and this new information available to me.”
Researchers are working on finding out just how AIs make decisions, but in general, it’s hard to discover what an AI’s internal rules actually are.