
101 Artificial Intelligence and Machine Learning

intrinsical

Commodore
In another thread, I was talking about how most people have heard of Artificial Intelligence but have no idea what it actually is, other than that it's some kind of hand-wavy techno-magic that makes computers smart.

Since I have a rather deep interest in the topic of AI (I have a Masters in Machine Learning, was a PhD candidate for a while, and was a teaching assistant for Coursera's Stanford Machine Learning class for two years), I volunteered to provide a quick "Dummy's Guide" to Artificial Intelligence. It is a huge topic, so I'm going to take it slowly with multiple posts over the next few weeks, but ultimately I hope to be able to answer these questions: What is AI? How does it work? What can it do? What can't it do?

By the way, feel free to ask any question. If I can answer it, I'll try. For now, I'll start by demystifying AI and briefly cover the history of AI.

Demystifying Artificial Intelligence
Just like Physicists who occasionally declare to news reporters things like "We're unlocking the mysteries of the universe" or "The God Particle has been discovered!", we Artificial Intelligence researchers are guilty of glorifying our field of study. Anyone who has made an in-depth study of AI knows that our computers are neither smart nor intelligent. Really, a computer is just a glorified calculator. Yet when it comes to reporters, we still say inane things like, "Look at how smart and intelligent our AI is".

We may one day actually create an AI that can truly think and reason all by itself. For now, however, computers are still dumb. Remember, it is always a smart human that makes the dumb computer appear to do smart things. One day this might change and computers may become truly intelligent. That is the holy grail of AI researchers, but for at least the next decade (or two) I don't believe things will change.


To be continued...
 
Perhaps it isn't necessarily a bad thing, having at least another two decades to wait for the actual creation of an artificial intelligence.

Are we really ready, as mankind, to essentially give birth to what is basically an artificial sentient life form? Something that can think for itself?
 
Strong AI has been "a decade or two" away since like, the '60s. ;)

No offense, intrinsical! I look forward to seeing more of your posts on this! :)
 
Strong AI has been "a decade or two" away since like, the '60s. ;)

No offense, intrinsical! I look forward to seeing more of your posts on this! :)

I was being gentle when I said a decade or two. In all honesty, there is no way to predict when it will happen as it depends on us discovering or inventing the mechanics of reasoning.

Right now, no one even has an inkling as to how the reasoning and thinking process actually works. Once someone has a clue, then I can tell you with some certainty that we will have a rudimentary thinking machine in under a decade. Because that is when all the mechanisms of science and engineering can kick in and we can experiment and twiddle and refine it into... something.
 
The First Era: Logic-Driven Artificial Intelligence
Seen from the highest level, Artificial Intelligence has been successfully created using either Logic or Statistics (or, more rarely, a combination of both). In the early years, from the 1930s to the early 1990s, Logic was the predominant method. This was the era of rule-based Expert Systems, best exemplified by chess-playing computers like IBM's Deep Blue, which beat Kasparov. The process of creating a Logic-based expert system is a tedious one. First, human experts on the topic are needed. The experts are interviewed in an attempt to elicit their knowledge and know-how. Next, an AI practitioner has to understand the knowledge imparted by the experts and encode it as logic rules. It is a mistake-prone process and the slightest error may result in a non-functioning system. As you can imagine, it often takes years to go from conception to realization of such a system.
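For the curious, here is a rough sketch in Python of what hand-coded logic rules look like. The rules and "diagnoses" here are made up purely for illustration and are not taken from any real expert system:

```python
# A toy rule-based "expert system". Every rule was typed in by a human who
# first had to coax the knowledge out of a human expert. (Illustrative only.)

def diagnose(symptoms):
    """Walk a fixed list of hand-coded logic rules and return a diagnosis."""
    if "fever" in symptoms and "cough" in symptoms:
        return "possible flu"
    if "fever" in symptoms and "rash" in symptoms:
        return "possible measles"
    if "cough" in symptoms:
        return "possible cold"
    return "no rule matched - go back and interview the expert again"

print(diagnose({"fever", "cough"}))   # -> possible flu
print(diagnose({"headache"}))         # -> no rule matched
```

Now imagine writing and debugging thousands of rules like these by hand, and you can see why these systems took years to build.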

For example, IBM's Deep Blue took seven years of research and hard work. One of my university professors told me the story of her postgraduate days in the 80s, when she was part of a group of computer scientists and linguists whose task was to encode the rules of the English language into an expert system. They had expected it to take roughly a year and involve no more than two to three hundred rules of English. A decade and thousands of rules later, it still wasn't anywhere near completion. (Side note: linguists eventually figured out that each and every word in English has a set of 3 to 20+ rules attached. Since there are over half a million words in English, that means there are potentially millions of rules.)

By the late 1980s, AI practitioners started looking for other, faster ways of creating AI. One of the ideas that stuck was this: instead of getting a human to manually key in logic rules, maybe we could get the computer to learn the rules automatically. It is impossible for the computer to induce knowledge directly from a human expert, so instead let's use a different source of knowledge - data.
 
Strong AI has been "a decade or two" away since like, the '60s. ;)

No offense, intrinsical! I look forward to seeing more of your posts on this! :)

I was being gentle when I said a decade or two. In all honesty, there is no way to predict when it will happen as it depends on us discovering or inventing the mechanics of reasoning.

Right now, no one even has an inkling as to how the reasoning and thinking process actually works. Once someone has a clue, then I can tell you with some certainty that we will have a rudimentary thinking machine in under a decade. Because that is when all the mechanisms of science and engineering can kick in and we can experiment and twiddle and refine it into... something.

Are we seeing a noticeable rise in Neuroscientists crossing disciplines with AI research or vice versa?
 

Are we seeing a noticeable rise in Neuroscientists crossing disciplines with AI research or vice versa?

Neuroscience is a booming field these days, thanks to fMRI and other tools that make research on the brain possible. I also know that since the 1970s, neuroscience researchers have used computers to try and simulate neurons and parts of the brain. Couple that with the fact that AI researchers have, over the past decade, been taking a serious re-look at neural networks, and I can presume there is a healthy bit of crossover these days.

I also know of a fledgling European research project, called Blue Brain, that wants to simulate the whole brain, but it's still very much in its infancy. In fact, my feeling is that the project's chances of success are about as good as us sending a manned spacecraft out of the solar system using current technology. Still, the project should groom quite a few budding neuroscientists.
 
Intrinsical - New 3D bioprinter to reproduce human organs, change the face of healthcare: The inside story

Researchers are only steps away from bioprinting tissues and organs to solve a myriad of injuries and illnesses. TechRepublic has the inside story of the new product accelerating the process.

http://www.techrepublic.com/article/...-human-organs/

Taken from the Bio Printing post - if bio-printing is able to reproduce human organs, then the process should also be able to reproduce human skin. Basically, the bio-printed human skin would work like this: the outer layer would resemble our skin, while the very bottom layer would be a silicate base built with circuitry similar to a computer circuit board. On the circuit board would be thin wires, fiber optics or hairs that record temperature based on the amount of photons collected by each fiber-optic hair. That information would then be analyzed by a primary chip in the body cavity, and the reading of how hot or cold it is would be sent to the AI's brain, where a basic input/output logic program would decide whether it should put clothes on or take them off.

For example, let's say the temperature outside is 45 degrees. The AI steps out of a module without any clothes on and registers that the temperature is 45 degrees. It would then decide whether to put clothes on depending on how long it would be traveling. Another factor in whether the AI would put clothes on would be whether it was traveling on foot or in a vehicle heated to the same temperature as the module it just left.

A basic test would be to build human skin using bio-printing and then place the skin on a flat circuit board with a single fiber-optic thread to conduct testing with.

Arduino would be a good system to use, as the costs involved would be much lower than using a more elaborate circuit board and processing system.
 
My apologies for being a little slow with posting this week; I had to spend a little time finding a suitable dataset and generating a couple of images to illustrate some AI principles.

The Modern Era: Data-Driven Artificial Intelligence
Since the 1990s, the main way of creating Artificial Intelligence has been to get it to learn automatically from data. The sub-field of AI that specializes in learning from data is often called Machine Learning. In general, Machine Learning techniques work by trying to find patterns in the data itself, for example by finding out whether the data clumps together in specific regions, or by figuring out its geometric shape. And yes, that's essentially what modern "smart" and "intelligent" Artificial Intelligence is: algorithms that are good at identifying patterns in data.

So what do I mean by finding patterns in the data? I will illustrate with an example from a real-world dataset. The dataset I have selected is one the National Institute of Diabetes and Digestive and Kidney Diseases collected in the 1990s on 768 female Native American patients who were suspected of having diabetes. The dataset includes a few attributes about each patient, such as their age and number of pregnancies. It also includes a couple of measurements and test results, such as each person's body mass index, triceps skin fold thickness, and the results of several glucose and insulin level tests.

For now, let's just focus on two attributes, Age and Body Mass Index. I have plotted these two attributes in a chart. A person's age is on the horizontal axis and their body mass index is on the vertical axis. Blue crosses are people without diabetes and orange crosses represent people with diabetes.

[Image: scatter plot of Age (horizontal) vs. Body Mass Index (vertical); blue crosses = no diabetes, orange crosses = diabetes]
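(For anyone who wants to reproduce a chart like this, here is a minimal Python sketch. The file name and column names are my own assumptions about how the dataset might be stored, so adjust them to match your copy of the data.)

```python
# A minimal sketch of plotting Age vs. Body Mass Index, coloured by diagnosis.
# Assumes a local CSV with columns "Age", "BMI" and "Outcome" (1 = diabetes).
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("diabetes.csv")          # hypothetical file name
healthy = data[data["Outcome"] == 0]
diabetic = data[data["Outcome"] == 1]

plt.scatter(healthy["Age"], healthy["BMI"], marker="x", color="blue", label="No diabetes")
plt.scatter(diabetic["Age"], diabetic["BMI"], marker="x", color="orange", label="Diabetes")
plt.xlabel("Age")
plt.ylabel("Body Mass Index")
plt.legend()
plt.show()
```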


There are a couple of things we can infer from the chart. One is that the datapoints are not evenly distributed throughout the entire chart. This is what I mean when I say the data clumps together; more technically, we call it a clustering of the data. When data clusters together like this, we know it is not random (random datapoints would spread out evenly) and therefore it is more likely to be an artifact of some kind of process. Another observation we can make is that the blue and orange clusters are slightly different. You might notice that there are almost no orange points in the lower left corner of the chart. These are young people with low body mass index and, as expected, very few of them have diabetes.

Another thing you might notice is that there is quite a bit of overlap between the blue and orange points. This is a natural occurrence in most of the real-world data we use in AI. After all, the real world is a messy and noisy place that doesn't let us easily put things into clear boxes. This does have an implication for what we can do with our AI: we cannot expect the AI to be 100% correct all the time. It's just the nature of the beast. The only time we will get an AI that is 100% accurate is when the blue and orange datapoints form clearly separate clusters, and in such a case we clearly would not need an AI to tell us who has diabetes and who doesn't.

Of course, for this to be an Artificial Intelligence, it needs to figure out all of the above for itself. More importantly, it needs to be able to DO something based on the data we have given it. However, that's all I will say for now. In the next post, I will cover in more detail how we can use this dataset to create an Artificial Intelligence that is capable of deciding whether someone has diabetes.
 
Machine Learning Task: Classification
One of the most commonly used AI techniques these days also happens to be the simplest to understand conceptually: the Machine Learning task of Classification. In Classification, the goal is to create an AI classifier that, when given a set of input data, is able to select a decision out of a small set of possible answers. The set of decisions can be as simple as a binary decision (Yes/No, True/False, Black/White), or it can be something akin to a multiple choice question with several possible decisions to pick from. For example, an optical character recognition AI might use the set of all letters and digits (a, b, c, ... z, 0, 1, ... 9).

The way to create an AI classifier is to create a decision boundary. In the case of a simple binary (Yes/No) classifier, the decision boundary is literally a line that separates the "Yes" region(s) from the "No" region(s). Using the diabetes dataset as an example, here's a decision boundary that tries to separate the people with diabetes from the people without diabetes.

[Image: the Age/BMI chart split by a hand-picked decision boundary at BMI 23.5; blue region = no diabetes, orange region = diabetes]


Here, I manually picked a bad line to be the decision boundary, purely as an example. The blue region indicates where the classifier believes people do not have diabetes and the orange region where it believes people have diabetes. Roughly, the decision boundary rule I used was that anyone with a Body Mass Index below 23.5 does not have diabetes and anyone above 23.5 BMI has diabetes.

So imagine a slightly obese patient comes into the clinic. The nurse asks the patient's age (20 years old) and takes a reading of his Body Mass Index (40). The nurse keys this info into the computer and, since 40 BMI is much greater than the decision boundary of 23.5 BMI, the AI reports that the patient is diabetic. *facepalms* As you can see, it isn't doing a good job of classifying diabetic people.

So let's try a very simple AI that fits a straight line to the diabetes dataset. In this case, we have two X variables (BMI and Age), so instead of the usual (m * x) + c = 0, we have (m1 * x1) + (m2 * x2) + c = 0. I hope you can see that this is still the formula for a straight line. Anyway, after fitting this formula to the diabetes dataset, I got the decision boundary (-0.0983 * BMI) + (-0.0456 * Age) + 5.4038 = 0. Visually, this is the decision boundary:

[Image: the Age/BMI chart split by the fitted straight-line decision boundary]


With this decision boundary, it correctly classified 523 out of 768 datapoints. In terms of percentage, it is correct 68.1% of the time.
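(If you want to try this yourself, here is a rough sketch using scikit-learn's LogisticRegression. The exact coefficients and accuracy depend on the library's solver and settings, so don't expect to reproduce my numbers exactly.)

```python
# Fit a straight-line (logistic regression) classifier to BMI and Age,
# then check how many of the 768 datapoints it classifies correctly.
import pandas as pd
from sklearn.linear_model import LogisticRegression

data = pd.read_csv("diabetes.csv")               # hypothetical file name
X = data[["BMI", "Age"]]
y = data["Outcome"]                              # 1 = diabetes, 0 = no diabetes

model = LogisticRegression(max_iter=1000)
model.fit(X, y)

print("Boundary: (%.4f * BMI) + (%.4f * Age) + %.4f = 0"
      % (model.coef_[0][0], model.coef_[0][1], model.intercept_[0]))
print("Accuracy on the dataset: %.1f%%" % (100 * model.score(X, y)))
```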

There is no rule that says the decision boundary has to be a straight line, so let's see what happens when I try a different AI algorithm that generates a more complicated decision boundary. I applied a decision tree algorithm to the diabetes dataset and, after some number crunching, it gave me the following small set of rules (also written out as a short code sketch just after the list):

1) If BMI <= 27.8, then Diabetes = FALSE
2) If BMI > 27.8 AND Age > 30, then Diabetes = TRUE
3) If BMI > 27.8 AND Age <= 30 AND BMI <= 41.5, then Diabetes = FALSE
4) If BMI > 27.8 AND Age <= 30 AND BMI > 41.5, then Diabetes = TRUE
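Written out as code, the learned "model" is literally just these four if-tests; this is my own direct translation of the rules above:

```python
# The decision tree's four rules, translated directly into a Python function.
def has_diabetes(bmi, age):
    if bmi <= 27.8:
        return False          # rule 1
    if age > 30:
        return True           # rule 2 (BMI is already known to be > 27.8 here)
    if bmi <= 41.5:
        return False          # rule 3
    return True               # rule 4

print(has_diabetes(bmi=40, age=20))   # False: aged 30 or under, BMI between 27.8 and 41.5
print(has_diabetes(bmi=32, age=45))   # True: BMI above 27.8 and older than 30
```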

Visually, the rules correspond to the following decision boundary:

[Image: the Age/BMI chart split by the decision tree's rules]


It is a slightly more complicated decision boundary, but it gives a correct diagnosis 68.5% of the time. Just 0.4% better than the straight-line AI. So how far can we push the accuracy? I tried several other Classification algorithms, including some cutting-edge ones. Unfortunately, their decision boundaries were too complicated for me to plot, but I can report that the best accuracy I managed to obtain was 69.4% (that's 533 correct classifications versus the straight-line AI's 523). Compared with the straight-line AI, that is just 1.3% better.

Are you surprised by this? Why do you think this is so?
 
What you gained by raising the leftmost boundary, you lost by lowering the rest, so it was almost a wash.

I'm going to guess a curve would be much more appropriate for this problem?
 
What you gained by raising the leftmost boundary, you lost by lowering the rest, so it was almost a wash.

I'm going to guess a curve would be much more appropriate for this problem?

The shape of the decision boundary (though more formally we refer to them as models, not shapes) certainly matters, but not as much as you might think. In this case a curved model does help, but only by a tiny bit. Using a curve, I managed to train a classifier with an accuracy of 69.7%. That's merely two more correct classifications than the previous best classifier.

In fact, it has been shown repeatedly that in most cases, replacing a bad model with a good model results in no more than a 5% to 10% improvement in accuracy. As you've observed, changing models often means gaining in one region but losing out in another.

The main issue lies not in the model we use, but in the input data itself. You can think of it as garbage in, garbage out if you like. Looking back at the diabetes dataset, you can see there is so much overlap between non-diabetics and diabetics that no model is going to fare much better than the rest.

The trick, then, is to change the data. For example, instead of trying to predict who has diabetes using just the patient's Age and Body Mass Index, we can add in the data from the insulin and glucose tests. Doing that, I managed to get a jump to 78% accuracy.
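(Here is a rough sketch of what that looks like in code, reusing the straight-line classifier from before. The column names for the extra tests are my assumption about the dataset, and the exact accuracies you get will vary with the settings used.)

```python
# "Changing the data": give the same kind of classifier more informative columns.
import pandas as pd
from sklearn.linear_model import LogisticRegression

data = pd.read_csv("diabetes.csv")               # hypothetical file name
y = data["Outcome"]

feature_sets = {
    "BMI + Age":         data[["BMI", "Age"]],
    "BMI + Age + tests": data[["BMI", "Age", "Glucose", "Insulin"]],
}

for name, X in feature_sets.items():
    model = LogisticRegression(max_iter=1000).fit(X, y)
    print("%-18s accuracy: %.1f%%" % (name, 100 * model.score(X, y)))
```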
 
Ah, so what you're saying is that the sorting accuracy varies depending on whether you are using variables you shouldn't or are missing variables you should include. That makes sense.
 
Ah, so what you're saying is that the sorting accuracy varies depending on whether you are using variables you shouldn't or are missing variables you should include. That makes sense.

As Vulcans often say... Indeed.

In fact, I would say that close to 80-90% of my work in AI is spent obtaining data, analyzing data, cleaning data and transforming data into a shape that is easy to classify.

Since the majority of AI these days involves this type of decision-boundary-finding algorithm, my next post will go a little more in-depth into the various types of such algorithms. The algorithms I will cover include Logistic Regression, Decision Trees, Neural Networks, Support Vector Machines and a few "ensemble" algorithms. Each algorithm employs a different method of finding decision boundaries, but the end result is the same: the creation of some kind of decision boundary (or multiple boundaries) that separates the dataset into different regions.
 
Logistic Regression http://en.wikipedia.org/wiki/Logistic_regression
This is the formal name for the straight-line decision boundary learning algorithm. It is important for being one of the earliest Classification algorithms. It is also closely related to the Perceptron and is a basic building block of Neural Networks (more on this later).

Up to now, I have presented Logistic Regression as a simple straight-line decision boundary with a sharp edge: one side of the boundary has the value 0 and the other side has the value 1. Plotted in 3D, the decision boundary would look something like this. Think of the x-axis as Age and the y-axis as Body Mass Index. The z-axis is the decision of whether the patient is non-diabetic (value of 0) or diabetic (value of 1).
[Image: 3D plot of a hard, step-shaped decision surface]


However, when a datapoint lies very close to the decision boundary, there is a good chance we have classified it wrongly. To represent this uncertainty, we can use a number between 0 and 1. For example, for points that lie close to the decision boundary we can use a value like 0.6 or 0.4, and for points far from the decision boundary we can use values like 0.9999 or 0.1. This creates a smoothed decision surface, as seen below. This is what the Logistic Regression decision boundary actually looks like.

[Image: 3D plot of the smoothed (sigmoid-shaped) Logistic Regression decision surface]
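(The smoothing is done by the logistic, or sigmoid, function. Here is a small sketch using the straight line fitted earlier in the thread: points on the boundary come out as 0.5 and points far from it come out close to 0 or 1. Which class ends up near 1 depends on how the labels were coded when the model was fitted.)

```python
# Squash the straight line's output into a value between 0 and 1.
import math

def decision_surface(bmi, age):
    z = (-0.0983 * bmi) + (-0.0456 * age) + 5.4038   # the fitted straight line
    return 1.0 / (1.0 + math.exp(-z))                # logistic (sigmoid) squashing

print(decision_surface(bmi=25, age=30))   # well on one side of the boundary
print(decision_surface(bmi=55, age=60))   # well on the other side
```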
 
Neural Networks http://en.wikipedia.org/wiki/Artificial_neural_network
Neural Networks may sound very mysterious, but the idea behind them is actually very simple. A Neural Network is just a classification algorithm, and like all classification algorithms, it creates a decision boundary that separates the datapoints in a dataset into different regions.

We have seen that the straight-line classifier, Logistic Regression, is only able to create a decision boundary using a single straight line. A Neural Network creates a more complicated decision boundary by summing together two or more straight lines.

For example, with two straight lines we can create decision boundaries like the ones below.

[Images: two decision boundaries, each built from two straight lines]


Just using two straight lines, it is possible to create decision boundaries that look like a ridge or a valley.

[Images: ridge- and valley-shaped decision boundaries built from two straight lines]


With four straight lines, we can create even more complicated decision boundaries.

[Images: more complicated decision boundaries built from four straight lines]


A Neural Network in its simplest form is a classification algorithm that uses multiple straight lines to create a decision boundary. To complicate things, we AI practitioners refer to each straight line in the Neural Network as a Neuron. So a Neural Network with 4 neurons creates a decision boundary from 4 straight lines summed together, a 10-neuron Neural Network creates a decision boundary from 10 straight lines, and so on.
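As a sketch, here is a tiny Neural Network in exactly the "sum of straight lines" sense I described. All the weights here are made-up numbers, purely for illustration; a real network would learn them from data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x1, x2, w1, w2, b):
    # One neuron = one straight line (w1*x1 + w2*x2 + b), squashed to (0, 1).
    return sigmoid(w1 * x1 + w2 * x2 + b)

def tiny_network(x1, x2):
    h1 = neuron(x1, x2,  1.0, -1.0,  0.5)            # first straight line
    h2 = neuron(x1, x2, -0.5,  2.0, -1.0)            # second straight line
    # The output neuron sums (and re-weights) the two lines, then squashes again.
    return sigmoid(2.0 * h1 + 2.0 * h2 - 2.0)

print(tiny_network(0.2, 0.7))                        # a number between 0 and 1
```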

Granted, I have simplified things and left out some stuff that may be considered important, but this is essentially what a Neural Network is. Have I blown your brains out with the sheer complexity of Neural Networks?
 
What I posted above is a drastically simplified description of Neural Networks from the 1980s. It isn't wrong, but it isn't the complete picture either. Right now I am wondering whether I should reveal more by posting a part 2, or just continue with the rest of the material. What do you guys want? A little more depth on Neural Networks, or should I move on to more advanced machine learning classifiers and examples of how AI applications like face recognition work?
 