A Handy Way to Think About Machine Learning


I often find explanations of machine learning either too complex or overly simplistic. I’ve recently had some luck using a simple frame for explaining it to people in person. Let’s see if I can quickly capture it in this post.

The Jet of Machine Learning

Scratch the surface, and you see that machine learning is basically a kind of ‘statistical thinking.’ We’ve long had tools for doing statistical analysis on data. Machine learning just automates that analysis so we can do it at much larger scale. The basic techniques have been around for decades, but machine learning didn’t really explode in popularity until just a few years ago with the advent of powerful new processors (Graphics Processing Units and later Tensor Processing Units) and large-scale data sets from Internet services like Google Search, Amazon and Facebook.

Andrew Ng makes the analogy that compute power is the jet engine and data is the jet fuel of machine learning. Rather than fly you to Chicago though, this jet builds statistical models. Those models draw on their underlying data to simulate reality, somewhat the way we simulate reality with our own brains. The difference is that these algorithmic models extend that biological brain of ours to do something it’s not really built for: thinking statistically.

Big Data and Models

Before this powerful new jet showed up a few years ago, we were already using machine learning to help automate the way we built statistical models. It saved a lot of time and energy over the more labor-intensive statistical techniques we used to use, and that opened up interesting new capabilities, such as analyzing inventory levels in a warehouse, estimating the threat of overfishing from commercial boats, and predicting stock prices.

These kinds of applications are often described as “Big Data,” or data analytics. In this work’s early phases, the models were typically static, a kind of snapshot analysis of the underlying data. Despite that limitation, the techniques proved extremely valuable in making sense of large datasets, which made them popular in large corporations and spawned a thriving ecosystem of data analytics companies.
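As a rough sketch of what this automated model-building looks like in practice, the snippet below fits a simple linear model to warehouse inventory data. Everything here is invented for illustration — the numbers, the column meanings, and the prediction — but the shape of the work is real: hand the data to a library and let it do the statistical heavy lifting.

```python
import numpy as np

# Hypothetical data: weekly units sold and units received for a warehouse,
# used to predict the resulting inventory level. All numbers are made up.
X = np.array([
    [120.0, 100.0],
    [150.0,  90.0],
    [ 80.0, 130.0],
    [200.0, 110.0],
])  # columns: units sold, units received
y = np.array([480.0, 420.0, 550.0, 410.0])  # observed inventory levels

# Add an intercept column and solve the least-squares problem --
# the "automated" statistical analysis is a single library call.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(sold, received):
    # Apply the fitted model to new data.
    return coef[0] + coef[1] * sold + coef[2] * received

print(predict(130.0, 105.0))
```

A real Big Data pipeline swaps the four made-up rows for millions of records and the linear model for something richer, but the division of labor is the same: humans frame the question, the machinery fits the model.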

Deepening the Automation

It’s worth calling out one specific trick we now use to automate the way we build these statistical models. It’s called Deep Learning, and it has taken the machine learning world by storm. The reason Deep Learning is so popular is that it allows developers to automatically build models through exposure to large datasets. These neural networks have multiple layers, much the way animal brains do. The lower layers of these networks focus on identifying the most basic and specific features of the data, handing off their results to subsequent layers, which in turn handle progressively more complex and holistic interpretations. The graphic below, from Nvidia, illustrates the layers in a deep neural network for identifying cars, starting with rudimentary lines, moving on to wheel wells, doors, and other car parts, and finally arriving at full cars.

Where developers once needed to painstakingly identify these kinds of attributes (called “features”) in advance, now they simply bubble up from repeated exposure to large datasets. A lot of work still goes into designing the right architecture and preparing the training data, of course, but through this automatic generation of features, Deep Learning has revolutionized the way we build simulated models of our world.
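To make the layered structure concrete, here is a toy forward pass through a three-layer network. The weights are random and untrained — this is only a sketch of the architecture, not a working car recognizer — but it shows how each layer’s output becomes the next layer’s input, which is where the progressively richer features come from.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "image": a flat vector of 64 pixel intensities (an imagined 8x8 patch).
pixels = rng.random(64)

# Random, untrained weights -- illustrating only the layered shape of a
# deep network. Training is what would tune these into useful features.
W1 = rng.standard_normal((32, 64))   # layer 1: pixels -> edge-like features
W2 = rng.standard_normal((16, 32))   # layer 2: edges -> part-like features
W3 = rng.standard_normal((1, 16))    # layer 3: parts -> a final score

def relu(v):
    # A common nonlinearity: pass positives through, zero out negatives.
    return np.maximum(v, 0.0)

h1 = relu(W1 @ pixels)   # basic, local features
h2 = relu(W2 @ h1)       # progressively more holistic features
score = W3 @ h2          # the network's interpretation of the whole input

print(h1.shape, h2.shape, score.shape)
```

The key point from the paragraph above: no one hand-writes what W1 should detect. In a trained network, edge detectors and wheel-well detectors bubble up on their own from exposure to data.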

Actuators and Inference

But wait, you say, I thought machine learning involved things like Facebook recognizing pictures of my friends or Tesla’s Autopilot. Yes, those are more obvious examples, because in those cases we get to interact directly with the machine learning models themselves. What most of us think of as machine learning is thus actually a machine-learning model hooked up to some form of automation. We run the model and it helps us make sense of new data: recognizing pictures of friends, recommending which rice cooker to buy, or screeching your car to a halt when a mother raccoon suddenly sprints out into the road with her adorable little babies. For example.

I owe this insight to two people. The first is Yonatan Zunger, who recently described artificial intelligence as a triad made up of 1) sensors for collecting data; 2) a model for analyzing and interpreting the data; and 3) an actuator for turning the model’s results into some action.
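Zunger’s triad can be sketched in a few lines. Every name and number below is illustrative — a pretend thermostat, not any real system — but it shows how the three pieces chain together.

```python
# A minimal sketch of the sensor/model/actuator triad,
# using a pretend thermostat. All values are made up.

def sensor():
    """Collect a reading from the world (here, a fake temperature)."""
    return 31.5  # degrees C

def model(reading):
    """Interpret the reading: is it too hot?"""
    return reading > 30.0

def actuator(too_hot):
    """Turn the model's judgment into an action."""
    return "fan on" if too_hot else "fan off"

action = actuator(model(sensor()))
print(action)  # fan on
```

In a real system the model in the middle is the trained statistical model the jet engine built; the sensor and actuator are what make it feel like “AI” rather than analytics.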

The second is Michael Copeland, who outlines two types of hardware chips: 1) training chips, optimized for building models; and 2) inference chips, optimized for using a trained model to analyze new data. Training a new model by exposing it to millions of pictures of cats, for example, is processing- and data-intensive. Once that model is trained, however, it can be optimized for greater performance and then deployed as a dedicated “cat recognizer” chip in the field.
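The training/inference split can be sketched as follows. The “training” here is deliberately trivial — picking a threshold between two made-up score distributions stands in for the data-hungry optimization a real trainer performs — but the economics it illustrates are the point: training touches all the data once, while inference uses the frozen result cheaply, over and over.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- Training phase (expensive; done once, on big hardware) ---
# Made-up 1-D "cat-ness" scores for two classes of images.
cat_scores = rng.normal(2.0, 0.5, 1000)
other_scores = rng.normal(0.0, 0.5, 1000)

# "Training" here just places a threshold between the two class means --
# a stand-in for the heavy optimization a real training run performs.
threshold = (cat_scores.mean() + other_scores.mean()) / 2.0

# --- Inference phase (cheap; runs on the deployed chip) ---
def is_cat(score, threshold=threshold):
    # The trained parameter is frozen; inference is a single comparison.
    return score > threshold

print(is_cat(2.2), is_cat(-0.3))
```

That asymmetry is why the two chip types exist: the trainer needs throughput over a huge dataset, while the deployed “cat recognizer” only needs to evaluate the frozen model fast.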

Summary

In short, you can think of machine learning as a jet engine, fueled by lots of data. Once you’ve used that jet to build a statistical model, you can then “actuate” it, which is to say, put it to work by allowing it to interact with and infer meaning from new data.

The most powerful examples of doing that tend to include various forms of automation that make things simpler for us. The ones we seem to love most are those that provide some sort of user interface for interacting with the model. That might mean making it easier to find new music on Spotify, find every picture you’ve ever taken of stained glass on Google Photos, or even beat a world champion Go player.

Comments

  1. I do wonder how these can be connected to legal personhood for AI?

  2. Hi Gideon, did you mean tying AI into corporate law by granting each system citizenship, such as corporate citizenship, or by associating it with its respective development entity? I ask because I realize how many independent developers exist.

    • I was referring to the latter idea, Sharleen, of tying it into the organization that is connected to the development activity. While it is true that there are many independent developers out there, most actual machine learning projects that I’m familiar with are connected to some sort of organization, be it a corporation, an academic institution, or some open source network of individuals. If you can think of exceptions, though, I’d be very interested. In those cases, it may make sense to wrap a kind of Distributed Autonomous Organization legal wrapper around them. The reason I suggest this is that corporate law is so well developed, with a very long history and lots of case law supporting it.

  3. Thank you, it is useful for me.

  4. Thanks for simplifying these terms.
