Tuesday, December 20, 2016

Machine Learning

In the tradition of the classic book from another era, The Soul of a New Machine, Gideon Lewis-Kraus presents an outstanding summary of how Google used evolving artificial intelligence (A.I.) techniques to achieve extraordinary improvements in the Google Translate app. If you're technically minded, it's worth a read. He writes:
A rarefied department within the company, Google Brain, was founded five years ago on this very principle: that artificial “neural networks” that acquaint themselves with the world via trial and error, as toddlers do, might in turn develop something like human flexibility. This notion is not new — a version of it dates to the earliest stages of modern computing, in the 1940s — but for much of its history most computer scientists saw it as vaguely disreputable, even mystical. Since 2011, though, Google Brain has demonstrated that this approach to artificial intelligence could solve many problems that confounded decades of conventional efforts. Speech recognition didn’t work very well until Brain undertook an effort to revamp it; the application of machine learning made its performance on Google’s mobile platform, Android, almost as good as human transcription. The same was true of image recognition. Less than a year ago, Brain for the first time commenced with the gut renovation of an entire consumer product ...

Translate made its debut in 2006 and since then has become one of Google’s most reliable and popular assets; it serves more than 500 million monthly users in need of 140 billion words per day in a different language. It exists not only as its own stand-alone app but also as an integrated feature within Gmail, Chrome and many other Google offerings, where we take it as a push-button given — a frictionless, natural part of our digital commerce.
Lewis-Kraus provides a detailed view of the people, the product, and the technical process that was used to apply multi-layer artificial neural nets to the natural language translation problem. But far more important, he provides insight into how machine learning is no longer a computer science backwater, but may soon become the dominant computing paradigm for the remainder of this century. He writes:
... computers would learn from the ground up (from data) rather than from the top down (from rules). This notion dates to the early 1940s, when it occurred to researchers that the best model for flexible automated intelligence was the brain itself. A brain, after all, is just a bunch of widgets, called neurons, that either pass along an electrical charge to their neighbors or don’t. What’s important are less the individual neurons themselves than the manifold connections among them. This structure, in its simplicity, has afforded the brain a wealth of adaptive advantages. The brain can operate in circumstances in which information is poor or missing; it can withstand significant damage without total loss of control; it can store a huge amount of knowledge in a very efficient way; it can isolate distinct patterns but retain the messiness necessary to handle ambiguity.
Sentient beings have the unique ability to recognize patterns, even when data are fuzzy, incomplete, or otherwise skewed. Machine learning is all about pattern recognition, and as such it may become a major element in what we call artificial general intelligence (AGI). We all profile (using established patterns of data to make judgments and decisions) in almost all things; that's why cries of "profiling" are often misdirected, unless, of course, there's bias in the data set we've used to establish the profiling approach. More on A.I. bias in a moment.
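To make that concrete, here's a toy sketch of my own (it has nothing to do with Google's systems; the data and parameters are made up) showing the "learning from data rather than rules" idea Lewis-Kraus describes. A small multi-layer network in scikit-learn is given nothing but noisy, labeled examples of two overlapping clusters of points. No rule describing the boundary is ever written down, yet the network picks out the pattern and applies it to points it has never seen:

```python
# Toy illustration (my own, not from the article): a small multi-layer neural net
# learns a pattern from noisy, labeled examples rather than from hand-written rules.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Two fuzzy, overlapping clusters of 2-D points -- the "pattern" to be learned.
class_a = rng.normal(loc=[0.0, 0.0], scale=0.8, size=(200, 2))
class_b = rng.normal(loc=[2.0, 2.0], scale=0.8, size=(200, 2))
X = np.vstack([class_a, class_b])
y = np.array([0] * 200 + [1] * 200)

# A small multi-layer network: no rules, just examples.
net = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
net.fit(X, y)

# The learned pattern generalizes to points it has never seen.
print(net.predict([[0.2, -0.1], [1.9, 2.3]]))   # -> [0 1] (typically)
```

Trivial as it is, the sketch captures the bottom-up idea: the "knowledge" lives in the learned connection weights, not in any rule a programmer wrote.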

Over the years, I have posted a number of comments (e.g., here and here) on the benefits and threats posed by A.I. That's probably because my dissertation used rudimentary A.I. techniques, and I've had an intense interest in the subject for many, many years. I began as a utopian, believing that A.I. would be a wonder-technology. But I've slowly evolved into a dystopian, recognizing that A.I. will displace tens, if not hundreds, of millions of blue- and white-collar workers, not to mention the threats it poses to humans in many other ways. I am not, however, a Luddite, and I further recognize that A.I. will arrive sooner than we think and will provide enormous benefits. We must put safeguards into place so that it integrates with our society in only moderately disruptive ways.

Because machine intelligence is a learning activity, much of what an A.I. becomes will be determined by the massive data sets from which it learns. Kristian Hammond writes:
We tend to think of machines, in particular smart machines, as somehow cold, calculating and unbiased. We believe that self-driving cars will have no preference during life or death decisions between the driver and a random pedestrian. We trust that smart systems performing credit assessments will ignore everything except the genuinely impactful metrics, such as income and FICO scores. And we understand that learning systems will always converge on ground truth because unbiased algorithms drive them.

For some of us, this is a bug: Machines should not be empathetic outside of their rigid point of view. For others, it is a feature: They should be freed of human bias. But in the middle, there is the view they will be objective.

Of course, nothing could be further from the truth. The reality is that not only are very few intelligent systems genuinely unbiased, but there are multiple sources for bias. These sources include the data we use to train systems, our interactions with them in the “wild,” emergent bias, similarity bias and the bias of conflicting goals. Most of these sources go unnoticed. But as we build and deploy intelligent systems, it is vital to understand them so we can design with awareness and hopefully avoid potential problems.
It is, of course, nearly impossible to eliminate the subtle bias that creeps into any machine learning system through the big data it learns from. As long as the bias is unintentional, we can use such systems without major difficulty.
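To illustrate the kind of training-data bias Hammond describes, here is another toy sketch of my own (synthetic data and made-up numbers, not from his article): a simple credit-style classifier is trained on historical records in which one group's approvals were systematically under-recorded. The learning algorithm itself is perfectly "neutral," but the model it produces carries the skew forward:

```python
# Toy illustration of bias inherited from training data (synthetic, made-up numbers).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000

# Features: a genuinely predictive score and a group membership flag (0 or 1).
score = rng.normal(size=n)
group = rng.integers(0, 2, size=n)

# Ground truth: approval should depend on the score only, never on the group.
true_label = (score + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Biased historical records: group 1's approvals were under-recorded 40% of the time.
recorded = true_label.copy()
mask = (group == 1) & (recorded == 1) & (rng.random(n) < 0.4)
recorded[mask] = 0

X = np.column_stack([score, group])
model = LogisticRegression().fit(X, recorded)

# Same score, different group: the learned model now penalizes group 1.
print(model.predict_proba([[0.5, 0], [0.5, 1]])[:, 1])
```

Nothing in the code is malicious; the bias rides in entirely on the data, which is exactly why it so often goes unnoticed.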

What worries me is that some of these A.I. systems will access big data that carries an intentional bias. As a consequence, the patterns that the A.I. recognizes will be skewed, the recommendations and judgments that it makes will be slanted for political purposes, and as our reliance on such systems grows (and it surely will), the damage that the A.I. does will become dangerous.

It is highly likely that machine learning systems will be used to assess the success and/or failure of many societal systems: healthcare, education, social services, human impact on the environment, the justice system, and many others. It is critically important for these A.I.s to have access to all of the data, not just the data that is deemed politically correct or that supports the party in power at the moment.