Algorithmic bias is a question of filtration.
All living things filter information. Without filtration, information overload sets in, and an organism becomes less effective at responding to its surroundings. Allow in too many unhelpful signals or keep out too many helpful signals, and the organism eventually puts its survival at stake.
The same is true for organizations.
Machine Learning as Organizational Perception
Machine learning plays an important role in Automated Decision Support systems. These technologies help organizations filter the massive flows of data that permeate their internal and external realities. Machine learning systems organize, classify, and interpret the sensory data of organizations. In this sense, they are the perceptual interface of organizations.
Another way of seeing these systems is as a kind of ‘organizational membrane’ connecting the organization with its environment. Membranes distinguish what is inside the organization from what is outside it, but they aren’t just dumb walls. Membranes are permeable in ways that allow information and resources in and out of the organization. I’ve argued in the past that much of the intelligence of organizational membranes actually rests in people. While this is still true, we are now seeing machine learning systems rapidly take over more and more of the role of the perceptual interface for organizations.
Algorithmic Bias in Organizations
What happens when algorithmic bias causes machine learning systems to incorrectly perceive reality? In the simplest of terms, it distorts the organization’s perceptual filters in ways that increasingly put its survival at risk over time.
How do these perceptual distortions arise in the first place? To understand that, it helps to see that most algorithmic bias is really just an amplification of underlying human bias. As wonderful as it may be, machine learning is not magic. Garbage in, garbage out; bias in, bias out.
This amplification of organizational bias by computer algorithms is not as recent a phenomenon as you might assume. In 1986, a software program for screening applicants for admission to St George’s Hospital Medical School in London was shown to discriminate against women and people of color. As a later write-up of the case pointed out: “the program was not introducing new bias but merely reflecting that already in the system.” Like modern machine learning systems, that program was crafted to reflect existing admissions decisions.
The difference back then was that its algorithms were tediously hand-coded. Today, machine learning systems automate that algorithmic tweaking through exposure to massive sets of training data. When that data contains human bias, as it quite often does, the feedback loops of the learning system can dramatically amplify it — sometimes with tragically hurtful and embarrassing consequences.
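To make that amplification concrete, here is a minimal sketch, using entirely hypothetical data and plain Python, of a naive model “trained” to mimic past screening decisions. The group names, records, and scoring scheme are all invented for illustration:

```python
from collections import defaultdict

# Hypothetical historical screening decisions: (group, qualified, admitted).
# Both groups are equally qualified, but group "B" was admitted less often.
history = [
    ("A", True, True), ("A", True, True), ("A", False, False), ("A", True, True),
    ("B", True, False), ("B", True, True), ("B", False, False), ("B", True, False),
]

# "Train" a naive model that mimics past decisions: it learns, per group,
# the historical admission rate and uses that rate as a score.
admitted_by_group = defaultdict(list)
for group, _qualified, admitted in history:
    admitted_by_group[group].append(admitted)

score = {g: sum(v) / len(v) for g, v in admitted_by_group.items()}

# The learned scores simply reflect the bias already in the system:
# group B scores lower despite equal underlying qualifications.
print(score)
```

Nothing in the code “decides” to discriminate; the disparity is inherited entirely from the historical labels, which is exactly how a far more sophisticated learner picks up the same pattern.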
So, to catch that bias before we embed it into these powerful systems, we first need to understand the nature of human bias. One of the first steps is knowing whether the people behind a system’s design actually know they are introducing bias.
Conscious Algorithmic Bias
To have a conscious bias is to knowingly maintain a perspective without subjecting it to evidence. We tend not to use the word “bias” when the needs of business-class customers supersede those of economy-class customers, because we can rationalize the different treatment with clear evidence that one group paid more than the other. As a society, where we draw the line is in not allowing organizations to treat people differently based on their race, gender identity, religion, sexual orientation, or other protected characteristics.
One of the scarier prospects we face today is organizations coupling deliberate discriminatory bias with the power of machine learning. The 2016 US presidential election highlighted how dangerous this combination can be. As a society, we are ill-prepared for the hard work that lies before us in addressing this challenge.
Unconscious Algorithmic Bias
There are many ways to think about unconscious bias. I like the way that Nobel Prize winner Daniel Kahneman talks about it. In his book Thinking, Fast and Slow, Kahneman describes two systems in the brain: one intuitive, fast, and sloppy; the other rational, slow, and precise. Because the first system requires less energy, it is the one most of us default to when we make decisions. It is also the system that most often leads to cognitive distortions in the way we see the world.
I find this way of understanding unconscious organizational bias useful because it shows just how prone we are as humans to these kinds of cognitive distortions. That is why it is so easy for these forms of cognitive bias to slip into the data that we use to train our machine learning systems — and cloud the perceptual interfaces of our organizations.
The other useful thing about Kahneman’s framing is the antidote he proposes for cognitive bias, which is statistical thinking. It’s no wonder that the Harvard Business Review called data scientist the “sexiest” job of the 21st century.
“The sexy job in the next 10 years will be statisticians. People think I’m joking, but who would’ve guessed that computer engineers would’ve been the sexy job of the 1990s?” — Hal Varian, the chief economist at Google
Addressing Algorithmic Bias
One of the more interesting frameworks I’ve run across for thinking about algorithmic bias is a paper by Michael Skirpan and Micha Gorelick, titled “The Authority of ‘Fair’ in Machine Learning.” In it, the authors frame bias in terms of fairness by asking three basic questions: 1) is it fair to make a particular machine learning system?; 2) is that system’s technical approach fair?; and 3) does the system create fair results?
Question one asks what we should build and why. This is not a technical question but a management question: it cuts to the system’s purpose and asks whether the world will be more fair or less fair as a result of solving this particular problem.
Questions two and three are more technical and require the expertise of those sexy data scientists. At a high level, unconscious bias occurs when the sample data used to train algorithms doesn’t accurately reflect the larger pool of data the algorithm will face when it is put into actual use. Question two tries to anticipate these distortions in advance by tuning into the way the data is collected, unanticipated correlations within the data, and problems with the way humans label the data. Question three is similar to quality assurance: a system’s outputs are evaluated against various performance targets — in this case, targets related to fairness.
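As one concrete illustration of question three, here is a minimal sketch of auditing a system’s outputs against a single fairness target. The applicant data is hypothetical, and the check shown is the US EEOC “four-fifths” heuristic for adverse impact — just one of many possible fairness targets, chosen here for simplicity:

```python
# Hypothetical model outputs: (applicant_group, positive_decision).
outcomes = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

def selection_rates(outcomes):
    """Compute the positive-decision rate for each group."""
    totals, positives = {}, {}
    for group, decision in outcomes:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(decision)
    return {g: positives[g] / totals[g] for g in totals}

def passes_four_fifths_rule(rates):
    """EEOC 'four-fifths' heuristic: the lowest group's selection rate
    should be at least 80% of the highest group's rate."""
    return min(rates.values()) >= 0.8 * max(rates.values())

rates = selection_rates(outcomes)
print(rates, passes_four_fifths_rule(rates))
```

In this toy data, group B’s selection rate falls well below four-fifths of group A’s, so the audit flags the system for review. A real evaluation would look at several such metrics, since different fairness definitions can conflict with one another.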
Stakeholders Can Help Address Bias
Framing organizational bias in terms of fairness leads to the question of whose interests are being served by an organization, which gets us to a stakeholder-centric view of organizations. In fact, I’m going to go out on a limb and say that bias — both conscious and unconscious — is the tendency of management to prioritize one set of organizational stakeholders over others.
Amazon puts customers at the center of all their decisions, even if it sometimes means disregarding the interests of suppliers and other partners. Uber does the same relative to its drivers. These are examples of conscious bias, and whether you like them or not has a lot to do with what kind of stakeholder you are to these firms. The biggest conscious bias we face in large corporations today, by the way, is the way that shareholder interests are favored above all others.
The questions of fairness outlined above are greatly enhanced when they deliberately and consciously incorporate a more holistic understanding of the full range of stakeholders in an organization. Deciding to launch a new Uber algorithm for funneling more rides to higher-rated drivers is great if you are an end user. But the feature might actually be more robust, and the overall service stronger, if it took into account the unforeseen impact on a new driver who happened to hit a short rough patch in her first few days of driving.
Data collection efforts and systemic evaluation could similarly be improved by taking a broader stakeholder perspective since these are the people who ultimately contribute to the long-term success of the organization.
With Great Power…
We have already seen that data scientists are a hot commodity these days, and it’s no wonder. These individuals aren’t just the key to building today’s most exciting and powerful technologies. As we’ve noted, data scientists define the perceptual clarity of an organization. They do this by ensuring that its machine learning systems accurately reflect the organization’s underlying reality. In this fundamental sense, the question of bias is not simply something that organizations ‘ought’ to address; it cuts to the very heart of their ability to function in the world.
With great power comes great responsibility, and data scientists have an opportunity to stretch our understanding of bias beyond the merely technical sense to include notions of fairness. Incorporating the full range of stakeholders into the design and training of our machine learning systems helps ensure that these algorithms more accurately reflect the long-term interests of the organization.