We hear a lot these days about the potential dangers of “AI bias.” If a machine learning system is trained on a data set that is skewed by age, gender, race, ethnicity, income, education, geography, or some other factor, the system’s outputs will tend to reflect those biases. As the inner workings of ML systems are often impossible for an outsider to fully understand, any such biases can remain hidden, making them seem especially sinister.
But before getting too alarmed, ask yourself this. Over the long run, which decision-making model is likely to be more objective: human or machine reasoning? On balance, I argue for the latter for the simple reason that data and software tend to be much less biased than you and me.
Human bias is pervasive
That human bias is pervasive is undeniable. We know that criminal sentencing varies greatly depending upon the deciding judge; loan and credit approval biases have a long and painful history; teachers at all levels have been shown to grade student papers differently depending upon who they think wrote them; managers tend to hire people who seem similar to themselves; and journalists have vastly different views regarding which issues should be covered. There are many such examples.
Behavioral science has identified so many reasons why humans are biased that the phenomenon is said to be over-explained. We rely too much on societal stereotypes; we over-value our own individual experiences, especially our recent ones; we are quick to embrace anecdotes that confirm our existing views; we generally have a poor sense of statistics and probabilities; and we tend to over-rate the reliability of our personal gut feel.
Perhaps most importantly, we perpetuate these biases through our stubborn reluctance to admit mistakes, a trait that seems to get stronger the more educated and successful one becomes. Data analytics are now widely used to confront these well-known tendencies.
More fundamentally, we wouldn’t want to eliminate all human biases even if we could. In many ways, our biases – also known as beliefs, hopes, interests, allegiances, inclinations, opinions, preferences, tastes and values – are what define us as individuals and give society its dynamism. It’s actually terrifying to imagine a bias-free world where everybody uses the same information in essentially the same way; that’s the stuff of dystopian science fiction.
AI and bias: Reducing harm
So human biases are everywhere, but only sometimes problematic. This realization helps us see that the role of technology is to reduce the harmful biases without imposing a culture of undue uniformity. Fortunately, this challenge seems manageable.
From a technical perspective, data scientists and statisticians should assess a system’s underlying data and judge its representativeness in much the same way that they assess the representativeness of a survey sample. Once biases are identified, adjustments can be made, typically without the emotional resistance that comes from directly confronting human prejudices. This means that undesirable machine learning biases will tend to decrease over time.
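To make the survey-sample comparison concrete, here is a minimal sketch in Python. It assumes each record in the data set carries a single group label and that trusted reference shares for the wider population are available, for example from census figures. The function names, the group labels, and the choice of simple post-stratification weighting are illustrative assumptions, not a description of how any particular system does it.

```python
from collections import Counter


def representation_gaps(samples, reference_shares):
    """Return (observed share - reference share) for every group.

    samples: iterable of group labels, one per record (hypothetical input).
    reference_shares: dict mapping group label to its assumed population share.
    A positive gap means the group is over-represented in the data.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("samples must not be empty")
    counts = Counter(samples)
    total = len(samples)
    groups = set(counts) | set(reference_shares)
    return {
        group: counts.get(group, 0) / total - reference_shares.get(group, 0.0)
        for group in groups
    }


def post_stratification_weights(samples, reference_shares):
    """Return a per-group weight that rebalances the data to the reference.

    Each weight is reference share divided by observed share, so
    under-represented groups get weights above 1. Groups missing from
    either the data or the reference are skipped, since no weight can
    be computed for them.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("samples must not be empty")
    counts = Counter(samples)
    total = len(samples)
    return {
        group: reference_shares[group] / (count / total)
        for group, count in counts.items()
        if group in reference_shares
    }
```

With 80 records labeled "a" and 20 labeled "b" against a 50/50 reference, the first function reports "a" as over-represented by about 30 percentage points, and the second gives "b" a weight of roughly 2.5. Reweighting is only one possible adjustment; collecting more data from the under-represented group is often the better fix when it is feasible.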
But in many areas, human expertise will still be essential because making the most objective decision is usually less important than making the most correct one. Just as human/machine cooperation is often the best way to play chess, operate warehouses, or perform surgery, so it is with increasing societal fairness.
Whether we are seeking more equitable criminal sentencing, student grading, or loan/credit approvals, the combination of machine learning and human judgement will likely produce the best results. We already see this pattern in professional sports where the extensive use of analytics has now evolved to include a more balanced mix of human and machine reasoning.
The bottom line is that long-standing human biases have created many societal inequities that well-designed computer systems can significantly reduce. This is why algorithms, analytics, and machine learning will, over time, create much more fairness than harm. Like it or not, software and data are fundamentally much more objective – and much less stubborn – than you and me.