Discussions about big data’s role in our society tend to focus on algorithms, but the algorithms for handling giant data sets are well understood and work well. The real issue isn’t algorithms, it’s models. Models are what you get when you feed data to an algorithm and ask it to make predictions. As O’Neil puts it, “Models are opinions embedded in mathematics.”
Other critical data scientists, like Patrick Ball from the Human Rights Data Analysis Group, have located their critique in the same place. As Patrick once explained to me, you can train an algorithm to predict someone’s height from their weight, but if your whole training set comes from a grade three class, and anyone who’s self-conscious about their weight is allowed to skip the exercise, your model will predict that most people are about four feet tall. The problem isn’t the algorithm, it’s the training data and the lack of correction when the model produces erroneous conclusions.
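Here’s a minimal sketch of that failure mode in Python; the numbers and the nearest-neighbour predictor are my own illustration of Ball’s anecdote, not anything from the book:

```python
# A toy version of the height-from-weight example: the training set is a
# grade-three class, and anyone self-conscious about their weight opts out,
# so every example the model ever sees is a small child.
import numpy as np

rng = np.random.default_rng(0)

# Grade-three training data: weights around 27 kg, heights around 122 cm (~4 ft).
weights_kg = rng.normal(27, 4, size=200)
heights_cm = 122 + 1.2 * (weights_kg - 27) + rng.normal(0, 5, size=200)

# The biased opt-out: heavier kids skip the exercise entirely.
kept = weights_kg < 30
w_train, h_train = weights_kg[kept], heights_cm[kept]

def predict_height(weight_kg, k=5):
    """k-nearest-neighbour prediction: average the heights of the k training
    examples with the closest weights."""
    nearest = np.argsort(np.abs(w_train - weight_kg))[:k]
    return h_train[nearest].mean()

# Ask about a 75 kg adult: every "neighbour" is a third-grader, so the model
# confidently answers roughly four feet.
print(f"predicted height of a 75 kg adult: {predict_height(75):.0f} cm")
```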
Like Ball, O’Neil is enthusiastic about the power of data-driven modelling to be a force for good in the world, and like Ball, she despairs at the way that sloppy statistical work can produce gigantic profits for a few companies at the expense of millions of people — all with the veneer of mathematical objectivity.
O’Neil calls these harmful models “Weapons of Math Destruction,” and not all faulty models qualify. For a model to be a WMD, it must be opaque to its subjects, harmful to their interests, and able to grow to run at huge scale.
These WMDs are now everywhere. The sleazy for-profit educational system has figured out how to use models to identify desperate people and sucker them into signing up for expensive, useless “educations” that are paid for with punitive student loans, backed by the federal government. That’s how the University of Phoenix can be so profitable, even after spending upwards of $1B/year on marketing. They’ve built a WMD that brings students in at a steady clip despite the fact that they spend $2,225/student on marketing and only $892/student on instruction. Meanwhile, the high-efficacy, low-cost community colleges are all but invisible in the glare and roar of the University of Phoenix’s marketing blitzkrieg.
One highly visible characteristic of WMDs is their lack of feedback and tuning. In sports, teams use detailed statistical models to predict which athletes they should bid on, and to deploy those athletes when squaring off against opposing teams. But after the predicted event has occurred, the teams update their models to account for their failings. If you pass on a basketball player who goes to glory for a rival team, you update your model to help you do better in the next draft.
Compare this with the WMDs used against us in everyday life. The largest employers in America use commercial services to run their incoming resumes against a model of a “successful” worker. These models hold your employment future in their hands. If one rejects you and you go on to do brilliant work somewhere else, that fact is never used to refine the model. Everyone loses: job-seekers are arbitrarily excluded from employment, and employers miss out on great hires. Only the WMD merchants in the middle make out like bandits.
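To make the missing step concrete, here is a schematic sketch (my own construction, not anything the resume-screening vendors publish) of the difference between a model that closes the feedback loop and one that never does:

```python
# A schematic contrast between a screener that feeds real outcomes back into
# its model and a WMD that freezes its assumptions forever.
from dataclasses import dataclass, field

@dataclass
class Screener:
    threshold: float = 0.5
    history: list = field(default_factory=list)   # (score, real_outcome) pairs

    def decide(self, score: float) -> bool:
        return score >= self.threshold

    def record_outcome(self, score: float, succeeded: bool):
        """The step sports teams perform and resume-screening WMDs skip:
        feed the real-world outcome back into the model."""
        self.history.append((score, succeeded))

    def recalibrate(self):
        """Lower the bar if candidates scored below it keep succeeding anyway."""
        missed = [s for s, ok in self.history if ok and s < self.threshold]
        if missed:
            self.threshold = min(self.threshold, min(missed))

# The sports team records every draft outcome and recalibrates; the resume
# vendor never sees the brilliant work its rejects go on to do elsewhere, so
# its threshold, and its mistakes, stay frozen.
```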
It’s worth asking how we got here. Many forms of WMD were deployed as an answer to institutional bias — in criminal sentencing, in school grading, in university admissions, in hiring and lending. The models are supposed to be race- and gender-blind, blind to privilege and connections.
But all too often, the models are trained with biased data. The picture of a future successful Ivy League student or loan repayer is painted using data points from the admittedly biased history of the institutions. All the Harvard grads or dutiful mortgage payers are fed to the algorithm, which dutifully predicts that tomorrow’s Harvard alums and prime loan recipients will look just like yesterday’s — but now the bias gets the credibility of seeming objectivity.
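A toy example can show how this laundering works; the numbers below are mine, assuming a hypothetical admissions history in which one group needed far higher scores to get in and zip code acts as a proxy for that group:

```python
# How "race-blind" training data can still encode bias: the historical label
# was driven by a protected attribute, and a proxy (zip code) carries it.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

group = rng.integers(0, 2, n)                    # protected attribute, hidden from the model
zip_code = (group + (rng.random(n) < 0.1)) % 2   # proxy: agrees with group 90% of the time
merit = rng.normal(0, 1, n)                      # the thing we'd like to measure

# Yesterday's biased decisions: group 1 needed much higher merit to be admitted.
admitted_history = merit > np.where(group == 1, 1.0, 0.0)

# "Objective" features available to tomorrow's model: merit and zip code only.
print(f"historical admit rate, zip 0: {admitted_history[zip_code == 0].mean():.0%}")
print(f"historical admit rate, zip 1: {admitted_history[zip_code == 1].mean():.0%}")
# Any model fit to these labels learns that zip 1 "doesn't look like" past
# admits, and tomorrow's decisions repeat yesterday's bias with a veneer of math.
```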
This training problem is well known in stats, but largely ignored by WMD dealers. Companies that run their own Big Data initiatives, by contrast, are much more careful about refining their models. Amazon carefully tracks those customers who abandon their shopping carts, or who stop shopping after a couple of purchases. They’re interested in knowing everything they can about “recidivism” among shoppers, and they combine statistical modelling with anthropology — seeking out and talking to their subjects — to improve their system.
The contrast with automated sentencing software — now widely used in the US judicial system, and spreading rapidly around the world — could not be more stark. Like Amazon’s data scientists, the companies that sell sentencing apps are trying to predict recidivism, and their predictions can send one person to prison for decades and let another go free.
These brokers are training their models on the corrupted data of the past. They look at the racialized sentencing outcomes of history — the outcomes that sent young black men to prison for years for minor crack possession, while letting rich white men walk away from cocaine possession charges — “predict” that people from poor neighborhoods, whose family members and friends have had run-ins with the law, will reoffend, and recommend long sentences to keep them away from society.
Unlike Amazon, these companies aren’t looking to see whether longer sentences cause recidivism (by causing emotional damage and social isolation), or how prison beatings, solitary confinement and prison rape are related to the phenomenon. If the prison system were run like Amazon — that is, with a commitment to reducing reoffending, rather than enriching justice-system contractors and satisfying revenge-hungry bigots in the electorate — it would probably look like a Nordic prison: humane, sparsely populated, and oriented toward rehabilitation, addiction treatment, job training, and psychological counselling.
WMDs have transformed education for teachers and students. In the 1980s, the Reagan administration seized on a report called A Nation at Risk, which claimed that the US was on the verge of collapse due to its falling SAT scores. This was the starter-pistol for an all-out assault on teachers and public education, which continues to this day.
The most visible expression of this is the “value added” assessment of teachers, which uses a battery of standardized tests to assess teachers’ performance from year to year. The statistical basis for these assessments is laughable (statistics works on large samples, not classes of 25 kids — a teacher’s score can swing by 90% from one year to the next, making the assessments little better than random number generators). Teachers — good teachers, committed teachers — lose their jobs over these tests.
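A quick simulation makes the point; the noise level is my own assumption (student-level variation ten times the size of the real teacher effect), not a figure from the book:

```python
# Why value-added scores built on 25-student classes bounce around: with that
# little data, classroom noise swamps any real teacher effect.
import numpy as np

rng = np.random.default_rng(2)
n_teachers, class_size = 1_000, 25

true_effect = rng.normal(0, 1, n_teachers)        # each teacher's real impact

def measured_score(effect):
    """One year's 'value-added' score: the class average of noisy student gains."""
    student_gains = effect[:, None] + rng.normal(0, 10, (len(effect), class_size))
    return student_gains.mean(axis=1)

year1 = measured_score(true_effect)
year2 = measured_score(true_effect)               # same teachers, new class

r = np.corrcoef(year1, year2)[0, 1]
print(f"year-to-year correlation of scores: {r:.2f}")
# Under these assumptions the correlation comes out around 0.2: a teacher's
# score this year says almost nothing about next year's.
```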
Students, meanwhile, are taken away from real learning in order to take more and more tests, and those tests — which are supposed to measure “aptitude” and thus shouldn’t be amenable to expensive preparatory services — determine their whole futures.
The Nation at Risk report that started it all turned out to be bullshit, by the way — grounded in another laughable statistical error. Sandia Labs later audited the findings from the report and found that the researchers had failed to account for the ballooning number of students who were taking the SATs, bringing down the average score.
In other words: SATs were falling because more American kids were confident enough to try to go to college: the educational system was working so well that young people who would never have taken an SAT were taking it, and the larger pool of test-takers was bringing the average score down.
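A worked toy example shows the arithmetic; the cohort sizes and scores below are invented for illustration, not Sandia’s actual figures:

```python
# The composition effect: every group's scores hold steady or improve, yet the
# overall average falls because the pool of test-takers expands.

# Earlier cohort: only the most college-bound students sit the SAT.
takers_then = {"college-track": (1_000_000, 1050)}   # (test-takers, mean score)

# Later cohort: the same group does slightly better, and a large new group of
# first-generation test-takers (averaging lower, but now aiming at college) joins.
takers_now = {"college-track": (1_000_000, 1060),
              "new test-takers": (800_000, 900)}

def overall_mean(cohort):
    total = sum(n for n, _ in cohort.values())
    return sum(n * mean for n, mean in cohort.values()) / total

print(f"then: {overall_mean(takers_then):.0f}")   # 1050
print(f"now:  {overall_mean(takers_now):.0f}")    # ~989: the average "falls"
# The decline A Nation at Risk alarmed over is exactly this arithmetic artifact.
```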
WMDs turn the whole of human life into a game of Search Engine Optimization. With SEO, merchants hire companies that claim to have reverse-engineered Google’s opaque model and whose advice will move your URL further up in its ranking.
When you pay someone thousands of dollars to prep your kid for the SATs, or to improve your ranking with the “e-score” providers that determine your creditworthiness, jobworthiness, or mortgageworthiness, you’re recreating SEO, but for everything. It’s a grim picture of the future: WMD makers and SEO experts locked in an endless arms-race to tweak their models to game one another, and all the rest of us being subjected to automated caprice or paying ransom to escape it (for now). In that future, we’re all the product, not the customer (much less the citizen).
O’Neil’s work is so important because she believes in data science. Algorithms can and will be used to locate people in difficulty: teachers with hard challenges, people in financial distress, people who are struggling in their jobs, students who need educational attention. It’s up to us whether we use that information to exclude and further victimize those people, or to help them with additional resources.
Credit bureaux, e-scorers, and other entities that model us create externalities in the form of false positives — from no-fly lists to credit-score errors to job score errors that cost us our careers. These errors cost them nothing to make, and something to fix — and they’re incredibly expensive to us. Like all negative externalities, the cost of cleaning them up (rehabilitating your job, finding a new home, serving a longer prison sentence, etc) is much higher than the savings to the firms, but we bear the costs and they reap the savings.
It’s E Pluribus Unum reversed: models make many out of one, pigeonholing each of us as members of groups about whom generalizations — often punitive ones (such as variable pricing) — can be made.
Modelling won’t go away: as a tool for guiding caring and helpful remedial systems, models are amazing. As a tool for punishing and disenfranchising, they are a nightmare. The choice is ours to make. O’Neil’s book is a vital crash-course in the specialized kind of statistical knowledge we all need to interrogate the systems around us and demand better.
Weapons of Math Destruction [Cathy O’Neil/Crown]