Predictive Policing Is Not as Predictive as You Think
David O’Connor is an intern for the Digital and Cyberspace Policy Program at the Council on Foreign Relations.
The problem of policing has always been that it’s after-the-fact. If law enforcement officers could be at the right place at the right time, crime could be prevented, lives could be saved, and society would surely be safer. In recent years, predictive policing technology has been touted as just such a panacea. References to Minority Report are apparently obligatory when writing about the topic, but they disguise a critical problem: predictive policing isn’t sci-fi; it’s a more elaborate version of existing, flawed practices.
Predictive policing is an umbrella term to describe law enforcement’s use of new big data and machine learning tools. There are two types of tools: person-based and location-based systems. Person-based systems like Chicago’s Strategic Subject List use a variety of risk factors, including social media analysis, to identify likely offenders. A 2015 report stated the Chicago Police Department had assembled a list of “roughly 400 individuals identified by certain factors as likely to be involved in violent crime.” This raises a host of civil liberties questions about what degree of latitude police should be granted to perform risk analysis on people with no criminal records. In the future, these questions will become ever more pressing as revelations of threat scores, StingRays and facial recognition technology continue to grab headlines.
In the present, however, the majority of publicly known predictive policing algorithms are location-based. Twenty of the nation’s fifty largest police departments are known to use such algorithms, all of which rely on historical crime data, such as 911 calls and police reports. Based on trends in that data, these algorithms direct police to locations that are likely to experience crime at a particular time. Unfortunately, the historical record is incomplete: the Department of Justice has estimated that fewer than half of violent crimes and even fewer household property crimes are reported to the police. An algorithm trying to make predictions based on historical data isn’t actually looking at crime; it’s looking at how police respond to crimes they know about.
Relying on such data merely reinforces the biases of existing policing practices. In October, the Human Rights Data Analysis Group released a study that applied a predictive policing algorithm to the Oakland Police Department’s drug crime records from 2010. The study found that the algorithm would dispatch officers “almost exclusively to lower income, minority neighborhoods,” despite the fact that drug users are estimated to be widely dispersed throughout the city. The predictive algorithm essentially sent cops to areas where they had already made arrests, rather than identifying new areas where drugs might appear.
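The feedback loop described here can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the algorithm the study examined: the two areas, their starting counts, the patrol numbers, and the function name simulate_feedback are all hypothetical. Both areas have the same underlying rate of drug activity, but one starts with slightly more recorded incidents.

```python
def simulate_feedback(recorded, true_rate, patrols_per_day, days):
    """Toy model: each day, every patrol goes to the area with the most
    recorded incidents, and new records appear only where patrols go."""
    recorded = dict(recorded)  # copy so the caller's data is untouched
    for _ in range(days):
        # "Forecast" = the area with the largest historical count
        target = max(recorded, key=recorded.get)
        # Patrols record incidents at the area's true underlying rate
        recorded[target] += patrols_per_day * true_rate[target]
    return recorded

# Hypothetical inputs: identical true rates, slightly uneven history
history = {"A": 12, "B": 10}
true_rate = {"A": 0.5, "B": 0.5}

result = simulate_feedback(history, true_rate, patrols_per_day=4, days=30)
print(result)
```

In this sketch, area B never receives a patrol, so its count never changes, while area A's count keeps climbing. The records end up suggesting a large gap between two areas whose underlying rates are identical.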
The algorithm the researchers analyzed was written by PredPol, one of the largest suppliers of predictive policing systems in the United States, and was chosen for being one of the few algorithms openly published in a scientific journal. PredPol says it uses “only three data points in making predictions: past type of crime, place of crime and time of crime. It uses no personal information about individuals or groups of individuals, eliminating any personal liberties and profiling concerns.” Ironically, these parsimonious standards ensure that the algorithm cannot improve on the historical record; it can only reinforce it.
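To see why a forecast built only on past type, place, and time cannot point anywhere new, consider a deliberately simplified sketch. It is an assumption-laden illustration, not PredPol's published method: the function top_cells, the cell names, and the sample records are all hypothetical, and the "model" is nothing more than a count of past reports.

```python
from collections import Counter

def top_cells(records, crime_type, hour, k=2):
    """Rank map cells by how many past reports match the given
    crime type and hour; return the k most frequent cells."""
    counts = Counter(
        cell
        for (ctype, cell, h) in records
        if ctype == crime_type and h == hour
    )
    return [cell for cell, _ in counts.most_common(k)]

# Hypothetical historical records: (crime type, map cell, hour of day)
records = [
    ("burglary", "cell_1", 22),
    ("burglary", "cell_1", 22),
    ("burglary", "cell_2", 22),
    ("theft", "cell_3", 14),
]

forecast = top_cells(records, "burglary", 22)
print(forecast)
```

A cell with no matching reports in the record can never appear in the output of this sketch, however much unreported crime occurs there.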
Some systems, like IBM’s, wisely incorporate other data points like weather and proximity of liquor stores. Unlike PredPol, however, the vast majority of these algorithms are trade secrets and not subject to independent review. The secrecy around the software makes it harder for police departments and local governments to make fully informed decisions. It also bars the public from participating in the decision-making process and sows distrust.
That’s not to say that police departments shouldn’t use software to analyze their data. In fact, a 2015 study found predictive policing technology had significantly aided law enforcement in Los Angeles and Kent, England. In Norcross, Georgia, police claim that they saw a 15 percent reduction in robberies and burglaries within four months of deploying PredPol. The Atlanta Police Department was similarly enthused.
Further development of the technology is inevitable, so local governments and police departments should develop appropriate standards and practices. For starters, these algorithms should not be called ‘predictive.’ They aren’t crystal balls; they’re making forecasts based on limited data. Less obvious data points, like broken streetlights or the presence of trees, should be incorporated to refine these forecasts. As in St. Louis, the forecasts shouldn’t be used for minor crimes. Person-based algorithmic forecasts should never be accepted as meeting the reasonable suspicion requirement for detaining an individual, and only data specialists should have access to the software to reduce the chances of abuse. Police should by default incorporate racial impact assessments into their data, and programs should be open to academic review. The potential for mission creep is enormous; reviews must be regular to ensure that software is being used appropriately. Most importantly, city governments and police departments should conduct a transparent dialogue with the public about what data is being collected, particularly in the cases of cell phone surveillance and social media analysis, and citizens should be able to see what data has been compiled on them, be it photos or a threat score or biometrics.
It’s a natural choice for police departments with shrinking budgets and huge troves of data to turn to machines for help. However, officials must recognize that nerds won’t solve the problem of policing; data must be a supplement to traditional, people-focused police work. Data-driven predictions have suffered many prominent setbacks in 2016. We should be cautious about using them to arrest people.