Technological advances can improve law enforcement agencies’ investigative methods, but investigators often struggle to decode the massive wall of data. Here’s where digital tools and techniques can help.
Captain Johnson has a problem. As the head of investigations for her agency, she and her personnel are charged with both drug and death investigations. The past few months have brought trouble on both fronts: an alarming rise in drug overdose deaths from opioids, questions about the source of the drugs, and inconclusive toxicology results from the medical examiner.
Where can she begin? First, Captain Johnson asks her team to determine which criminal actors or organized criminal groups are involved in the diversion, distribution, and trafficking of similar opiates in the region. Her narcotics team finds a number of leads but, in fact, has too much data to analyze—from prescription drug monitoring program (PDMP) data and public health records to financial records and social media. Homing in on the exact threat is like finding a needle in a haystack. Many days and many cups of coffee later, the investigators are overwhelmed and about to give up. But the problem grows by the day, and Captain Johnson needs results.
Individuals and organizations involved in criminal and illicit activity are becoming increasingly sophisticated. Bad actors are harnessing the power of new technologies as quickly as they’re being invented. The good news is that technological advances can also improve the investigative methods of law enforcement agencies. In the words of one homicide supervisor, “In 1993, there were no social media or surveillance cameras. We just had blood typing that was used to help identify people from a crime scene. Now DNA is used, which can identify an individual using his DNA to one in a billion, quadrillion, or greater.”1
These benefits bring their own challenges. The digital world presents investigators with a massive, seemingly impenetrable wall of data. For example, in one year alone, a single FBI investigation collected six petabytes of data—the equivalent of more than 120 million filing cabinets filled with paper.2 When targeting illicit activity, law enforcement and intelligence agencies must not only grapple with a wide variety of new and unfamiliar data sources, but also need to make better use of the data they already collect. Without effective data analysis, law enforcement agencies will struggle to counter the criminal actors they are charged with targeting.
This kind of data analysis is no easy task for law enforcement agencies, whose resources are mostly dedicated to the core mission of keeping communities safe. Most law enforcement personnel lack the training in data science and digital research generally required to perform advanced analytics. To access data-driven insights, many police departments have turned for help to data science volunteers from local universities, and even to amateur genetics enthusiasts.3 Resource constraints, however, can limit an agency’s ability to bring in outside experts on a consistent basis, leaving investigators struggling daily to find relevant information even though data is now everywhere.
But new investigative tools and techniques can help law enforcement agencies overcome resource constraints and come to grips with massive digital data. Artificial intelligence (AI), open source data management tools, predictive analytics solutions, and social media exploitation capabilities are helping many investigators and operators make sense of mountains of data. New tools can help make previously unseen connections between information and identify where key information is lacking, while new sources can provide access to reams of data, and new partnerships can help exploit the data. But most of all, these practices can give investigators more of what they really need: time. The FBI Counterterrorism Division’s efforts to streamline its data systems have enabled a 98 percent reduction in manual work for analysts and a 70 percent cost reduction.4 Leveraging these tools and approaches can help investigators return to where they belong: not writing code or searching through databases, but rather tracking criminals and keeping our communities safe.
Captain Johnson and her team’s data problem is a common one. Historically, investigative analysis and targeting have often been guided by whatever information an agency happens to have access to. Just as every problem looks like a nail to someone holding a hammer, too often every investigation begins with whatever data is most readily available. But critical data sources may be missing, too hard to access, or too complicated to analyze, leaving blind spots in an investigation.
To simplify the investigator’s data problem, there are three fundamental questions every organization should answer: 1) What information do we have? 2) How do we make sense of it all? And 3) What key pieces of information are we missing?
Ironically, the first step in the solution to being overwhelmed with data is often to create an even larger pool of data. That is because many organizations have huge volumes of data but are unable to use it effectively, due to computing and integration challenges. Outdated and insufficient computing power and platforms hinder advanced analysis. Silos, both within and outside the organization, inhibit meaningful access to integrated data that may help an investigation. Breaking down these walls is typically a key first step. For one police department in the United Kingdom, that meant bringing together intelligence data sets, command and control systems, and operational data streams, to give officers a unified picture on a single screen.5
Then this larger, fuller pool of data can be sorted and managed by a small team of professionals. By bringing data sources together, many investigators can be supported by a team of data scientists and analysts that can help them make sense of the data that they do have through “exploratory data analysis,” a standard process by which data sets are measured for breadth, depth, and quality (explored later in this article). Important data sources include in-house data maintained by law enforcement and intelligence agencies; commercial data sources; and open sources such as social media activity, property records, criminal histories, professional licenses, medical databases, and a host of other sources.
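To make “exploratory data analysis” concrete, a first pass can be as simple as profiling each source for breadth (record counts and distinct values), depth (time coverage), and quality (missing values). Below is a minimal sketch in Python using the pandas library; the file name and column names, such as fill_date, are hypothetical stand-ins for a PDMP extract.

```python
# A minimal exploratory data analysis sketch using pandas.
# The file and column names below are hypothetical examples.
import pandas as pd

def profile_dataset(df: pd.DataFrame, name: str) -> pd.DataFrame:
    """Summarize the breadth, depth, and quality of one data source."""
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),                    # field type
        "non_null": df.notna().sum(),                      # usable values
        "pct_missing": (df.isna().mean() * 100).round(1),  # quality check
        "unique_values": df.nunique(),                     # field breadth
    })
    print(f"{name}: {len(df):,} records, {df.shape[1]} fields")
    return summary

pdmp = pd.read_csv("pdmp_extract.csv", parse_dates=["fill_date"])
print(profile_dataset(pdmp, "PDMP prescriptions"))
# Depth: how far back does the data actually go?
print("Coverage:", pdmp["fill_date"].min(), "to", pdmp["fill_date"].max())
```

Run against every candidate source, a profile like this quickly shows which data sets are usable as-is and which need cleanup before they can support an investigation.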
Once an organization fully understands the breadth of data it currently has access to, it can begin pondering the next question: What key pieces of information are we missing? Starting with a target- or problem-centric approach can help investigators create a holistic picture of persons, places, and objects of interest from a large pool of data. That holistic picture, in turn, can begin to show where key holes in the analysis lie. Perhaps there is no information on how two known associates communicate, or where a suspect receives his mail. Knowing these gaps can help guide future collection and surveillance to build the right case much faster.
Knowing there is a more efficient way to look at data, what should agencies do to achieve results? A few key steps can help agencies of all sizes come to grips with the data challenges presented by complex investigations.
Technologies are not just new, interesting tools that improve existing processes; they can open up entirely new ways of doing investigations. Today, many investigators spend most of their time trying to find the right data, leaving only a short window for analyzing it and piecing it all together. In the future, investigators should be able to rely on new data tools to find data rapidly, freeing the majority of their time for analysis. In 1970, a study of one medium-sized police department found that 50 percent of a patrol officer’s time was spent on tasks directly related to crimes (with the remaining half spent on administrative tasks).12 Some four decades later, a 2012 look at six different police forces found that this share had risen to 80 percent, a 60 percent relative increase.13
The massive impact from using these seemingly simple tools can be seen by revisiting the story from the introduction: Faced with the rash of overdose deaths in her community, Captain Johnson convinces her agency to develop a robust analytics capability. First, her agency assesses its existing data sources, performing an audit that includes storage, management, access, and utilization. Data analysts perform an exploratory analysis to understand their data quality and comprehensiveness. Realizing that they lack consistent access to critical data, including communications, prescriptions, and geotagged business data, the team compiles publicly available sources, data subscriptions, and social media data from nearby locations. Agency leaders then agree on a plan to ingest, process, and store data in the cloud, enabling instant access to the latest computing systems and eliminating data silos. They hire a couple of data scientists and provide training for interested officers to develop their own data skills.
Captain Johnson’s team had been collecting information on pharmacies believed to be illegally supplying local addicts with prescription opioids. Equipped with new capabilities for ingesting, processing, and storing bulk data sets, along with new personnel with advanced analytics skill sets, the team is able to establish a risk ranking of the prescribing practitioners and patients found in the pharmacy data and quickly cross-reference it against available drug overdose information. In doing so, they discover that a number of the overdose victims had received prescriptions and/or had opioids dispensed from some of the pharmacies in question.
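A minimal version of such a risk ranking might look like the sketch below, which combines prescription volume, average dosage, and linkage to overdose victims into a naive composite score. The input files, column names (such as mme, morphine milligram equivalents), and equal weighting are illustrative assumptions, not a validated scoring model.

```python
# A naive prescriber risk-ranking sketch; inputs and weights are hypothetical.
import pandas as pd

rx = pd.read_csv("prescriptions.csv")     # prescriber_id, patient_id, mme
od = pd.read_csv("overdose_victims.csv")  # patient_id

# Flag prescriptions written to patients who later overdosed.
rx["patient_overdosed"] = rx["patient_id"].isin(od["patient_id"])

# Aggregate three risk factors per prescriber.
risk = (rx.groupby("prescriber_id")
          .agg(scripts=("patient_id", "size"),          # prescription volume
               avg_mme=("mme", "mean"),                 # average dosage
               od_linked=("patient_overdosed", "sum"))  # overdose linkage
          .reset_index())

# Convert each factor to a percentile rank, then average into one score.
for col in ["scripts", "avg_mme", "od_linked"]:
    risk[col + "_rank"] = risk[col].rank(pct=True)
risk["risk_score"] = risk[
    ["scripts_rank", "avg_mme_rank", "od_linked_rank"]].mean(axis=1)

print(risk.sort_values("risk_score", ascending=False).head(10))
```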
Next, through the bulk ingestion and exploitation of open-source property record data, a capability the team did not have in the past, it finds that one of the high-volume prescribers appears to hold an unusually large number of financial assets. In addition, analytics professionals on the team determine that the husband of the practitioner in question is linked via social media to a known dealer involved in trafficking heroin and fentanyl, both illicit opioids.
Investigators then track available social media profiles of recent overdose victims and determine that many of the victims had connections with both the known drug trafficker and the practitioner’s husband. By taking the property addresses of both the practitioner’s husband and the drug trafficker and by cross-referencing them against commercially available foreign-based shipping records, investigators determine that a number of packages have been shipped to the properties from a known industrial park in China.14
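At its core, that cross-referencing step is a join on normalized addresses. The pandas sketch below illustrates the idea; the addresses, file name, and column names are invented for illustration.

```python
# Hypothetical join of suspect addresses against commercial shipping records.
import pandas as pd

suspects = pd.DataFrame({
    "person": ["practitioner_spouse", "trafficker"],
    "address": ["114 Oak St., Springfield", "9 Mill Rd, Springfield"],
})
shipments = pd.read_csv("shipping_records.csv")  # dest_address, origin, ship_date

def norm(addr: pd.Series) -> pd.Series:
    """Crude normalization so near-identical address strings match."""
    return (addr.str.upper()
                .str.replace(r"[^\w\s]", "", regex=True)  # drop punctuation
                .str.replace(r"\s+", " ", regex=True)     # collapse whitespace
                .str.strip())

suspects["key"] = norm(suspects["address"])
shipments["key"] = norm(shipments["dest_address"])

# Inner join: only shipments destined for a suspect address survive.
hits = suspects.merge(shipments, on="key")
print(hits[["person", "origin", "ship_date"]])
```

Real address matching usually needs fuzzier logic (geocoding, standardized abbreviations), but even a crude key like this can surface the first leads.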
Captain Johnson and her team then work with their representative on a local federal drug task force to connect to a limited-access database and ingest shareable information, which shows that the industrial park is home to an unregulated pharmaceutical facility. Another agency is working on spotting the shipments before they enter the United States, and a number of other local agencies are fighting a similar problem. Now, investigators have enough information to pursue warrants for enforcement activity.
With these new tools and methods, investigators are no longer overwhelmed and hoping for a break. They are now actively targeting a criminal network and getting to key targets faster. Better data sources, coupled with targeting expertise, have allowed Captain Johnson and her investigators to quickly obtain better results and more significant investigative outcomes.
We’ve seen how investigators across law enforcement and intelligence organizations use sophisticated computer programs to document linkages between people, places, and things. While this represents a leap forward from the analog era, it still requires painstaking data collection before an investigator can plot, analyze, and understand a network of actors.
What if this process could be partially automated, to enable analysis from day one? As with Captain Johnson’s department, setting systems up for an analytics capability starts with understanding the landscape of available data sources, from in-house data sources to social media activity. Then, the data sources have to be able to speak to each other. That means designing a system that organizes, formats, and stores data accessibly. For example, by formatting records so that names, dates, and locations are easily searchable, an investigator can reach across dozens of data sources to compile histories, profiles, and activities automatically.
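One way to picture such a design is a small common schema that every source is mapped into before indexing. The Python sketch below is illustrative only; the record shape and the source field names (owner, sale_date, parcel_address) are hypothetical.

```python
# An illustrative common-schema sketch: each source is mapped into one
# record shape so a single query can search names, dates, and locations.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class NormalizedRecord:
    source: str                     # e.g., "property_records", "pdmp"
    name: str                       # normalized "LAST, FIRST"
    event_date: Optional[datetime]  # when the event occurred, if known
    location: Optional[str]         # address or place, if known
    raw: dict                       # original record, kept for auditability

def normalize_name(raw_name: str) -> str:
    """'john Q. smith' and 'SMITH, JOHN Q.' both index identically."""
    raw_name = " ".join(raw_name.upper().split())
    if "," in raw_name:
        return raw_name
    parts = raw_name.split()
    return f"{parts[-1]}, {' '.join(parts[:-1])}"

def from_property_record(rec: dict) -> NormalizedRecord:
    # The field names here are hypothetical for one county's records.
    return NormalizedRecord(
        source="property_records",
        name=normalize_name(rec["owner"]),
        event_date=datetime.strptime(rec["sale_date"], "%Y-%m-%d"),
        location=rec["parcel_address"],
        raw=rec,
    )
```

With one such adapter per source, a single search over NormalizedRecord fields replaces a dozen source-specific queries.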
In figure 1, various data sources are compiled into a single, structured output. Inputs include unstructured data such as scans from documents; semistructured data such as websites and social media; and structured data such as property records, professional license records, and criminal records. The output is a single, multidimensional profile of an individual of interest, such as the person Captain Johnson suspects of overprescribing opioids.
But investigators rarely need just one person’s profile; instead, they often need to understand multiple forms of connections, from known relationships, to presence in the same geographies or businesses, to shared connections on social media. Using automated data analysis to find links among multiple structured output profiles, investigators can develop networks of actors and understand activities and behaviors across a group (see figure 1). For example, natural language processing can perform what is called “named entity recognition.” In layman’s terms, this means that AI can use contextual clues to tell the difference between Georgia the US state, Georgia the country, and Georgia the first name.15 For investigators, this can mean differentiating between a suspect and an innocent citizen with the same name.
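As one concrete illustration, the open-source spaCy library ships with pretrained models that perform named entity recognition out of the box. This is a minimal sketch; the sample sentence is invented, and the labels any model assigns depend on context and model quality.

```python
# Minimal named entity recognition with spaCy's small English model
# (installed separately via: python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Georgia Harris flew from Atlanta, Georgia, to Tbilisi, Georgia, last May.")

for ent in doc.ents:
    print(ent.text, ent.label_)
# Labels such as PERSON, GPE (geopolitical entity), and DATE let
# downstream code treat the person, the state, and the country differently.
```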
These tools all serve to increase the accuracy of investigations and to help police reach the right conclusions faster. Instead of spending weeks or months developing a detailed link chart, investigators can leverage both stored data and live data streams to create an advanced starting point for their analysis, getting to results more quickly. The Durham, North Carolina, police department attests to this increased speed and accuracy. The department used natural language processing to study incident reports and find patterns in criminal activity. With that information, it was able to identify areas with a high incidence of specific types of crime and deploy the right assets to the right areas ahead of time. The end result was a 39 percent drop in violent crime in Durham from 2007 to 2014.16
It’s not always easy to change the way an agency goes about conducting its investigations. But as criminals become increasingly sophisticated, law enforcement and intelligence professionals have to harness all available data or risk being left in the dark. To be sure, new solutions, tools, and methods will not take root overnight. They need continued support and may require new technical and targeting capabilities for investigators, operators, and analysts alike.
Organizations have a natural inertia to keep doing things in the same ways, with the same tools. The more changes that better technologies force upon an organization, the more likely they are to be resisted.17 However, a few concrete steps can help any agency get started on the journey to investigative transformation:
While these steps are just the beginning, they can offer a path to a new era of investigations, one where law enforcement is not overwhelmed by data, but can harness it for good.