Using machine learning and network analytics to search for a needle in a haystack

A corporate client contacted Deloitte with a serious problem: the company had been charged with bribing public officials. It needed to find out quickly what exactly had happened in order to prepare its defence. Could Deloitte assist in determining whether unacceptable transactions had indeed taken place?

“The period to be investigated was twelve years,” explains Christian Cnossen, Manager Financial Crime Analytics at Deloitte. “It meant that we had to assess over 80 million payments and hundreds of millions of emails and internal documents. It was like looking for a needle in a haystack.” To set up an effective investigation, Deloitte developed a method that combined machine learning and network analytics with clever human detective work.

One hundred and fifty names and entities

The first stage consisted of classifying internal documents and emails in order to retrieve relevant information. To do this, Wesley van Saane of Deloitte’s Forensic Discovery team used a machine learning module from the Relativity software suite and the categorisation and visualisation tool, Brainspace. Lawyers checked through a few hundred documents manually to indicate whether or not they were relevant to the investigations, and these were used to train the system. “We carried out a few of these iterations, which gradually made the results more accurate,” explains van Saane. It meant that, ultimately, the lawyers actually had to study only a fraction of the millions of documents.
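The iterative loop described above (lawyers label a small batch, the system retrains, the next batch is proposed) can be illustrated with a minimal sketch. It does not reproduce how Relativity or Brainspace work internally; the word-weight scorer, the function names and the `reviewer` callback standing in for the lawyers are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Very crude tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def train(labelled):
    """Learn per-word log-odds weights from (text, is_relevant) pairs."""
    relevant, irrelevant = Counter(), Counter()
    for text, is_relevant in labelled:
        (relevant if is_relevant else irrelevant).update(set(tokenize(text)))
    n_rel = sum(1 for _, is_relevant in labelled if is_relevant)
    n_irr = len(labelled) - n_rel
    vocab = set(relevant) | set(irrelevant)
    # Laplace smoothing keeps weights finite for words seen in one class only.
    return {
        w: math.log((relevant[w] + 1) / (n_rel + 2))
        - math.log((irrelevant[w] + 1) / (n_irr + 2))
        for w in vocab
    }


def score(weights, text):
    """Higher score means the document looks more like the relevant examples."""
    return sum(weights.get(w, 0.0) for w in set(tokenize(text)))


def review_loop(documents, seed_labels, reviewer, rounds=3, batch_size=2):
    """Alternate between retraining and asking the reviewer about top-ranked documents.

    seed_labels maps a document index to True/False; reviewer is a hypothetical
    callback that plays the role of the lawyer checking a document by hand.
    """
    labels = dict(seed_labels)
    for _ in range(rounds):
        weights = train([(documents[i], r) for i, r in labels.items()])
        unlabelled = [i for i in range(len(documents)) if i not in labels]
        if not unlabelled:
            break
        unlabelled.sort(key=lambda i: score(weights, documents[i]), reverse=True)
        for i in unlabelled[:batch_size]:
            labels[i] = reviewer(documents[i])
    # Final retraining so the returned weights reflect every label collected.
    weights = train([(documents[i], r) for i, r in labels.items()])
    return weights, labels
```

In this sketch each round shows the reviewer the documents the current model ranks highest, which is one common way to prioritise manual review; other selection strategies are possible.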

This investigation surfaced around fifty names of individuals and entities that were potentially implicated in the alleged corruption. Cnossen’s team then built a custom-made tool to perform a network analysis of the relationships between these individuals and entities. Searches of publicly accessible online sources, such as the Paradise Papers and OpenCorporates, revealed around one hundred new names that were potentially implicated in the case, such as directors who did not appear in correspondence, or unnamed subholdings.
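The idea of expanding a seed list of names through their relationships can be sketched as a breadth-first search over a graph. The custom-made tool itself is not described in this article, so the code below is only an illustration: the edge list, the function names and the two-hop limit are assumptions.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected graph from (name_a, name_b) relationship pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def expand(graph, seeds, max_hops=2):
    """Return {name: hops} for names within max_hops of any seed, seeds excluded."""
    distance = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not look beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return {n: d for n, d in distance.items() if n not in seeds}
```

Given hypothetical edges such as a person linked to a company and that company linked to a director, `expand` returns the director as a new name two hops from the seed, which mirrors how directors absent from the correspondence could surface.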

The next stage was to search through the client’s accounts for transactions involving these 150 individuals and entities. The payments found were then checked against the suspicious emails and documents for any links between them. Ultimately, the team were able to identify around thirty payments that were related to the accusations.
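A minimal sketch of this matching step follows. The record layout (`id`, `counterparty`, `date`, `text`), the exact-match rule on normalised names and the seven-day window are all assumptions made for illustration; the article does not say how the actual matching was done, and real name matching would need to be considerably more tolerant.

```python
from datetime import date, timedelta


def normalise(name):
    """Lowercase, replace punctuation with spaces and collapse whitespace."""
    cleaned = "".join(c if c.isalnum() else " " for c in name.lower())
    return " ".join(cleaned.split())


def match_payments(payments, watchlist):
    """Keep payments whose counterparty exactly matches a normalised watchlist name."""
    wanted = {normalise(n) for n in watchlist}
    return [p for p in payments if normalise(p["counterparty"]) in wanted]


def link_to_documents(payments, documents, window_days=7):
    """Pair each payment with documents that mention the counterparty near the payment date."""
    window = timedelta(days=window_days)
    links = []
    for p in payments:
        key = normalise(p["counterparty"])
        for d in documents:
            # Crude substring test on normalised text; an assumption, not a robust matcher.
            if key in normalise(d["text"]) and abs(d["date"] - p["date"]) <= window:
                links.append((p["id"], d["id"]))
    return links
```

Filtering on the watchlist first keeps the more expensive payment-to-document comparison limited to a small set of candidate payments.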


It was, of course, unfortunate for the company that bribery had taken place. However, the company was pleased that incidents could be traced quickly in order to produce a swift and cogent response to the allegations. This enabled the matter to be settled relatively quickly with the relevant supervisory authorities. The investigation also shed light on the business processes that had allowed the undesirable payments to take place, and the company was able to take action accordingly.

Machine learning and advanced analytics can be of considerable assistance in carrying out an investigation involving large quantities of data. As far as potential legal proceedings are concerned, however, it is important that all parties concerned understand the technology fully and trust it, cautions Cnossen. “The lawyer must be able to explain the process, and the court or the Public Prosecutions Service must consider the approach acceptable. Thanks to our method of working, we were able to provide detailed statistical substantiation, and the parties concerned were convinced of the outcome.”

*) This case is part of a series of 16 Artificial Intelligence projects from Deloitte. The cases in the series, in no particular order, are:

  1. TAX-I: A virtual legal research assistant
  2. AI Benchmark 
  3. SONAR: Find labelling errors in databases
  4. Transaction detector with regard to the Dutch work cost regulations
  5. GRAPA: assistance with risk strategies
  6. Chatbot as a handy search tool for the online technical library
  7. Argus: an eye for detail
  8. PostNL: optimising delivery times
  9. Virtual assistants: beyond the hype
  10. HR agent Edgy: the future of Human Resources
  11. Using machine learning to assess risks for insurance policies
  12. Predicting payment behaviour
  13. DocQMiner: contract analysis performed in no time at all
  14. Combating welfare fraud with machine learning
  15. Using machine learning and network analytics to search for a needle in a haystack
  16. Clustering unstructured information in Brainspace
