Clustering unstructured information in BrainSpace

Case study

AI case 9/16 of applied artificial intelligence

In e-Discovery – the gathering of evidence from digital data – heaps of data have to be gathered and rendered understandable. But what exactly is a feasible way of searching through millions of documents within a short space of time? The e-Discovery team at Deloitte has spent three years working with BrainSpace, a smart tool that categorises data and renders it understandable with extraordinary speed and accuracy.

Millions of documents

BrainSpace is being used to assist in legal cases, as Anoeska Schipper, Data Analytics & e-Discovery manager at Deloitte Financial Advisory, explains: “If the authorities carry out a raid on the premises of one of our clients, we can be called to secure data for the purpose of preparing their defence. It usually concerns data on things like laptops, telephones and mail servers, for example.”

The e-Discovery team usually performs a full backup of the system. “The client and its lawyers want us to tell them as soon as possible everything that is saved on these systems,” says Schipper. “What can serve as evidence and what can be used to support their defence?”

The team has spent three years working with BrainSpace, a tool that uses machine learning and cluster analysis to search through unstructured data, such as emails, Word documents and PowerPoint presentations. Schipper explains how BrainSpace facilitates the search process: “BrainSpace shows what types of documents exist, and can make an initial selection based on our instructions. But it can also cluster data and provide a summary of them.” Among other things, BrainSpace can show which topics are being discussed, by whom, and how those topics relate to one another, for example within the email correspondence it has found.
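To give a feel for what clustering unstructured text involves, here is a minimal sketch in Python. It is not how BrainSpace works internally, which this article does not describe. It assumes a simple bag-of-words representation, cosine similarity and a greedy "leader" clustering rule; the function names, the stop-word list, the threshold and the sample documents are all made up for illustration.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (an assumption, far from complete).
STOPWORDS = {"the", "a", "of", "to", "and", "for", "on", "in", "with"}

def vectorise(text):
    """Turn a text into a bag-of-words count vector."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(docs, threshold=0.3):
    """Greedy leader clustering: a document joins the first cluster whose
    first member is similar enough, otherwise it starts a new cluster."""
    vectors = [vectorise(d) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def summarise(docs, members, top_n=3):
    """Most frequent terms in a cluster, as a crude summary."""
    total = Counter()
    for i in members:
        total.update(vectorise(docs[i]))
    return [word for word, _ in total.most_common(top_n)]

docs = [
    "Invoice for the lease contract of the office",
    "Updated lease contract and invoice attached",
    "Lunch on Friday with the project team",
    "Team lunch moved to Friday noon",
]
clusters = cluster(docs)
for members in clusters:
    print(members, summarise(docs, members))
```

On these four sample texts the two contract emails end up in one cluster and the two lunch emails in another. A production tool would need a far richer text representation and a more robust clustering method than this.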

What is more, BrainSpace is self-learning: the tool gains new knowledge from each data set, and improves its ability to navigate each time. “If we indicate which documents are important to us, it recognises them automatically, and is really accurate at predicting which other documents are relevant by recognising patterns within text,” says Schipper. BrainSpace also assists with the presentation of relevant data. “For example, we can show our clients breakdowns in visually appealing formats, which make it clear to see at a glance what we have found.”
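The idea of learning from documents that reviewers have marked as important can be sketched as follows. This is an illustration only, assuming a very small word-count model with add-one smoothing (a naive Bayes style log-likelihood ratio that ignores class priors); the article does not say which model BrainSpace uses, and all names and sample texts below are hypothetical.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lower-case word tokens of a text."""
    return re.findall(r"[a-z]+", text.lower())

def train(labelled):
    """Count word occurrences per label from (text, is_relevant) pairs."""
    counts = {True: Counter(), False: Counter()}
    for text, is_relevant in labelled:
        counts[is_relevant].update(tokens(text))
    return counts

def relevance_score(counts, text):
    """Sum of per-word log-likelihood ratios with add-one smoothing.
    A score above zero leans towards 'relevant'. Unknown words are skipped."""
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) + len(vocab) for label, c in counts.items()}
    score = 0.0
    for word in tokens(text):
        if word not in vocab:
            continue
        p_relevant = (counts[True][word] + 1) / totals[True]
        p_irrelevant = (counts[False][word] + 1) / totals[False]
        score += math.log(p_relevant / p_irrelevant)
    return score

# Documents a reviewer has already judged (made-up examples).
labelled = [
    ("price agreement with competitor", True),
    ("confidential pricing call with competitor", True),
    ("canteen menu this week", False),
    ("parking garage closed this week", False),
]
counts = train(labelled)

# Unreviewed documents, ranked so the most likely relevant ones come first.
unreviewed = ["new canteen menu", "notes from the competitor pricing call"]
ranked = sorted(unreviewed, key=lambda t: relevance_score(counts, t), reverse=True)
print(ranked)
```

Each time reviewers label more documents, `train` can be run again on the larger set, which is one simple way to picture a tool that improves with every review round.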

Faster and more effective

BrainSpace categorises relevant data not only far more quickly, but also far more effectively than people can, asserts Schipper. “Some scientific studies have pitted a human review against a machine learning tool, and they showed that machine learning generates far better results.” This does not mean, however, that BrainSpace works without supervision. A random sample is taken from every review carried out by the tool in order to check how well machine learning is working on the entered data set.
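The random sample check described above could look roughly like this. It is a sketch under stated assumptions: the tool's decisions are available as a mapping from document ID to a relevant/not relevant label, and a human reviewer labels the sampled documents independently. The function names, the sample size and the simulated labels are hypothetical.

```python
import random

def sample_for_review(doc_ids, sample_size, seed=None):
    """Draw a random sample of document IDs without replacement."""
    rng = random.Random(seed)
    return rng.sample(doc_ids, min(sample_size, len(doc_ids)))

def agreement_rate(tool_labels, human_labels):
    """Fraction of sampled documents on which the human agrees with the tool."""
    agreed = sum(
        1 for doc_id, label in human_labels.items()
        if tool_labels[doc_id] == label
    )
    return agreed / len(human_labels)

# Simulated tool output for 1,000 documents (made-up data).
tool_labels = {doc_id: doc_id % 3 == 0 for doc_id in range(1000)}

# Pick 50 documents for a human to re-review.
sample = sample_for_review(list(tool_labels), 50, seed=42)

# Simulated human decisions: here the reviewer disagrees on IDs divisible by 10.
human_labels = {
    doc_id: (not tool_labels[doc_id]) if doc_id % 10 == 0 else tool_labels[doc_id]
    for doc_id in sample
}

print(f"Agreement on sample: {agreement_rate(tool_labels, human_labels):.0%}")
```

A low agreement rate on the sample would be a signal to give the tool more labelled examples before relying on its review of the full data set.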

BrainSpace is currently only being used for e-Discovery, but it is also suitable for broader application. “We intend to be involved in an increasing number of cross-functional tasks. BrainSpace can also make a significant difference in contract analysis, such as when searching for clauses and categorising employment and lease contracts,” says Schipper.

*) This case is part of the series of 16 Artificial Intelligence projects from Deloitte. The cases in the series are, in no particular order:

  1. TAX-I: A virtual legal research assistant
  2. AI Benchmark 
  3. SONAR: Find labelling errors in databases
  4. Transaction detector with regard to the Dutch work cost regulations
  5. GRAPA: assistance with risk strategies
  6. Chatbot as a handy search tool for the online technical library
  7. Argus: an eye for detail
  8. PostNL: optimising delivery times
  9. Virtual assistants: beyond the hype
  10. HR agent Edgy: the future of Human Resources
  11. Using machine learning to assess risks for insurance policies
  12. Predicting payment behaviour
  13. DocQMiner: contract analysis performed in no time at all
  14. Combating welfare fraud with machine learning
  15. Using machine learning and network analytics to search for a needle in a haystack
  16. Clustering unstructured information in BrainSpace
