Clustering unstructured information in BrainSpace
AI case 9/16 of applied artificial intelligence
In e-Discovery – the gathering of evidence from digital data – heaps of data have to be gathered and rendered understandable. But what exactly is a feasible way of searching through millions of documents within a short space of time? The e-Discovery team at Deloitte has spent three years working with BrainSpace, a smart tool that categorises data and renders it understandable with extraordinary speed and accuracy.
Millions of documents
BrainSpace is being used to assist in legal cases, as Anoeska Schipper, Data Analytics & e-Discovery manager at Deloitte Financial Advisory, explains: “If the authorities carry out a raid on the premises of one of our clients, we can be called to secure data for the purpose of preparing their defence. It usually concerns data on laptops, telephones and mail servers, for example.”
The e-Discovery team usually performs a full backup of the system. “The client and its lawyers want us to tell them as soon as possible everything that is saved on these systems,” says Schipper. “What can serve as evidence and what can be used to support their defence?”
The team has spent three years working with BrainSpace, a tool that uses machine learning and cluster analysis to search through unstructured data, such as emails, Word documents and PowerPoint presentations. Schipper explains how BrainSpace facilitates the search process: “BrainSpace shows what types of documents exist, and can make an initial selection based on our instructions. But it can also cluster data and provide a summary of them.” In email correspondence, for example, BrainSpace can show which topics are being discussed, by which individuals, and how those topics relate to one another.
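To give a feel for what clustering unstructured text means, here is a minimal sketch in Python. The article does not describe BrainSpace's actual algorithms, so this is not its method: the sketch swaps in a simple greedy single-pass clustering over word counts, and the function names, the similarity threshold and the four sample documents are all invented for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a document into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering (an illustrative stand-in only).

    Each document joins the first cluster whose seed document is at
    least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # list of (seed_vector, [document indices])
    for i, doc in enumerate(docs):
        vec = vectorize(doc)
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]


# Hypothetical sample documents, invented for this sketch.
docs = [
    "lease contract for the office building",
    "invoice payment overdue reminder",
    "renewal of the office lease contract",
    "second reminder: invoice payment overdue",
]
groups = cluster(docs)
print(groups)  # [[0, 2], [1, 3]]
```

The two lease documents end up in one group and the two payment reminders in another, purely because they share vocabulary. A production tool would work with far richer text representations than raw word counts.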
What is more, BrainSpace is self-learning: the tool gains new knowledge from each data set, and improves its ability to navigate each time. “If we indicate which documents are important to us, it recognises them automatically, and is really accurate at predicting which other documents are relevant by recognising patterns within text,” says Schipper. BrainSpace also assists with the presentation of relevant data. “For example, we can show our clients breakdowns in visually appealing formats, which make it clear at a glance what we have found.”
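The idea of marking some documents as important and letting the tool predict which others are relevant can be sketched as a small text classifier. The example below assumes a naive Bayes model with add-one smoothing, which is a common textbook choice and not something the article attributes to BrainSpace. The function names and all example documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lower-case a document and split it into words."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled):
    """Count words per class from (text, is_relevant) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, relevant in labelled:
        word_counts[relevant].update(tokens(text))
        doc_counts[relevant] += 1
    return word_counts, doc_counts


def relevance_score(text, model):
    """Log-odds that a document is relevant (naive Bayes, add-one smoothing).

    Assumes the reviewer has marked at least one relevant and one
    non-relevant document. Words never seen in training are ignored.
    """
    word_counts, doc_counts = model
    vocab = set(word_counts[True]) | set(word_counts[False])
    score = math.log(doc_counts[True] / doc_counts[False])
    for label, sign in ((True, 1), (False, -1)):
        total = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens(text):
            if tok in vocab:
                score += sign * math.log((word_counts[label][tok] + 1) / total)
    return score


# Hypothetical reviewer decisions, invented for this sketch.
labelled = [
    ("price agreement with competitor", True),
    ("discuss pricing before the tender", True),
    ("lunch menu for friday", False),
    ("office party next friday", False),
]
model = train(labelled)

# Documents nobody has reviewed yet, ranked by predicted relevance.
unreviewed = ["tender price for the new contract", "friday lunch at the office"]
ranked = sorted(unreviewed, key=lambda d: relevance_score(d, model), reverse=True)
print(ranked[0])  # tender price for the new contract
```

A positive score means the document looks more like the ones marked relevant, a negative score the opposite. Every further reviewer decision can be fed back into `train`, which is one simple reading of what "self-learning" could mean here.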
Faster and more effective
BrainSpace categorises relevant data not only far more quickly, but also far more effectively than people can, asserts Schipper. “Some scientific studies have pitted a human review against a machine learning tool, and they showed that machine learning generates far better results.” This does not mean, however, that BrainSpace works without supervision. A random sample is taken from every review carried out by the tool in order to check how well machine learning is working on the entered data set.
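The sample check described above can be sketched as follows: draw a random sample of the documents the tool has coded, have a person review them, and estimate how often the two agree. This is a sketch under stated assumptions. The article does not say how Deloitte sizes or scores its samples, so the sample size, the normal-approximation interval, the function names and the simulated labels are all illustrative.

```python
import math
import random


def sample_for_review(doc_ids, sample_size, seed=0):
    """Draw a reproducible simple random sample of document ids."""
    rng = random.Random(seed)
    return rng.sample(doc_ids, sample_size)


def agreement_estimate(machine_labels, human_labels, z=1.96):
    """Share of sampled documents where the reviewer agrees with the tool.

    Returns (rate, lower, upper) using a normal-approximation interval;
    z=1.96 corresponds to roughly 95% confidence.
    """
    n = len(human_labels)
    agree = sum(machine_labels[d] == human_labels[d] for d in human_labels)
    p = agree / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Simulated data, invented for this sketch: 1,000 machine-coded documents.
doc_ids = list(range(1000))
machine_labels = {d: d % 4 == 0 for d in doc_ids}

# A reviewer checks 100 randomly chosen documents; here we pretend the
# reviewer disagrees with the tool on the first 10 of them.
sample = sample_for_review(doc_ids, 100)
human_labels = {d: machine_labels[d] for d in sample}
for d in sample[:10]:
    human_labels[d] = not machine_labels[d]

rate, low, high = agreement_estimate(machine_labels, human_labels)
print(f"agreement {rate:.2f} (approx. 95% interval {low:.2f} to {high:.2f})")
```

A low agreement rate, or a wide interval, would be a signal to review more documents or to retrain before relying on the tool's coding.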
BrainSpace is currently used only for e-Discovery, but it is also suitable for broader application. “We intend to be involved in an increasing number of cross-functional tasks. BrainSpace can also make a significant difference in contract analysis, such as when searching for clauses and categorising employment and lease contracts,” says Schipper.
*) This case is part of a series of 16 Artificial Intelligence projects from Deloitte. The cases in the series, in random order:
- TAX-I: A virtual legal research assistant
- AI Benchmark
- SONAR: Find labelling errors in databases
- Transaction detector with regard to the Dutch work cost regulations
- GRAPA: assistance with risk strategies
- Chatbot as a handy search tool for the online technical library
- Argus: an eye for detail
- PostNL: optimising delivery times
- Virtual assistants: beyond the hype
- HR agent Edgy: the future of Human Resources
- Using machine learning to assess risks for insurance policies
- Predicting payment behaviour
- DocQMiner: contract analysis performed in no time at all
- Combating welfare fraud with machine learning
- Using machine learning and network analytics to search for a needle in a haystack
- Clustering unstructured information in BrainSpace