Entity recognition: How electronic discovery can benefit

Making sense of today’s discovery landscape

As discovery teams face the challenge of scouring big data, entity recognition is emerging as a technology that can help electronic discovery professionals and review teams conduct quicker, more focused reviews.

A challenge

Today’s electronic discovery is characterized by high volumes of unstructured data. At the outset of most cases, discovery teams can be overwhelmed by content as they assess the data and determine what to look for and where to begin. Using traditional keyword searches to find a specific phrase or document within terabytes of emails, spreadsheets, and other files can be daunting. Add the increasing pressure to shorten the review phase and reduce overall discovery costs, and discovery professionals face steep hurdles just to keep up. This is where entity recognition can help.

What is entity recognition?

Entity recognition is a new technology that can help investigators and document review managers get to the pertinent data faster. In this context, "entities" are components of text that are assigned to pre-determined concepts such as places, people, organizations, and products. Entity recognition uses machine learning methods to identify and extract these components of text and place them into metadata fields.

How does entity recognition work?

Entity recognition combines statistical modeling, neural networks, and regular expression pattern matching to extract and classify each entity. Models are trained to find entities based on concepts, while rules expressed as regular expressions find entities that follow patterns, such as dates, email addresses, and social security numbers. Many string patterns come built in from pre-defined databases, and they can be customized by supplying custom entity lists and by adding or editing specific rules. For each entity extracted, the system selects whichever approach produces the best results.

Some potential benefits of entity recognition

Reduced review time. When the entities and the relationships between them are known, the time needed to comb through unstructured data, determine the scope, and analyze the results can be greatly condensed. Being able to quickly target relevant documents or structure a review around recognized concepts can eliminate hours of unproductive searching and review.

More focused discovery. When looking for relevant information, identifying the specific entities involved can produce more precise, targeted searches. Pinpointing entities can also support traditional methods by revealing the leading keywords or concepts to search for.

Reduced risk. Entity recognition can help efficiently identify documents that contain privileged data or personally identifiable information (PII), including social security and credit card numbers. Classifying these entities can reduce privacy risk and help organize privilege reviews.
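
As a rough illustration of how PII flagging of this kind might work, the Python sketch below marks documents that appear to contain a social security number or a credit card number. It rests on assumptions not stated in the article: the patterns cover only common US-style formats, candidate card numbers are filtered with the Luhn checksum (a standard check for payment card numbers, used here to cut false positives), and the function names and sample documents are hypothetical.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def pii_flags(text):
    """Return which PII categories appear to be present in the text."""
    has_ssn = bool(SSN_RE.search(text))
    has_card = any(luhn_valid(m.group()) for m in CARD_RE.finditer(text))
    return {"ssn": has_ssn, "credit_card": has_card}


# Hypothetical sample documents; 4111 1111 1111 1111 is a well-known test number.
docs = {
    "memo.txt": "Employee SSN 123-45-6789 is on file.",
    "invoice.txt": "Paid with card 4111 1111 1111 1111.",
    "note.txt": "Lunch at noon?",
}
flagged = {name: pii_flags(text) for name, text in docs.items()}
for name, flags in flagged.items():
    print(name, flags)
```

Flags like these could be stored as metadata so that documents with likely PII are routed to a separate review queue. A pattern match is only a signal, so flagged documents still need human confirmation.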
