Discovery – Top 5 questions to improve your discovery process

Forensic Focus November 2015

In a complex business dispute, litigation or regulatory investigation, all organisations face one certainty – the discovery process. Whether you are a litigation lawyer at a law firm or in-house counsel, responding to a discovery request or regulator’s notice involves a myriad of complex requirements for proper data collection, processing, hosting, review and production.

We officially marked the launch of Relativity in New Zealand by hosting Discovery workshops across the country during September and October 2015. The workshops generated plenty of interesting thoughts and questions. In this article we set out our five favourite questions from those workshops. The answers may help the next time you or your clients need to respond to a discovery order or regulatory notice, or gather documents to support an investigation:

  • Do I need to collect all of the documents?
  • Why do I have duplicate documents in my review set?
  • How can I conduct the review easily and efficiently when multiple parties are involved?
  • What can assist the reviewers to conduct the review more efficiently?
  • How can I ensure that the reviewers are reviewing consistently and/or productively?

Before setting out our answers, we briefly introduce Relativity and give an overview of the discovery process to put the questions in context.

What is Relativity?

Relativity is an advanced web-based (we host it in a Deloitte data centre) document review platform that is purpose built for investigation and litigation purposes. It offers three advantages over traditional review platforms:

  • Web based: being web based makes it easy for multiple parties (think client, external law firm, barrister and experts) to review the documents simultaneously, providing “one source of the truth”;
  • User friendly: it is extremely user friendly, yet has serious analytical firepower “under the hood”, including “concept clustering”, which groups conceptually similar documents together;
  • Efficient: it makes the document review process considerably more efficient and robust. For example, Relativity “threads” e-mail chains together, enabling a reviewer to review related e-mails in one go, which is vastly more efficient and reliable than traditional review methods. Relativity also offers the ability to batch documents to reviewers and build customised workflows, giving the review manager greater control and confidence over the review process.

A brief overview of the Discovery process

To help address these questions, it is useful first to summarise the main stages of the Discovery process (leaving aside production and exchange).

  • Identification & Collection:  The Identification & Collection phase is where you and/or your client will consider what document sources you have and whether or not these should be collected for the purposes of the Discovery Order, Regulatory Notice or investigation; 
  • Processing: During the Processing phase, all collected documents are processed.  This includes: 

Electronic documents

  • Indexing the documents into a searchable database;
  • Extracting the metadata of each document (e.g. author, date, time, recipient and sender);
  • Identifying and removing certain file types such as system files (i.e. those generated by the computer rather than by a person) and corrupt files;
  • Removing duplicate documents; 
  • Converting scanned PDF documents and other image files into text searchable format using Optical Character Recognition (“OCR”) technology.  This is necessary because the content of these documents is otherwise not searchable;
  • Identifying password-protected / encrypted documents.  Depending on the circumstances, it might be necessary to attempt to decrypt the documents.  Unless this occurs, the content of these documents can be neither searched nor reviewed.

Hardcopy documents

  • Hardcopy documents are scanned and converted into text searchable format using OCR technology;
  • The hardcopy documents are “objectively coded” (i.e. details such as date, document type and author are recorded).

  • Filtering:  In many cases, it will be necessary and/or preferable to apply filters to target the document review (given the volumes, filtering will occur in almost all cases involving electronic documents).  Common filters include date ranges and keywords;
  • Review: The Review phase is where the reviewers review and “code” the documents (both electronic and hardcopy) in the review platform. The needs of each review project and reviewer can vary significantly from project to project.  For example, litigation lawyers are often focussed on issues such as relevance and privilege during the review process, whereas expert witnesses will be focussed on understanding how the evidence available will shape their opinion.
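As a rough illustration of the Filtering stage, the sketch below applies a date range and a keyword list to a handful of made-up document records. The field names, the sample documents and the `passes_filters` helper are assumptions for illustration only; a review platform would run equivalent searches against its own index.

```python
from datetime import date

# Hypothetical document records; a real platform would draw these from its index.
documents = [
    {"id": "DOC-1", "date": date(2014, 3, 5), "text": "Draft supply agreement for review"},
    {"id": "DOC-2", "date": date(2012, 7, 19), "text": "Supply agreement signed copy"},
    {"id": "DOC-3", "date": date(2014, 6, 1), "text": "Staff lunch roster"},
]


def passes_filters(doc, start, end, keywords):
    """Return True if the document is dated within [start, end] and mentions any keyword."""
    in_range = start <= doc["date"] <= end
    text = doc["text"].lower()
    has_keyword = any(keyword.lower() in text for keyword in keywords)
    return in_range and has_keyword


# Keep only 2014 documents that mention "agreement" or "contract".
review_set = [
    doc["id"]
    for doc in documents
    if passes_filters(doc, date(2014, 1, 1), date(2014, 12, 31), ["agreement", "contract"])
]
print(review_set)
```

Here DOC-2 falls outside the date range and DOC-3 contains no keyword, so only DOC-1 reaches the review set.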

Do I need to collect all of the documents?

In most cases: no, you will not need to collect all of the documents.

Most organisations, from sole traders to large companies and public sector entities, will typically have documents in many, if not all, of the following locations: servers, desktop computers, laptops, removable media (e.g. USB drives and removable hard drives), mobile devices, cloud, instant messaging (e.g. Skype for Business), audio files, social media, websites, databases, document management systems, archive media such as back-up tapes, as well as traditional hard-copy documents.

In practice, the cost of collecting, processing and reviewing all of these potential sources will be prohibitive in many cases.  The High Court Amendment Rules (No 2) 2011 (“the High Court Rules”) helpfully enable parties involved in discovery to consider proportionality.  High Court Rule 8.14 requires parties to make a reasonable search for documents. What is reasonable depends on the circumstances and includes factors such as the number of documents involved, the ease and cost of retrieving a document and the need for discovery to be proportionate to the subject matter of the proceeding.

There is a balance here between risk and cost.  Collecting more documents will reduce the risk that relevant documents are missed; however, it will often increase the cost and time needed to review them.  Accordingly, we recommend a two-stage process:

  • Document all of the potential sources of documents; and
  • Document which document sources will/will not be collected, along with the reasons why documents were not collected.  This is especially important since you could be questioned about non-collection several months, if not years, later.  Carefully documenting the decisions made should position you to more credibly resist a future challenge from the counterparty about why certain documents were not collected.

Your legal and technical discovery advisors should be able to assist you with this process.    

Why do I have duplicate documents in my review set?

A common question that was raised at the workshops and an issue that causes a lot of “reviewer pain” is duplicate documents.

As detailed above, some duplicate documents are automatically removed in the processing phase. De-duplication in the processing phase only occurs:

  • At “parent” level.  We return to this below; and
  • Where the duplicates are exactly the same (we use the industry standard MD5 hash algorithm to identify duplicates).  A difference of as little as one character means the documents are no longer identical, so they will not be automatically de-duplicated.
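To make the "exactly the same" point concrete, here is a minimal sketch using Python's standard `hashlib` module. The sample file note text is invented, and hashing the raw bytes is an assumption made for illustration; processing engines may hash emails on selected fields instead of the whole file.

```python
import hashlib


def md5_of(content: bytes) -> str:
    """Return the MD5 digest of the content as a hexadecimal string."""
    return hashlib.md5(content).hexdigest()


# Invented sample content for illustration.
original = b"File note: meeting with supplier on 3 March."
exact_copy = b"File note: meeting with supplier on 3 March."
one_char_off = b"File note: meeting with supplier on 4 March."

# Identical content gives an identical hash, so the copy can be de-duplicated.
print(md5_of(original) == md5_of(exact_copy))

# A single changed character gives a different hash, so both documents are kept.
print(md5_of(original) == md5_of(one_char_off))
```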


Some duplicate documents must be retained in order to comply with the High Court Rules. The High Court Rules require that de-duplication occurs at the parent level. Therefore, consider this scenario:

  • A file note is typed in Word and saved to your desktop computer (Document B);
  • The same file note (Document A1) is also attached to an email (Document A); and
  • The same file note (Document C) is also saved in a document management system.

Because de-duplication must occur at the parent level, in this scenario only Document C (or B) can be removed while still complying with the High Court Rules. The rationale is that if Document A1 were also removed, the review of email A would likely be incomplete: the email would refer to an attachment that is no longer attached.
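The scenario above can be sketched in a few lines. This is a simplified illustration, not how any particular processing engine works: it hashes only the content of each top-level item, whereas real tools commonly hash the whole family (the email together with its attachments). The record layout and the `dedupe_at_parent_level` helper are assumptions.

```python
import hashlib

# The scenario above: email A carries attachment A1; B and C are standalone copies.
items = [
    {"id": "A", "parent": None, "content": b"Email: please see the attached file note."},
    {"id": "A1", "parent": "A", "content": b"File note text"},
    {"id": "B", "parent": None, "content": b"File note text"},
    {"id": "C", "parent": None, "content": b"File note text"},
]


def dedupe_at_parent_level(items):
    """Remove exact duplicates among top-level items only; attachments follow their parent."""
    seen_hashes = set()
    kept = []
    for item in items:  # assumes each parent is listed before its attachments
        if item["parent"] is None:
            digest = hashlib.md5(item["content"]).hexdigest()
            if digest in seen_hashes:
                continue  # exact duplicate of a top-level document already kept
            seen_hashes.add(digest)
            kept.append(item["id"])
        elif item["parent"] in kept:
            kept.append(item["id"])  # an attachment is never removed on its own
    return kept


print(dedupe_at_parent_level(items))
```

Document C is dropped as a duplicate of B, while A1 stays with email A even though its content matches B.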

Additionally, Document B might have been printed and scanned into PDF format. The PDF version of Document B would not be automatically de-duplicated in the processing phase because it is a different file format and will not have the same MD5 hash as the Word version. Therefore, you might have a further ‘duplicate’ document in your review set.  We return to this issue below when we discuss “near duplicates”.
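The sketch below illustrates why a scanned copy escapes exact de-duplication yet can still be recognised as similar. It compares extracted text with `difflib.SequenceMatcher` from the Python standard library. The sample text, the simulated OCR misread and the 0.9 threshold are all assumptions, and this is not a description of the algorithm Relativity itself uses.

```python
import hashlib
from difflib import SequenceMatcher

# Invented sample text; the "scanned" version contains a simulated OCR misread ("de1ays").
word_text = "File note: met the supplier on 3 March to discuss the delivery delays."
scanned_pdf_text = "File note: met the supplier on 3 March to discuss the delivery de1ays."
unrelated_text = "Agenda for the quarterly health and safety committee meeting."


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two pieces of text."""
    return SequenceMatcher(None, a, b).ratio()


def is_near_duplicate(a: str, b: str, threshold: float = 0.9) -> bool:
    """Treat two texts as near duplicates when their similarity meets the threshold."""
    return similarity(a, b) >= threshold


# The hashes differ, so exact de-duplication keeps both versions...
hashes_match = (
    hashlib.md5(word_text.encode()).hexdigest()
    == hashlib.md5(scanned_pdf_text.encode()).hexdigest()
)
print(hashes_match)

# ...but a text comparison can still group them, while leaving unrelated documents apart.
print(is_near_duplicate(word_text, scanned_pdf_text))
print(is_near_duplicate(word_text, unrelated_text))
```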


How can I conduct the review easily and efficiently when multiple parties are involved?

One area of “reviewer pain” expressed in the workshops arises where multiple reviewers are in different locations. For example, many investigations, disputes and responses to regulatory matters require multiple parties to review the documents, including the client, law firm, barrister and expert witnesses.

Solutions in the past included running multiple databases (with the drawback of the databases getting out of sync, so that documents and/or comments on one database were not shared with the others) or giving multiple parties access at a single physical location (creating logistical and often security issues).

The solution for this scenario is for reviewers to use a secure, web based review platform such as Relativity. This way all reviewers can log into the same database, wherever they are located. The reviewers will all be using and updating the same database in real time, resulting in “one source of the truth”.


What can assist the reviewers to conduct the review more efficiently?

There are two key factors determining the overall level of effort required in the review stage:

  • The number of documents;
  • The efficiency of the review teams.

We have found that there has been considerable focus in New Zealand on limiting the number of documents for review by applying various filters.  However, our experience is that until recently there has been less focus in New Zealand on lifting the efficiency of the review team.  In our experience, Relativity enables the review process to be considerably more efficient because of:

  • The ease with which highlights, redactions and comments can be applied to documents; and more importantly
  • The functionality available to group related documents together, which makes the review process both more efficient and more robust because it minimises the “mental gymnastics” that a reviewer would otherwise have to carry out, jumping from topic to topic with each new document they review.

Two examples of how this grouping functionality makes the review more efficient and robust:

  • Near duplicates: As detailed above, if a document differs by even one character, the processing engine will no longer treat it as a duplicate and will not automatically exclude it.  However, technology built into Relativity will identify near duplicate documents and group them together. These could include different versions or drafts of a document; PDF versions of the same or similar Word documents; or other documents that are similar (e.g. contracts or agreements along the same theme).

    Grouping near duplicate documents for concurrent review improves reviewer efficiency, because the reviewers can work through documents on the same theme together. The reviewer can also quickly see what differs between two similar documents. Had they reviewed the documents, say, 500 documents apart, the reviewer would probably be asking “haven’t I seen this document before?”, “is this a duplicate?” or “how did I treat the other version of this document?”, which slows the review down.
  • Email threading: Another common “reviewer pain” raised by attendees at the workshops was email chains. Consider the following example:


If you reviewed this email chain out of order, or if different reviewers reviewed different emails in the chain, there is a risk that the emails might be coded inconsistently. It is also frustrating for reviewers to read what feels like the same email over and over again, even though different sections of the email chain are being reviewed at different times.

Relativity groups emails in the same chain together to assist with the efficiency of the review. That way, one reviewer can review the whole email chain at once, and they only need to review the “inclusive” emails in the chain: in this example, the end email (as it contains all prior emails in the chain) and the second email (as it contains the attachment, which is normally dropped in subsequent replies).
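The idea of “inclusive” emails can be sketched as follows. The four-email chain is hypothetical, standing in for the example described above, and the logic assumes that every reply quotes the full text of the message it answers. Real threading engines compare the actual message content, so treat this as an illustration of the concept only.

```python
# Hypothetical four-email chain; "reply_to" names the email being replied to.
emails = [
    {"id": "E1", "reply_to": None, "attachments": []},
    {"id": "E2", "reply_to": "E1", "attachments": ["report.docx"]},
    {"id": "E3", "reply_to": "E2", "attachments": []},
    {"id": "E4", "reply_to": "E3", "attachments": []},
]


def inclusive_emails(emails):
    """Return the ids of the emails that together show all content in the chain."""
    replies = {}
    for email in emails:
        replies.setdefault(email["reply_to"], []).append(email)

    inclusive = []
    for email in emails:
        children = replies.get(email["id"], [])
        is_last_in_branch = not children
        # An attachment is lost from the chain if no direct reply carries it forward.
        carried_forward = any(
            set(email["attachments"]) <= set(child["attachments"]) for child in children
        )
        has_unique_attachments = bool(email["attachments"]) and not carried_forward
        if is_last_in_branch or has_unique_attachments:
            inclusive.append(email["id"])
    return inclusive


print(inclusive_emails(emails))
```

Under these assumptions the reviewer reads E2 (the only email carrying the attachment) and E4 (the final email, which quotes the rest of the chain), and can skip E1 and E3.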




How can I ensure that the reviewers are reviewing consistently and/or productively?

The final common and very important question covered at our recent workshop series was about how review managers can get comfort that the reviewers are reviewing consistently and/or productively. 

Monitoring review “metrics” on the number of documents being reviewed and coding decision making provides fantastic insight into how the team is progressing.  

In this simple example, “Reviewer 3” has reviewed approximately 400 documents per day, whereas his colleagues have reviewed approximately 100 documents per day.  Interestingly, Reviewer 3 has also coded 100% of the documents as ‘Relevant’.  On closer examination, it turned out this reviewer was a poor performer who was simply coding all documents as relevant without review, to free up time for personal web browsing.

“Reviewer 1” has a reasonably similar throughput to the other reviewers, but has coded a very high proportion of the documents as ‘Unsure’, creating significant rework downstream.  After making enquiries, it was found that Reviewer 1 needed further guidance on the instructions.
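A minimal sketch of how such patterns might be flagged from coding tallies is shown below. The figures are illustrative only, loosely modelled on the pattern described above, and the thresholds (twice the team median, more than half coded ‘Unsure’) and the `flag_reviewers` helper are assumptions, not settings taken from Relativity.

```python
from statistics import median

# Illustrative daily tallies only; not data from an actual review.
daily_coding = {
    "Reviewer 1": {"Relevant": 20, "Not Relevant": 25, "Unsure": 55},
    "Reviewer 2": {"Relevant": 30, "Not Relevant": 65, "Unsure": 5},
    "Reviewer 3": {"Relevant": 400, "Not Relevant": 0, "Unsure": 0},
    "Reviewer 4": {"Relevant": 25, "Not Relevant": 70, "Unsure": 5},
}


def flag_reviewers(daily_coding, speed_factor=2.0, max_unsure_share=0.5):
    """Return a dict of reviewer -> list of reasons their coding deserves a closer look."""
    totals = {name: sum(counts.values()) for name, counts in daily_coding.items()}
    typical = median(totals.values())
    flags = {}
    for name, counts in daily_coding.items():
        total = totals[name]
        if total == 0:
            continue  # nothing coded yet, so nothing to assess
        reasons = []
        if total > speed_factor * typical:
            reasons.append("throughput well above the team median")
        if counts["Relevant"] == total:
            reasons.append("every document coded Relevant")
        if counts["Unsure"] / total > max_unsure_share:
            reasons.append("high proportion coded Unsure")
        if reasons:
            flags[name] = reasons
    return flags


flags = flag_reviewers(daily_coding)
for name, reasons in flags.items():
    print(name, "->", "; ".join(reasons))
```

A flag is only a prompt for the review manager to make enquiries; it does not by itself establish that a reviewer is performing poorly.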

Use of these reviewer statistics, which can be easily tracked and graphed in Relativity, makes it easier to identify quality and productivity issues early in the review process, reducing cost and risk.  This same information also enables the review manager to answer questions such as:

  • Will we meet the deadline, based on the current review speed?
  • Do I need more reviewers to meet the deadline?
  • Do my reviewers understand the reviewer instructions? (e.g. Reviewer 1 above. Or if there are a large number of coding decisions consistently being over-turned by the Level 2 reviewers, this might also indicate they are not sure of the instructions)
  • Are there any concerns with my reviewers? (e.g. Reviewer 3 above. Or is one reviewer coding too slowly, or inconsistently with other reviewers?)
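The first two questions in the list above come down to simple arithmetic on the review rate. The sketch below uses hypothetical figures (12,000 documents remaining, 100 documents per reviewer per day) and assumes the rate stays constant, which a real review rarely does.

```python
import math


def days_to_finish(remaining_docs, docs_per_reviewer_per_day, reviewers):
    """Working days needed to clear the remaining documents at the current pace."""
    return math.ceil(remaining_docs / (docs_per_reviewer_per_day * reviewers))


def reviewers_needed(remaining_docs, docs_per_reviewer_per_day, days_available):
    """Reviewers needed to clear the remaining documents within the days available."""
    return math.ceil(remaining_docs / (docs_per_reviewer_per_day * days_available))


# Hypothetical figures for illustration.
print(days_to_finish(12_000, 100, 4))     # days required with four reviewers
print(reviewers_needed(12_000, 100, 20))  # reviewers required to finish in 20 working days
```

With these figures, four reviewers would need 30 working days, so meeting a 20-day deadline would take six reviewers.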

Identifying this type of information early on and regularly throughout the review phase can be very beneficial as it enables you to make decisions and change the structure or instructions of the review in order to meet the deadline and conduct an efficient and robust review.

Please do not hesitate to contact Jason Weir, Mel Maddox or Silas Dich if you would like to discuss the contents of this article, have any other Discovery questions, or would like a demonstration of Relativity.

Please click here to find out more about Deloitte Discovery and to download our brochure.

Disclaimer: This article is not a substitute for legal and forensic advice that should be obtained about how the Discovery process should be conducted.

