Article
What is predictive coding and how can it help the discovery process?
Forensic Focus
Predictive coding has been used extensively in the USA and it is starting to be used in the UK. In the recent English judgment of Pyrrho Investments Limited v MWB Property Limited, the Court directed that the parties could use predictive coding in their Discovery process. This was the first decision of its kind in England and provides insights into the emerging technology available to make the Discovery process more efficient.
We’ve been watching international developments in Discovery, and the emerging technology being used, with interest. As the volume of data increases and Discovery technology continues to evolve, it is becoming clear that traditional approaches (e.g. relying on keyword searching) are reaching their limits and that new approaches are required for the larger discoveries in New Zealand.
In this article we discuss what predictive coding is and how we see it being used in the New Zealand Discovery context in the future.
What is predictive coding?
Predictive coding is also referred to as ‘technology assisted review’, ‘computer assisted review’ or ‘assisted review’. As the name suggests, it utilises technology to assist with the review of the documents as part of the Discovery process.
For predictive coding to work, the computer first needs to ‘learn’ the coding decisions. In very simple terms, this might involve a single senior lawyer (who is fully aware of the issues in the case) coding a representative sample of documents (say 1,000 documents) for relevance. The computer ‘learns’ from this coding and then applies the coding decisions to the balance of the documents. Lawyers then check the reliability of the coding decisions made by the computer, and the computer’s coding is refined. This Quality Assurance exercise continues until the agreed margin of error has been reached and the senior lawyer is satisfied with the relevance coding completed by the computer. At that point, the computer is left to code all of the remaining documents as relevant or not. The lawyers then focus their resources on the documents the computer has selected as relevant. The documents marked not relevant by the computer (which tend to be the majority in most discovery projects) are then discarded.
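To make the workflow described above more concrete, the following is a minimal sketch in Python. It is not how any particular review platform works: we assume a very simple word-frequency classifier (naive Bayes) stands in for the ‘computer’, and every name in it (RelevanceModel, agreement_rate, the sample documents and the 95% target) is hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a document into lowercase words."""
    return text.lower().split()


class RelevanceModel:
    """Toy naive Bayes classifier standing in for the review platform (an assumption)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, coded_docs):
        """Learn from (text, relevant) pairs coded by a lawyer."""
        for text, relevant in coded_docs:
            self.doc_counts[relevant] += 1
            self.word_counts[relevant].update(tokenize(text))

    def predict(self, text):
        """Return True if the document looks more like the relevant examples."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in (True, False):
            # Smoothed log-probability of the label, then of each word given the label.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            scores[label] = score
        return scores[True] > scores[False]


def agreement_rate(model, checked_docs):
    """Share of lawyer-checked documents where the model agrees with the lawyer."""
    hits = sum(model.predict(text) == relevant for text, relevant in checked_docs)
    return hits / len(checked_docs)


# Step 1: a senior lawyer codes a sample of documents (hypothetical examples).
seed_set = [
    ("lease termination notice for the property", True),
    ("dispute over rent arrears under the lease", True),
    ("breach of lease covenant by tenant", True),
    ("staff lunch menu for friday", False),
    ("office party invitation and parking", False),
    ("newsletter about the football results", False),
]

# Step 2: the computer learns from that coding.
model = RelevanceModel()
model.train(seed_set)

# Step 3: lawyers check a further sample of the computer's decisions.
qa_sample = [
    ("notice of breach under the lease", True),
    ("friday football and lunch", False),
]
TARGET_AGREEMENT = 0.95  # hypothetical stand-in for the agreed margin of error
rate = agreement_rate(model, qa_sample)
if rate < TARGET_AGREEMENT:
    # Feed the lawyers' corrections back in and repeat the check.
    model.train(qa_sample)

# Step 4: the computer codes the remaining documents.
unreviewed = [
    "tenant rent dispute and lease notice",
    "invitation to the office lunch",
]
predicted = {doc: model.predict(doc) for doc in unreviewed}
print(rate, predicted)
```

In a real matter the samples would run to hundreds or thousands of documents rather than a handful, and the check-and-refine step would typically be repeated over several rounds. The sketch is only intended to show the shape of the loop: code a sample, train, check, refine, then apply.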
When would you consider using predictive coding?
Predictive coding is worth considering when faced with significant volumes of documents to review. As noted in the Pyrrho Investments decision, the volume of documents that needed to be reviewed was massive: some 3.1 million documents after de-duplication. Because of that volume, predictive coding was considered to be more efficient and economical, saving hundreds of thousands, if not millions, of pounds in the review.
As the volume of data (particularly electronic data) continues to increase exponentially, technology such as predictive coding will become increasingly essential to carrying out efficient and cost-effective Discovery exercises.
How accurate is predictive coding?
Research suggests that predictive coding is at least as accurate as, if not more accurate than, manual review of documents. As noted in the Pyrrho Investments judgment, the Irish High Court also endorsed the use of predictive coding in Irish Bank Resolution Corporation Ltd v Quinn [2015] IEHC 175. The Irish Judge, Fullam J, said:
"66. The evidence establishes, that in discovery of large data sets, technology assisted review using predictive coding is at least as accurate as, and, probably more accurate than, the manual or linear method in identifying relevant documents. Furthermore, the plaintiff's expert, Mr. Crowley exhibits a number of studies which have examined the effectiveness of a purely manual review of documents compared to using TAR and predictive coding. One such study, by Grossman and Cormack, highlighted that manual review results in less relevant documents being identified. The level of recall in this study was found to range between 20% and 83%. A further study, as part of the 2009 Text Retrieval Conference, found the average recall and precision to be 59.3% and 31.7% respectively using manual review, compared to 76.7% and 84.7% when using TAR. What is clear, and accepted by Mr. Crowley, is that no method of identification is guaranteed to return all relevant documents.
67. If one were to assume that TAR will only be equally as effective, but no more effective, than a manual review, the fact remains that using TAR will still allow for a more expeditious and economical discovery process…” [Emphasis added]
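For readers unfamiliar with the terms used in the passage above: recall is the share of all relevant documents that a review actually finds, and precision is the share of the documents flagged as relevant that really are relevant. The short Python sketch below shows the arithmetic. The function name and the document numbers are ours and purely hypothetical; they are not drawn from the judgment or from the studies it cites.

```python
def recall_and_precision(flagged, truly_relevant):
    """Return (recall, precision) for a review, given sets of document IDs.

    flagged:        documents the review marked as relevant
    truly_relevant: documents that are in fact relevant
    """
    flagged = set(flagged)
    truly_relevant = set(truly_relevant)
    found = flagged & truly_relevant  # relevant documents the review caught

    # Guard against dividing by zero when a set is empty.
    recall = len(found) / len(truly_relevant) if truly_relevant else 0.0
    precision = len(found) / len(flagged) if flagged else 0.0
    return recall, precision


# Hypothetical example: five documents are truly relevant.
truly_relevant = {1, 2, 3, 4, 5}
# The review flags four documents, one of which (9) is not relevant.
flagged = {1, 2, 3, 9}

recall, precision = recall_and_precision(flagged, truly_relevant)
print(f"recall={recall:.0%}, precision={precision:.0%}")
```

In this example the review finds three of the five relevant documents (recall of 60%), and three of the four documents it flags are relevant (precision of 75%). A review with low recall misses relevant documents; a review with low precision sends lawyers a lot of irrelevant material to read.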
The Pyrrho Investments decision also noted that predictive coding would result in greater consistency of review across the whole document set than having humans code the whole document population.
How are we expecting predictive coding to be used in New Zealand?
Predictive coding has not yet been used extensively in Discovery in New Zealand. We continue to watch the global Discovery developments with interest to identify how emerging technology such as predictive coding can be best leveraged in New Zealand.
We expect that predictive coding will be used in New Zealand in the near future as follows:
- Predictive coding will initially be applied to a handful of cases over the next few years. We expect that it will first be used where the volumes of documents for review are overwhelming (as in the Pyrrho Investments situation); and
- Once it has been ‘proved’ and the community’s confidence in predictive coding increases, we expect predictive coding to become more commonplace, particularly given the seemingly never-ending increase in the number of documents being encountered.
In the interim, we are already seeing a sharp uptake in other analytical techniques, such as near-duplicate detection and email threading, that significantly improve the efficiency of document review.
Please do not hesitate to contact Amy Dove or Catherine Davidson if you would like to discuss the contents of this article or have any other discovery questions.