There has been a dawn raid with 100,000 documents to review. What is the process from here?

The dawn raid has led to the forensic collection of 100,000 documents, now safely secured on a hard drive. What is the process from here? It's important to plan your strategy in advance to minimize downtime, extract relevant documents and get ready for production.

Data Culling

The first thing to do with your core data set is reduce it to a manageable volume. There are a number of tools that can be employed here, the first being de-duplication. De-duplication works by identifying identical documents, that is, documents bearing the same MD5 hash value. Applied at a global level, it removes exact copies of standalone documents as well as duplicate emails, which arise when multiple mailboxes are extracted and the same email between custodians appears in each of their mailboxes. This can quickly reduce your document set by 10-20% on average.
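
As a rough illustration of hash-based de-duplication, the sketch below keeps the first copy of each document and records later copies as duplicates. It assumes each document is available as raw bytes; the function name `deduplicate` and the `(doc_id, content)` layout are hypothetical, and review platforms may well hash emails over selected fields rather than raw bytes.

```python
import hashlib


def deduplicate(documents):
    """Split documents into unique ones and exact duplicates by MD5 hash.

    `documents` is an iterable of (doc_id, content_bytes) pairs
    (a hypothetical layout chosen for this sketch).
    Returns (unique_ids, duplicates), where each duplicate is a
    (duplicate_id, id_of_first_copy) pair.
    """
    first_seen = {}   # MD5 hex digest -> id of the first document with it
    unique_ids = []
    duplicates = []
    for doc_id, content in documents:
        digest = hashlib.md5(content).hexdigest()
        if digest in first_seen:
            # Same hash as an earlier document: treat as an exact copy.
            duplicates.append((doc_id, first_seen[digest]))
        else:
            first_seen[digest] = doc_id
            unique_ids.append(doc_id)
    return unique_ids, duplicates


docs = [("A", b"Q3 board minutes"), ("B", b"Supplier contract"), ("C", b"Q3 board minutes")]
unique_ids, duplicates = deduplicate(docs)
```

Here document C has the same content as document A, so only A and B would go forward to review.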

Email threading is a similar tool that can be applied to emails. Email threading takes those long chains of back-and-forth emails and removes the "early" emails whose content is repeated in later replies, leaving just the final email of each chain in the data set for review (allowing a potential reduction of 20-30% of an email set). It also recognises where emails split off from the main conversation and orders the documents so that the two chains are reviewed sequentially. This enables reviewers to ascertain the stories behind the emails more quickly and review more efficiently.
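
The idea can be sketched in a few lines, under the simplifying assumption that every reply quotes its parent in full, so any email that has been replied to is already contained in a later one. The `Email` class and `final_emails` function are hypothetical names for this sketch; real threading tools compare message content rather than relying on reply headers alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Email:
    message_id: str
    in_reply_to: Optional[str]  # message_id of the parent, or None for the first email


def final_emails(emails):
    """Return the ids of emails that nobody replied to.

    Assumes each reply quotes its parent in full, so these "end of chain"
    emails carry the whole conversation. A branch in the conversation
    produces one final email per branch.
    """
    replied_to = {e.in_reply_to for e in emails if e.in_reply_to is not None}
    return [e.message_id for e in emails if e.message_id not in replied_to]


thread = [
    Email("m1", None),   # original email
    Email("m2", "m1"),   # reply to m1
    Email("m3", "m2"),   # reply to m2 (main chain ends here)
    Email("m4", "m2"),   # separate reply to m2 (conversation splits off)
]
to_review = final_emails(thread)
```

In this example only m3 and m4 remain for review: m1 and m2 are quoted within them, and the split after m2 yields two chains.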

There are also tools that support Early Case Assessment, such as Foreign Language Identification, Near Duplicate Analysis and Clustering.

Know What You're Looking For

The next stage is to take advantage of the forensically sound collection of your data and its preserved metadata by running some matter-specific identification of relevant material. This can involve restricting the data set to the specified date ranges pertaining to your matter, thereby further reducing the data set. In addition, keyword searches should be run across the set for numerous purposes, identifying:

  • Potentially relevant documents to be included in review;
  • Documents to be excluded such as irrelevant deals, projects etc;
  • Documents to be segregated for different review teams/levels, e.g. materials from a CFO which should be reviewed by a Partner or General Counsel rather than at First Pass; and
  • Key custodians to be prioritised or removed.
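
The date restriction and keyword searches above can be sketched as a simple triage routine. This is illustrative only: the document layout (`id`, `date`, `custodian`, `text`), the bucket names, the search terms and the choice to apply exclusion terms before inclusion terms are all assumptions made for this sketch, and it uses plain case-insensitive substring matching rather than a review platform's search syntax.

```python
from datetime import date


def keyword_hits(text, keywords):
    """Return the keywords found in text (case-insensitive substring match)."""
    lowered = text.lower()
    return [k for k in keywords if k.lower() in lowered]


def triage(docs, start, end, include_terms, exclude_terms, senior_custodians):
    """Sort document ids into review buckets by date, keywords and custodian."""
    buckets = {"out_of_range": [], "excluded": [], "no_hits": [],
               "senior_review": [], "first_pass": []}
    for doc in docs:
        if not (start <= doc["date"] <= end):
            buckets["out_of_range"].append(doc["id"])    # outside the matter's dates
        elif keyword_hits(doc["text"], exclude_terms):
            buckets["excluded"].append(doc["id"])        # e.g. an irrelevant project
        elif not keyword_hits(doc["text"], include_terms):
            buckets["no_hits"].append(doc["id"])         # no potentially relevant term
        elif doc["custodian"] in senior_custodians:
            buckets["senior_review"].append(doc["id"])   # route to senior reviewers
        else:
            buckets["first_pass"].append(doc["id"])
    return buckets


docs = [
    {"id": "D1", "date": date(2019, 3, 1), "custodian": "CFO", "text": "Pricing discussion with competitor"},
    {"id": "D2", "date": date(2018, 5, 1), "custodian": "Analyst", "text": "Pricing review"},
    {"id": "D3", "date": date(2019, 6, 10), "custodian": "Analyst", "text": "Project Falcon pricing update"},
    {"id": "D4", "date": date(2019, 7, 1), "custodian": "Analyst", "text": "Competitor call notes"},
    {"id": "D5", "date": date(2019, 8, 1), "custodian": "Analyst", "text": "Lunch on Friday?"},
]
buckets = triage(docs, date(2019, 1, 1), date(2019, 12, 31),
                 include_terms=["pricing", "competitor"],
                 exclude_terms=["Project Falcon"],
                 senior_custodians={"CFO"})
```

Counting the ids in each bucket gives a quick view of how far the date range and search terms reduce the set before review begins.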

In addition, you should take this opportunity to consider whether your newly culled dataset is large enough to benefit from Technology Assisted Review (TAR). TAR feeds reviewers' decisions into an algorithm, which then either prioritises or further reduces the documents for review. It is akin to letting Netflix suggest your next film or TV show to binge-watch based on the TV shows you have watched to date.
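
To give a feel for how review decisions can drive prioritisation, here is a deliberately simple sketch: words that appear more often in documents already coded relevant push an unreviewed document up the queue. The function names and sample texts are hypothetical, and this word-count scoring is a toy stand-in, not how any particular TAR product works.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(coded):
    """Count words in relevant and not-relevant documents.

    `coded` is a list of (text, is_relevant) review decisions.
    """
    relevant, irrelevant = Counter(), Counter()
    for text, is_relevant in coded:
        (relevant if is_relevant else irrelevant).update(tokenize(text))
    return relevant, irrelevant


def score(text, relevant, irrelevant):
    """Sum of per-word log-odds of relevance, with add-one smoothing."""
    vocab = len(set(relevant) | set(irrelevant))
    rel_total = sum(relevant.values())
    irr_total = sum(irrelevant.values())
    total = 0.0
    for word in tokenize(text):
        p_rel = (relevant[word] + 1) / (rel_total + vocab)
        p_irr = (irrelevant[word] + 1) / (irr_total + vocab)
        total += math.log(p_rel / p_irr)
    return total


def prioritise(unreviewed, coded):
    """Order (doc_id, text) pairs so the likeliest-relevant come first."""
    relevant, irrelevant = train(coded)
    return sorted(unreviewed, key=lambda d: score(d[1], relevant, irrelevant), reverse=True)


coded = [
    ("price fixing agreement with competitor", True),
    ("competitor pricing call", True),
    ("lunch menu for friday", False),
    ("office party friday", False),
]
unreviewed = [("U1", "friday lunch plans"), ("U2", "call competitor about price")]
ranked = prioritise(unreviewed, coded)
```

With these sample decisions, U2 is placed ahead of U1 because its words overlap with the documents coded relevant. As more documents are coded, the ranking can be recomputed so that reviewers keep seeing the likeliest-relevant material first.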

Managing Your Resources

The next query is who will be undertaking your review? Do you have a review team in-house or will you need to hire additional resources? Alternatively, you could engage the services of a third party to leverage their teams of reviewers to undertake First Pass Review, effectively culling irrelevant documents and allowing your own reviewers to focus on key documents and preparation of legal strategy, making the most of their valuable time. You should also consider whether you require foreign language reviewers or translation services. In addition, depending on the matter, you will need to consider whether expert reports should be prepared and which experts to engage.

Keep an Eye on the Clock

A final step with your documents: make sure you know your deadlines! Your matter timeline will impact decisions in relation to resourcing, as well as when to engage experts and Counsel and when to prepare evidence.

It's a lot to think about but spending time properly planning your project will save you valuable time in the long run.