When dealing with data, whether for the purposes of IT governance, a litigation dispute or an internal investigation, always consider data as evidence: evidence that can ultimately end up in court and, as such, needs to be preserved and protected with a proper, forensically sound workflow.

The definition of data is pretty broad nowadays.  From traditional hardcopy documents through to emails, files, voicemail and chat messages, it can all be used and should be treated as evidence.  Given the speed at which data is now created, its sheer volume and its variety, it is important to understand how preservation, collection and presentation can be done in a forensically sound manner without destroying the integrity of the data.

Different countries also treat data privacy issues with varying degrees of seriousness.  From Singapore’s Personal Data Protection Act 2012 to the recent enforcement of the General Data Protection Regulation (GDPR) in Europe, companies and their legal departments have to focus not just on how they handle customer data, but also on how they handle data pertaining to their own employees.  It is also important that any forensic data collection, preservation or examination is compliant with the applicable laws.

One thing to note is that data collection is different to data preservation.  Data preservation ensures potentially relevant data is not deleted.  This means any documents or emails that are subject to preservation are kept until such time as they are no longer evidence.  It also means any systems, procedures or policies that could destroy or delete these documents are disabled or put on hold.

Data Collection

Data collection involves obtaining documents from source data repositories and copying them into the target locations for further analysis.  As stated above, it is important that whatever technique is used to collect this evidence does not compromise the integrity and originality of the data.  Several common categories of data that might need to be collected for the purpose of an investigation or dispute include:

  1. Active – real time data such as emails and other traditional files that are stored on a local hard drive or network drive.
  2. Cloud – data created and stored on cloud servers, which includes social media accounts, Dropbox and OneDrive. These have grown rapidly in the past several years and the volume of cloud data is expected to overtake the volume of active data in the not too distant future.
  3. Mobile – iPads, mobile phones and wearable devices can contain key evidence and collecting data from these devices requires advanced tools and highly specialised expertise.  Typical data that these tools can extract range from call logs and text messages, all the way to GPS locations, third party apps and health related data.
  4. Offline – data that is no longer in active use but is stored or archived, often by third party providers.  Obtaining and collecting this data can present some challenges due to its age and the technology and expertise required to extract it.
  5. Hidden – previously deleted or fragmented data that is not usually visible to regular system users.  These files need to be recovered using specialised tools and can often provide key evidence, especially in investigations and criminal matters.

Considerations for Self-Collection

As a company’s IT teams become more advanced, they tend to perform their own collection in order to save costs.  It can be tempting to start collecting data straight away but, from a forensic point of view, this can create more work: evidence could be missed, and key evidence could be rendered inadmissible in court.

Before collection starts, it is important to consider - why is the identification, preservation and collection of evidence in a litigation or regulatory investigation critical?

There is:

  • an obligation to provide complete and defensible Discovery in a litigation, product or in a regulatory investigation;
  • an obligation to preserve data in a current or reasonably anticipated litigation;
  • an obligation to file an affidavit around completeness of your discovery;
  • the need to find all critical documents to support and defend the case. It is necessary to know what adverse documents may have been created that the opposing party may use; and
  • the need to not give the opposing party reason to challenge the admissibility of your critical documents in court.

If challenged, the collection may need to be defended through testimony in court by the person who carried out the collection.

Risks of Self-Collection

Consider that documents sourced for an internal investigation can become evidence, so the same considerations should be applied to identifying and collecting those documents to avoid exposing them to admissibility challenges by other parties.

Key Risks when carrying out a collection without expert help:

  • Omitting preservation or collection of data, such as deleted files or internet history, that can only be accessed with specialist forensic tools.
  • Inadvertently changing document metadata due to the ease with which electronic documents can be modified.
  • Not maintaining audit records of collection and chain of custody reports. The chain of custody logs must account for the seizure, storage, transfer and condition of the evidence in order to prove the authenticity of the collected documents in court.
  • Inability to efficiently collect alternate data sources like mobile phones.
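
To make the audit-record point concrete, the following is a minimal sketch of a chain of custody log, assuming each handling step is recorded together with a SHA-256 digest of the file.  The field names, the handler names and the JSON-lines log format are hypothetical choices made for illustration; they are not a recognised forensic standard and do not replace a dedicated forensic tool.

```python
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(path, action, handler):
    """Build one chain of custody record (field names are hypothetical)."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "action": action,      # e.g. "seized", "stored", "transferred"
        "handler": handler,    # person responsible for this step
        "file": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
        "sha256": sha256_of(path),
    }


def append_entry(log_path, entry):
    """Append the record as one JSON line; earlier lines are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


# Demonstration on a throwaway file in a temporary directory.
workdir = tempfile.mkdtemp()
evidence = os.path.join(workdir, "contract.txt")
with open(evidence, "wb") as f:
    f.write(b"example evidence content")

log_file = os.path.join(workdir, "custody_log.jsonl")
first = custody_entry(evidence, "seized", "A. Examiner")
append_entry(log_file, first)
second = custody_entry(evidence, "transferred", "B. Reviewer")
append_entry(log_file, second)

# Matching digests indicate the content did not change between the two steps.
unchanged = first["sha256"] == second["sha256"]
```

Recording a digest at every step means any later alteration of the file's content would show up as a mismatch between log entries.  Note that a content hash says nothing about file system metadata such as timestamps, which has to be captured separately.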

Common Scenarios

Copying files to a USB:

  • Each time a file is copied to another location, it will potentially change the metadata. This means the file creation/modification/accessed times could be changed.
  • When trying to find relevant files, searching for files on a computer can update critical metadata or file and operating system artifacts.
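
The metadata effect described above can be illustrated with a short sketch using Python's standard library.  The file names and the backdated timestamp are arbitrary assumptions for the demonstration.  A plain copy gives the new file a fresh modification time, while `shutil.copy2` attempts to carry the original timestamps across.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "report.txt")
with open(original, "w", encoding="utf-8") as f:
    f.write("quarterly figures")

# Backdate the original so the effect of copying is easy to see.
old_time = 1_500_000_000  # an arbitrary moment in July 2017
os.utime(original, (old_time, old_time))

naive_copy = os.path.join(workdir, "naive_copy.txt")
careful_copy = os.path.join(workdir, "careful_copy.txt")
shutil.copy(original, naive_copy)     # copies content and permission bits only
shutil.copy2(original, careful_copy)  # also attempts to preserve timestamps


def modified(path):
    """Modification time in whole seconds since the epoch."""
    return int(os.stat(path).st_mtime)


# The naive copy is stamped with the time of copying, not the original time.
naive_changed = modified(naive_copy) != modified(original)
# The copy2 result keeps the original modification time.
careful_kept = modified(careful_copy) == modified(original)
```

Even `shutil.copy2` is not a forensic method: it cannot preserve every attribute on every platform (file creation time, for example), and simply reading the source may update its access time on some systems.  The sketch is only meant to show how easily an ordinary copy alters timestamps.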

Emails:

  • Similar to copying files to a USB, if emails are forwarded, this can change the metadata and the contents of the evidence.
  • Searching emails may result in critical emails being missed, and non-searchable text documents (e.g. PDF attachments) will not be identified.
  • The company may have email archive solutions that require extra steps to ensure the complete email content is retrieved.
  • If emails are copied from Outlook onto the desktop, this will also alter the metadata.

Mobile Phones:

  • Unlike with a laptop, it is not possible to just copy a file from a phone. Taking a screenshot creates new data on the device, which in turn changes the evidence and can easily be challenged in court.
  • The majority of the phones on the market can easily be remotely wiped. A forensic image of the phone will ensure you are able to retain a snapshot of the device at that particular point in time.
  • Mobile phones are dynamic and constantly changing. The relevant data or the artifacts around that data may not exist at a future time.

What Evidence is Being Sought?

Another key consideration for self-collection is what evidence is being sought. During an investigation, a critical document that relates to the investigation may be identified on a laptop, but what surrounds that data also needs to be considered.

On that laptop, there may be:

  • Deleted files that could be recovered but would not be visible to a user taking a cursory glance through File Explorer.
  • Multiple users that could have created the critical document, which would require analysis to identify the perpetrator.
  • Evidence of distribution of the critical document. It may have been shared through internal or external emails or uploaded to the Cloud.
  • Internet history that relates to the critical document. For example, a Google search for “How to Beat a Non-Compete Clause” or “How to Steal Intellectual Property without your Boss finding out”.
  • Inculpatory evidence that needs to be considered.
  • Additional offences that have occurred that have not been identified yet.

In summary, all data should be treated as potential evidence that could end up in court. When data needs to be collected, it should be done in such a way that the integrity of the data is not compromised. The risks of attempting to collect data without expert help are manifold and may make the data open to an admissibility challenge in court.