A. INTRODUCTION

1. 'Big Data is everywhere'. 'If you haven't heard', trumpeted the Financial Times' Lex column of 27 June 2014, 'Big Data is everywhere'. [1] Over the past twenty years, the bow wave in IT has moved on from hardware and software to the data that they process. In an increasingly competitive and data-centric world, harnessing the tides of the Big Data ocean will confer competitive advantage by enabling a company to know more about its customers and marketplace than its competitors do.

Commenting that the business intelligence and analytics ('BIA') software market is worth $16bn a year and growing at 8% a year, the FT Lex column called out research from consultancy Gartner Inc. [2], which showed that the BIA market is currently undergoing an 'accelerated transformation' from retrospective BIA software – used mainly for measurement and reporting – to prospective BIA software used for prediction, forecasting and modelling. This is fuelling a race as the BIA software majors – Oracle, SAP, IBM and SAS, whose combined BIA software turnover totals around $10bn – vie with smaller, faster-growing BIA specialists like QlikTech, Splunk and Tableau to bridge the gap between the oceans of available Big Data and BIA software's ability to harness it for competitive advantage in a structured, legally compliant way.

The European Commission (Commission), in its Communication of 2 July 2014 [3], quoting a UK report, also comments on this accelerating growth:

"Big data technology and services are expected to grow worldwide to USD 16.9 billion in 2015 at a compound annual growth rate of 40% – about seven times that of the information and communications technology (ICT) market overall. A recent study predicts that in the UK alone, the number of specialist big data staff working in larger firms will increase by more than 240% over the next five years."

It is this race for competitive advantage – knowing more than your competitor not so much about what your customers have just done as about what they are likely to do next – that is at the commercial epicentre of Big Data. But it is a race that is just beginning: Gartner also points out [4] that only 15% of Fortune 500 companies will be able to exploit Big Data for competitive advantage by the end of 2015 and that only 8% of companies are currently using Big Data analytics at all.
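The growth figures quoted in this section can be put in perspective with some simple compound-growth arithmetic. The following is a minimal sketch in Python, not a forecast: the function names and the five-year horizon are assumptions chosen purely for illustration, and the 'implied ICT growth' figure is only a rough reading of the Commission's 'about seven times' comparison.

```python
import math


def project(value, annual_rate, years):
    """Value after compounding at a constant annual rate for a number of years."""
    return value * (1 + annual_rate) ** years


def doubling_time(annual_rate):
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


# Figures quoted in the text above
bia_market_bn = 16.0   # BIA software market, $bn a year (FT Lex column)
bia_growth = 0.08      # growing at 8% a year
big_data_cagr = 0.40   # 40% compound annual growth rate (Commission)

# Hypothetical horizon, chosen only for illustration (not from the source)
horizon_years = 5

bia_projected_bn = project(bia_market_bn, bia_growth, horizon_years)
print(f"BIA market after {horizon_years} years at 8% a year: ${bia_projected_bn:.1f}bn")
print(f"Doubling time at 8% a year:  {doubling_time(bia_growth):.1f} years")
print(f"Doubling time at 40% a year: {doubling_time(big_data_cagr):.1f} years")

# 'About seven times' the ICT market's growth implies, very roughly:
print(f"Implied ICT market growth: {big_data_cagr / 7:.1%} a year")
```

On these assumptions, a market growing at 40% a year doubles in a little over two years, whereas one growing at 8% a year takes about nine, which gives a sense of the pace of the race described above.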

2. The US NIC's December 2012 report. Big Data's direction of travel is well signposted in the December 2012 long-range report of the US National Intelligence Council, 'Global Trends 2030: Alternative Worlds' [5], which articulates a focus on data solutions and Big Data as a key IT driver over the next two decades:

"Information technology is entering the Big Data era. Process power and data storage are becoming almost free; networks and the cloud will provide global access and pervasive services; social media and cybersecurity will be large new markets." [6]

Opportunities arising through Big Data are not, however, without their challenges and issues:

"Since modern data solutions have emerged, big datasets have grown exponentially in size. At the same time, the various building blocks of knowledge discovery, as well as the software tools and best practices available to organizations that handle big datasets, have not kept pace with such growth. As a result, a large - and very rapidly growing - gap exists between the amount of data that organizations can accumulate and organizations' abilities to leverage those data in a way that is useful. Ideally, artificial intelligence, data visualization technologies and organizational best practices will evolve to the point where data solutions ensure that people who need the information get access to the right information at the right time - and don't become overloaded with confusing or irrelevant information." [7]

It is these challenges and issues that the fast-growing BIA software market is seeking to address.

3. What is 'Big Data'? As used in this White Paper, 'Big Data' is shorthand for the aggregation, analysis and increasing value of vast exploitable datasets of unstructured and structured digital information. Along with Cloud [8], mobile [9] and social computing, it is one of the four main drivers of change in information technology as it moves into new areas whose features currently include machine learning, 3D printing, virtual reality, the Internet of Things and nanotechnology.

Two recent papers, one from each side of the Atlantic, have addressed Big Data. Commenting that there was no one generally accepted definition, the White House's Executive Office of the President (EOP) in a report dated 1 May 2014 [10] nevertheless gave a useful description:

"Most definitions reflect the growing technological ability to capture, aggregate, and process an ever-greater volume, velocity, and variety of data. In other words, 'data is now available faster, has greater coverage and scope, and includes new types of observations and measurements that previously were not available.' [11] More precisely, big datasets are 'large, diverse, complex, longitudinal, and/or distributed datasets generated from instruments, sensors, Internet transactions, email, video, click streams, and/or all other digital sources available today and in the future.' [12]"

The Commission in its Communication of 2 July 2014 referred to above gives a similar description, which also covers the analytics aspects:

"The term "Big Data" refers to large amounts of different types of data produced with high velocity from a high number of various types of sources. Handling today's highly variable and real-time datasets requires new tools and methods, such as powerful processors, software and algorithms, [g]oing beyond traditional "data mining" tools designed to handle mainly low-variety, small scale and static datasets, often manually" [13].

Big Data is therefore characterised by:

  • aggregation:

    • size – vast volumes of digital data;
    • shape – in many variable formats (text, image, video, sound, etc.);
    • structure – in unstructured (typically, 80%) as well as structured (typically, 20%) varieties;
    • speed – arriving at a faster velocity;
  • analysis:

    • these aggregated datasets analysed on a real-time rather than batch basis;
    • by quantitative analysis software (using artificial intelligence, machine learning, neural networks, robotics and algorithmic computation);
    • enabling a shift from retrospective to predictive insight;
  • increasing value:

    • facilitating small but constant, fast and incremental business change;
    • enhancing competitiveness, efficiency and innovation, and the value of the data so used.
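The distinction drawn above between batch and real-time analysis can be illustrated with a short sketch. The Python below is an assumption-laden toy, not a description of any BIA product: the class name, function name and transaction amounts are all hypothetical. It contrasts computing an average once the whole dataset has been collected with updating the average as each record arrives.

```python
from dataclasses import dataclass


def batch_mean(values):
    """Batch style: wait for the complete dataset, then compute once."""
    return sum(values) / len(values)


@dataclass
class RunningMean:
    """Streaming style: keep an up-to-date answer as each record arrives."""
    count: int = 0
    mean: float = 0.0

    def update(self, value):
        self.count += 1
        # Incremental update: earlier records need not be kept in memory
        self.mean += (value - self.mean) / self.count
        return self.mean


# Hypothetical transaction amounts, for illustration only
transactions = [120.0, 80.0, 100.0, 140.0]

stream = RunningMean()
for amount in transactions:
    current = stream.update(amount)
    print(f"after {stream.count} records: running mean = {current:.2f}")

print(f"batch mean over all records: {batch_mean(transactions):.2f}")
```

Both approaches arrive at the same figure once all records have been seen; the difference is that the streaming version has a usable answer after every record, which is what makes continuous, incremental business change possible.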

4. The policy perspective – the Commission's July 2014 Communication. The Commission's Communication of 2 July 2014, 'Towards a thriving data-driven economy', referred to above, sets out a number of activities that it considers necessary "to be able to seize [Big Data] opportunities and compete globally in the data economy", including:

  • supporting 'lighthouse' data initiatives (like personalised medicine in healthcare, integrated regional transportation management and food chain management tracking food from farm to fork);
  • focusing public research and investment 'on technological, legal and other bottlenecks';
  • making 'sure that the relevant legal framework and policies, such as on interoperability, data protection, security and IPR are data-friendly, leading to more regulatory certainty for business and creating consumer trust in data technologies';
  • rapidly concluding 'the legislative processes on the reform of the EU data protection framework, network and information security' and 'supporting exchange and cooperation between the relevant enforcement authorities (e.g. for data protection, consumer protection and network security)';
  • accelerating 'the digitisation of public administration'; and
  • using 'public procurement to bring the results of data technologies to the market'.

5. Scope and aims of this White Paper. The main purpose of this paper is to provide a practical overview of the legal aspects of Big Data management and governance projects. To illustrate how Big Data and BIA software are beginning to have real impact, and to provide context for the discussion that follows, Section B gives a brief overview of Big Data initiatives and potential in a number of different vertical sectors (financial services, insurance, healthcare, air travel, music and the public sector). The paper then provides three 'views' of Big Data from the legal perspective:

  • Section C offers a common legal analytical framework for Big Data, centred on intellectual property rights in relation to data, contracting for data and data regulation;
  • Section D considers Big Data within the organisation from the standpoint of input, processing and output operations; and
  • Section E overviews the key aspects of Big Data management projects from the perspective of governance, addressing risk assessment, strategy, policy and processes/procedures.

The Legal and the IT Groups are likely to be the two business functions most closely associated with an organisation's Big Data management project. This paper primarily addresses the issues that will be relevant for the Legal Group rather than the IT Group, but data modelling is addressed in outline at Sections B and D in view of its central importance. Detailed discussion of the technical aspects of data law and of Big Data governance is outside the scope of this paper, but references are provided [14] to further materials where these aspects are discussed at greater length.


Footnotes

1. http://www.ft.com/cms/s/3/525236ca-fd4f-11e3-bc93-00144feab7de.html?siteedition=uk#axzz35vtpzx2A

2. http://www.gartner.com/technology/reprints.do?id=1-1QHKSEP&ct=140206&st=sb

3. Towards a thriving data-driven economy (COM(2014) 442 Final) at https://ec.europa.eu/digital-agenda/en/news/communication-data-driven-economy

4. http://www.gartner.com/technology/topics/big-data.jsp

5. http://globaltrends2030.files.wordpress.com/2012/11/global-trends-2030-november2012.pdf.

6. At page ix.

7. At page 85.

8. See Kemp et al, 'Cloud computing: the rise of service-based computing' in Practical Law - http://uk.practicallaw.com/2-385-1280.

9. See Kemp, 'Mobile payments: current and emerging regulatory and contracting issues' (29 CLSR [2], pp. 175-179), or Practical Law at http://uk.practicallaw.com/3-523-4318?q=mobile+payments.

10. 'Big Data: Seizing Opportunities, Preserving Values', http://www.whitehouse.gov/issues/technology/big-data-review. The report focuses on 'how big data will transform the way we live and work and alter the relationships between government, citizens, businesses, and consumers'.

11. Liran Einav and Jonathan Levin, "The Data Revolution and Economic Analysis," Working Paper, No. 19035, National Bureau of Economic Research, 2013, http://www.nber.org/papers/w19035; Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think (Houghton Mifflin Harcourt, 2013).

12. National Science Foundation, Solicitation 12-499: Core Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA), 2012, http://www.nsf.gov/pubs/2012/nsf12499/nsf12499.pdf.

13. At page 4.

14. For a more detailed review of the technical aspects of data law see Kemp et al, 'Legal Rights in Data' (27 CLSR [2], pp. 139-151), or Practical Law at http://uk.practicallaw.com/5-504-1074?q=Big+Data+Kemp.

For a more detailed review of governance in a related area – Open Source Software – and points for consideration in strategy and policy statements and processes/procedures, see Kemp, 'Open source software (OSS) governance in the organisation' (26 CLSR [3] pp. 309–316), or Practical Law at http://uk.practicallaw.com/3-501-0318?q=open+source+governance.
