
Big Data and Privacy: a short review of GDPR


Mario Bojilov


Big Data is progressively changing the world. The large volume of data collected can bring enormous value to companies, such as cost reductions, time savings, optimized offerings and smart decision making. However, there is no rose without a thorn: Big Data misuse poses serious privacy risks. These include data breaches (information is accessed without authorization), data brokerage (sale of incorrect data) and data discrimination (unfair treatment, usually based on demographics).

Since 1995, several regulations have been established in the European Union to protect consumers’ data and ensure its transparent use. However, with the development of Big Data, the amount of data collected daily from consumers has increased tremendously. To put it in perspective, Internet users generate about 2.5 quintillion bytes of data every day and the Big Data analytics market is set to reach $103 billion by 2023. Such rapid growth required updated and stronger laws. This article therefore digs into the biggest threat brought by Big Data, privacy risks, and explores how the General Data Protection Regulation, the newest regulation on data privacy, helps ensure data protection.

How GDPR came into existence

Data privacy and protection is a topic that has gained substantial attention in the last few years, mainly because of massive irregularities such as collecting, using and sharing personal information without notice or consumer consent. One of the biggest scandals, involving Facebook and Cambridge Analytica, must definitely ring a bell. A few years before the scandal, in 2013, researchers from the University of Cambridge published a paper explaining how an individual’s personality can be predicted from that user’s activity on Facebook. The scientists warned that these predictions could “pose a threat to an individual’s well-being, freedom, or even life”. However, little attention was given to those findings until the accusations (April 2018) that Cambridge Analytica had used the data to provide analytical assistance to the 2016 US presidential campaigns and to the Brexit referendum, and we all know how that ended.

In this context, as of May 25th, 2018, all the member states of the European Union became obligated to comply with the General Data Protection Regulation (GDPR), which harmonizes data privacy laws across the EU and EEA regions. The regulation was an essential step to continue consolidating individuals’ fundamental rights to data protection in the digital age and to facilitate business by clarifying the rules for all the parties involved. To be more specific, GDPR was a continuation of a long series of regulations related to data protection within the EU, but harsher in terms of fines and stricter in terms of rules. It replaced the 1995 Data Protection Directive, adopted in the early stages of the Internet. Here you can visualize the timeline of European regulations before the implementation of GDPR.

However, it is important to note that even though GDPR was enacted in the EU, it is not EU-centric. The impact of GDPR extends all around the world, since it does not cover only EU-based companies. Article 3 indicates that a company processing the personal data of individuals in the EU, for example by offering them goods or services or by monitoring their behaviour, has to comply with GDPR regardless of where the company is located.

GDPR Principles

So, what exactly does the regulation of Data Protection entail?

The General Data Protection Regulation has set new standards on how to collect, store, and use customer data. According to GDPR, personal data is any information related to an individual, such as name, photo, email address, bank details, updates on social media websites, location details, medical information, or a computer IP address.

The GDPR encompasses the following 7 principles:

  1. The right to access: individuals can request that any company provide a copy (free of charge) of the personal data that the company holds about them.
  2. The right to be forgotten: at any time, consumers can request that the company delete their personal data.
  3. The right to data portability: data can be transferred from one service provider to another in a commonly used and machine-readable format.
  4. The right to update information: individuals can have their information corrected if it is incomplete or incorrect.
  5. The right to restrict processing: consumers can request that their data not be processed.
  6. The right to object: with no exceptions to the rule, individuals have the right to stop the processing of their data for direct marketing. 
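  7. Information security: if a data breach happens, the company must notify the supervisory authority within 72 hours of becoming aware of it, and affected individuals have a right to be informed without undue delay when the breach poses a high risk to them.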
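
As a rough illustration of how the rights to access, portability and erasure might look in practice, the sketch below exports one person's record as JSON (a commonly used, machine-readable format) and deletes it on request. The in-memory store, the record layout and the helper names `export_personal_data` and `erase_personal_data` are hypothetical and are not prescribed by GDPR.

```python
import json

# Hypothetical in-memory store; a real system would query its own databases.
CUSTOMER_RECORDS = {
    "c-1001": {
        "name": "Ana Example",
        "email": "ana@example.com",
        "location": "Lisbon",
    },
}


def export_personal_data(customer_id: str) -> str:
    """Return everything held on one customer as a JSON document."""
    record = CUSTOMER_RECORDS.get(customer_id)
    if record is None:
        raise KeyError(f"no data held for customer {customer_id}")
    payload = {"customer_id": customer_id, "data": record}
    return json.dumps(payload, indent=2, sort_keys=True)


def erase_personal_data(customer_id: str) -> bool:
    """Delete a customer's record; return True if something was removed."""
    return CUSTOMER_RECORDS.pop(customer_id, None) is not None
```

A real erasure request would also have to reach backups, logs and third-party processors, which this sketch leaves out.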

GDPR Challenges 

Despite the clear values that GDPR stands for and all the benefits it brings to consumers’ individual rights, several issues have raised a lot of debate around it.

First of all, according to the Global Forensic Data Analytics Survey, only 33% of companies have a concrete plan for GDPR compliance, whereas 39% are totally unfamiliar with the regulation. In other words, even though the regulation is obligatory, it is not entirely followed. Additionally, a lot of companies, particularly in the US, might not even be aware of the regulation and the changes they have to implement. But given that the law protects individuals throughout the EU, in today’s global world it is almost impossible to avoid dealing with some form of personal data from the European market.

Secondly, there is the abundance of new requirements that firms have to adapt to. Companies are obligated to keep an internal record of data protection activities, notify individuals about data breaches, and take immediate action.

Lastly, some organizations have to appoint an official Data Protection Officer. This can be problematic because there are many vague notions in the GDPR requirements. Terms like “undue delay” and “disproportionate effect”, or the lack of clarity on what counts as a “reasonable” level of personal data protection, might cause confusion and are open to many interpretations.

At this point in time, according to the United Nations Conference on Trade and Development, 128 out of 194 countries have introduced legislation for securing data protection. In numbers, this means that:

  • 66% of countries have legislation
  • 10% of countries have a draft legislation 
  • 19% of countries have no legislation 
  • 5% of countries have no data 

And make no mistake, non-compliance with GDPR can cost your company a fortune: up to 4% of its annual turnover, or more for smaller companies. To be precise, Article 83.5 states “administrative fines up to 20.000.000 EUR, or…up to 4% of the total worldwide annual turnover, whichever is higher”.
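
The “whichever is higher” rule quoted above is simple arithmetic, sketched here under the assumption that turnover is given in euros. The function name `max_fine_eur` is made up for illustration, and the result is only the upper bound of the fine, not the amount a regulator would actually impose.

```python
FIXED_CAP_EUR = 20_000_000   # fixed ceiling quoted from Article 83.5
TURNOVER_PERCENT = 4         # percentage of total worldwide annual turnover


def max_fine_eur(annual_turnover_eur: float) -> float:
    """Upper bound of the fine: the higher of the fixed cap and 4% of turnover."""
    turnover_based = annual_turnover_eur * TURNOVER_PERCENT / 100
    return max(FIXED_CAP_EUR, turnover_based)


# A company with 1 billion EUR turnover faces a ceiling of 40 million EUR,
# while one with 100 million EUR turnover still faces the 20 million EUR cap.
print(max_fine_eur(1_000_000_000))
print(max_fine_eur(100_000_000))
```

The two ceilings meet at a turnover of 500 million EUR; below that, the fixed 20 million EUR figure is the binding one.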

Even though governments have made great progress in regulating collected data, the numbers above show a precarious situation for the highly digitalized world we live in today. That is why companies should take GDPR compliance seriously by following these steps:

  1. Access all the company’s data sources and be aware of all types of data the company is handling.
  2. Identify data with personal information. 
  3. Limit access to personal data only to job roles that require it. 
  4. Protect data by encryption, pseudonymization, and anonymization.
  5. Audit your files. Companies need to know what personal data they have, where and how it is stored, who manages it, and for what purpose.
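
Steps 2 and 4 above can be sketched together: mark which fields hold personal information, then replace their values with keyed hashes. This is a minimal example of one pseudonymization technique (HMAC-SHA256), not a complete compliance solution. The field names in `PERSONAL_FIELDS` and the `pseudonymize` helper are assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical set of fields identified as personal data (step 2).
PERSONAL_FIELDS = {"name", "email", "ip_address"}


def pseudonymize(record: dict, secret_key: bytes) -> dict:
    """Replace personal fields with keyed SHA-256 digests (step 4).

    The same value and key always give the same token, so records can
    still be joined, but the original value is not stored in the output.
    """
    result = {}
    for field, value in record.items():
        if field in PERSONAL_FIELDS:
            message = str(value).encode("utf-8")
            result[field] = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
        else:
            result[field] = value
    return result
```

Because anyone holding the key can recompute the tokens from known values, pseudonymized data still counts as personal data; the key should be stored separately and access to it restricted, in line with step 3.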

Concluding Thoughts

Our digitalized world has increased both the flow and the quantity of information. It is important to remember that Big Data is an enormous advantage for any company and is only a risk if it is managed poorly. That is why it is essential to ensure that your company is following the principles of GDPR.

If you are interested in understanding how Big Data works and how to ensure that your data is protected, you should take a look at our Big Data Foundation™ certification. This course is essential for anyone interested in gaining the Big Data knowledge required to create value in real-world applications. This is a fundamental course with practical exercises designed to provide you with hands-on experience in two of the most popular technologies in Big Data processing – Hadoop and MongoDB.

