Knowledge Byte: The Privacy and Ethics of Big Data
For all its potential uses and its capacity to drive change, Big Data also raises privacy and ethics questions that need to be addressed. Reviewing how it relates to compliance requirements is a useful first step in exploring this topic.
● PCI DSS, HIPAA (US), Data Protection Act (UK)
● Review compliance with the above legislation and standards in relation to Big Data
● Big Data Privacy Review commissioned by the White House
Different legislation and standards are relevant to privacy in different jurisdictions. In the UK, the Data Protection Act is the main mechanism for dealing with privacy and its violations. In the US, HIPAA is the legislation that covers medical records and privacy. PCI DSS, by contrast, is not legislation but an international industry standard whose requirements apply wherever credit card transactions are processed. Note that in countries with a federal structure of states or territories, such as Australia, Canada, the US, and India, state-level privacy legislation may apply in addition to federal acts.
A Big Data privacy review was commissioned by the White House in 2014. The review makes a set of recommendations on Big Data and privacy, including:
● Pass National Data Breach Legislation
● Extend Privacy Protections to non-US Persons
● Amend the Electronic Communications Privacy Act to ensure a similar standard of protection of data in physical and online worlds
Big Data also poses specific challenges for IS audit and privacy:
● Complex IT environment – Not well understood and still mainly tech-driven, so there is less IS audit oversight.
● A number of Big Data solutions sit outside IS – They reside in business areas and are used for experiments.
● Higher risk than usual:
- Millions rather than thousands of records/transactions
- New insights are generated
● Anonymization is not always effective, because re-identification is often possible:
- 85% of people in the US can be identified using publicly available information: ZIP code, date of birth, and sex.
- More than 50% can be identified from city, date of birth, and sex.
The following actions can help address these issues:
● Bring the privacy issues to the attention of the CxO/Board – Use COBIT 5 Principle 1 (Meeting Stakeholder Needs) and EDM01.01 to EDM01.03.
● Anonymize the data quickly.
● Ensure “new” data is covered by policies:
- Systems processing these datasets need to be covered in the audit plan.
● The governance/risk/audit function needs to educate business users on the privacy risks associated with Big Data.
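The identification figures in the list above come down to how many people share the same combination of seemingly harmless attributes. As a minimal sketch of how this could be measured on a dataset, the Python snippet below computes the share of records that are unique on a chosen set of quasi-identifiers. The function name `unique_fraction`, the field names, and the toy records are assumptions made for illustration; they do not come from any standard tool or real dataset.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Return the share of records whose quasi-identifier combination is unique."""
    if not records:
        return 0.0
    # Build one key per record from the selected attributes.
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    # A record is at risk if nobody else shares its combination.
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Hypothetical toy data, not real people.
records = [
    {"zip": "02139", "dob": "1980-01-15", "sex": "F", "city": "Cambridge"},
    {"zip": "02139", "dob": "1980-01-15", "sex": "M", "city": "Cambridge"},
    {"zip": "02138", "dob": "1975-06-30", "sex": "F", "city": "Cambridge"},
    {"zip": "02140", "dob": "1975-06-30", "sex": "F", "city": "Cambridge"},
]

zip_share = unique_fraction(records, ["zip", "dob", "sex"])
city_share = unique_fraction(records, ["city", "dob", "sex"])
print(f"Unique on ZIP/DOB/sex:  {zip_share:.0%}")
print(f"Unique on city/DOB/sex: {city_share:.0%}")
```

On this toy data every record is unique on ZIP, date of birth, and sex, while only half are unique on city, date of birth, and sex. A real assessment would need a far larger and more representative dataset.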
Data anonymization is the process of encrypting or removing personally identifiable information from data sets so that the people whom the data describe remain anonymous. Data anonymization enables the transfer of information across a boundary, such as between two agencies, while reducing the risk of unintended disclosure.
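As a rough illustration of the definition above, the following Python sketch removes direct identifiers, replaces a record key with a keyed hash, and coarsens date of birth and ZIP code. The field names, the `anonymize` helper, and the generalization rules are hypothetical. Keyed hashing (HMAC-SHA256) is used here as a simple stand-in for the "encrypting" option; strictly speaking it is pseudonymization, and it should not be assumed to be sufficient on its own.

```python
import hashlib
import hmac

# Hypothetical names of fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email"}


def anonymize(record, secret_key):
    """Return a copy of the record with identifying details removed or coarsened."""
    # Drop direct identifiers entirely.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Replace the customer ID with a keyed hash so records can still be linked
    # by whoever holds the key, without exposing the original ID.
    out["customer_id"] = hmac.new(
        secret_key, str(record["customer_id"]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Generalize quasi-identifiers: keep only the birth year and a ZIP prefix.
    out["dob"] = record["dob"][:4]
    out["zip"] = record["zip"][:3] + "**"
    return out


# Hypothetical sample record and key, for illustration only.
sample = {
    "customer_id": 1001,
    "name": "Jane Example",
    "email": "jane@example.com",
    "dob": "1980-01-15",
    "zip": "02139",
    "purchase_total": 42.5,
}
result = anonymize(sample, secret_key=b"demo-key-not-for-production")
print(result)
```

Generalizing date of birth and ZIP code matters because, as the figures above suggest, removing names alone can leave records identifiable through the remaining attributes.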
Race and gender are sensitive attributes and should be used carefully in Big Data projects. In some cases, using such attributes may even be illegal. For example, age profiling when evaluating a potential bank customer can be illegal, whereas in medical situations it may be necessary. Implied race can also be a problem. For example, certain postcodes are associated with particular nationalities, so using postcodes can raise much the same ethical issues as using nationality directly.