Following on from my introduction to GDPR for Developers and a look at how Data Subject rights may be enabled from an implementation perspective, in this article I’m examining how tech teams can approach working towards GDPR compliance from a systems, network and hardware infrastructure point of view. A lot of this isn’t rocket science. In many cases it’s common sense… but as the old saying goes… common sense just isn’t all that common.
2017 has been a bumper year for massive data breaches. Big names like Uber and Equifax crowded our newsfeeds with frankly sometimes bizarre revelations about the circumstances of the breach and their bumbled responses to it. For the first time, with GDPR, there will be an enforceable regulatory framework in place to oblige organisations to do better, to protect their customers and by extension themselves from harm, both financial and reputational. At the core of this is the concept of Privacy by Design, realised in the GDPR in Article 25 as Data Protection by Design and by Default.
Privacy By Design
Privacy by Design is a term originally coined by the former Information and Privacy Commissioner of Ontario, Dr. Ann Cavoukian, and is encapsulated in seven foundational principles.
These guiding principles lay the groundwork for what, for many businesses, will be a culture shift in how they plan, design, implement, test and operate the applications and services that they develop for their customers. The Privacy by Design principles promote a privacy-first, user-centric and security-focused design.
They advance the view that the future of privacy cannot be assured solely by compliance with regulatory frameworks; rather, privacy assurance must ideally become an organisation’s default mode of operation.
Where to start – pick the low hanging fruit
If you haven’t been practicing these principles for years, then there are two things you need to do. The first is to review your current systems, assess the technical debt that you have now, and make a plan for mitigating the risk to your systems as they currently stand. The second is to ingrain a Privacy by Design and Default process, awareness and culture into your organisation. No small task, but as with any journey, it starts with that first step.
A good place to start implementing a Privacy by Design approach is to measure your current systems against the Open Web Application Security Project (OWASP) Top Ten and assess your exposure.
It is just plain best practice to secure the information in transit between you and your users. SSL certificates have become affordable in recent years (they can even be found for free), and if you are not securing your site using SSL at this stage, it is a big red flag for an increasingly well-informed public that you are not doing even the basics needed to secure their personal information. I’d recommend reading the Happy Path to SSL by security expert Troy Hunt for a good primer on this. Be sure to look at the extra steps you can take beyond simple SSL setup, such as HSTS configuration.
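As an illustration of the HSTS step mentioned above, here is a minimal sketch of a WSGI middleware that adds the Strict-Transport-Security header to every response. The class name, the demo app and the one-year policy value are my own assumptions for the example; in practice you might set this header in your web server or framework configuration instead.

```python
from typing import Callable, Iterable

# Assumed policy for illustration: one year, including subdomains.
HSTS_VALUE = "max-age=31536000; includeSubDomains"


class HSTSMiddleware:
    """Wrap a WSGI app so every response carries Strict-Transport-Security."""

    def __init__(self, app: Callable, value: str = HSTS_VALUE) -> None:
        self.app = app
        self.value = value

    def __call__(self, environ: dict, start_response: Callable) -> Iterable[bytes]:
        def patched_start_response(status, headers, exc_info=None):
            # Drop any existing HSTS header so ours is the only one sent.
            headers = [(k, v) for k, v in headers
                       if k.lower() != "strict-transport-security"]
            headers.append(("Strict-Transport-Security", self.value))
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)


def demo_app(environ, start_response):
    """Hypothetical application used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]
```

Browsers only honour this header when it arrives over HTTPS, so it complements a working certificate setup rather than replacing it.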
Your Environments - Development, Testing, Staging & Production
How are you securing your environments? Do you limit the people who have access to them as they graduate from Development to Live? How do you manage the credentials needed in the different environments? Do you use personal information anywhere other than Production? (you definitely shouldn’t be!).
Are API keys and other credentials stored in your source code repositories? Who has write permissions on those repositories? The Twelve-Factor App is a good methodology to follow for building apps that conform to best practices for separating credentials from the codebase. As noted in its Config chapter “A litmus test for whether an app has all config correctly factored out of the code is whether the codebase could be made open source at any moment, without compromising any credentials.”
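To make the Twelve-Factor idea concrete, here is a small sketch of reading credentials from the environment rather than from the codebase. The setting names (PAYMENT_API_KEY, DATABASE_URL) and the helper functions are hypothetical; the point is only that nothing secret lives in the repository and that a missing value fails loudly at startup.

```python
import os
from typing import Mapping, Optional


class MissingConfigError(RuntimeError):
    """Raised when a required setting is absent from the environment."""


def require_env(name: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable or fail loudly."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        raise MissingConfigError(f"Required setting {name} is not set")
    return value


def load_config(environ: Optional[Mapping[str, str]] = None) -> dict:
    # PAYMENT_API_KEY and DATABASE_URL are hypothetical names for illustration.
    return {
        "payment_api_key": require_env("PAYMENT_API_KEY", environ),
        "database_url": require_env("DATABASE_URL", environ),
    }
```

Each environment (Development, Testing, Staging, Production) then supplies its own values, and the same code can be deployed to all of them.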
Access & Credential Control
Who has access to the personal information that you collect? Is access limited to those that need it? How do you secure access to services and applications that developers need to do their job? Are credentials that are needed to access external services stored securely? Do your developers use password managers and key vaults? How do you store your SSH Keys, API Keys, DSN Keys, Public Keys etc?
Access control should be as granular as possible. You should avoid scenarios where there is an all or nothing access control policy. You should also be logging who accesses personal information.
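One way to picture granular access with logging is a read function that returns only the fields a role is permitted to see and records who asked for what. This is a sketch under assumptions: the roles, field names and logger name are invented for the example, and a real policy would live in your identity or authorisation system rather than in a dictionary.

```python
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("pii.audit")

# Hypothetical role-to-field permissions, for illustration only.
ROLE_FIELDS = {
    "support": {"name", "email"},
    "billing": {"name", "billing_address"},
    "analyst": set(),  # analysts see no direct identifiers
}


def read_customer_fields(record: dict, user: str, role: str, fields: set) -> dict:
    """Return only the fields the role may see, and log the access attempt."""
    allowed = ROLE_FIELDS.get(role, set())
    granted = fields & allowed
    denied = fields - allowed
    # Log who asked for which fields, but never the personal data itself.
    audit_log.info(
        "pii_access user=%s role=%s customer_id=%s granted=%s denied=%s at=%s",
        user, role, record.get("id"), sorted(granted), sorted(denied),
        datetime.now(timezone.utc).isoformat(),
    )
    return {name: record[name] for name in granted if name in record}
```

An unknown role gets nothing by default, which avoids the all-or-nothing scenario described above.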
Some resources for key and password management:
Key Management Services:
Password Management Tools:
Two-Factor Authentication (2FA) has been rolling out over the last few years across many cloud providers. It provides an extra layer of security, requiring you to combine something you have (e.g. a code delivered via your phone) with something you know (i.e. your password). Where it is available, you should be turning it on for any SaaS that you use, and incorporating it into those that you develop. Some are going so far as to provide 3FA by incorporating a biometric component for verification.
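For the "something you have" factor, many authenticator apps use time-based one-time passwords (TOTP, RFC 6238). The sketch below shows the calculation using only the standard library, assuming HMAC-SHA1 and a 30-second step. It is meant to show how the codes are derived, not to replace a maintained 2FA library; the function names are my own.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    now = time.time() if at is None else at
    counter = int(now // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as defined in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, at: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    now = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret, now + drift * step, step), submitted)
        for drift in range(-window, window + 1)
    )
```

The small acceptance window tolerates clock drift between the server and the user's phone. The shared secret is itself a credential and needs the same care as any other key.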
Data Flow Mapping
If a Data Flow Mapping exercise has never been done, or if new features have been added or new data collection and processing initiatives undertaken since the last one, it is a crucial exercise for understanding how data that is collected through your applications flows through your systems and gets duplicated, transformed, exported and stored. It is also useful for understanding and classifying the types of data that you collect. If any of that data is considered sensitive data (i.e. is information pertaining to racial or ethnic origin, political opinions, sexual orientation, religious or philosophical beliefs or trade union membership, genetic data, biometric data, health data), then your company will have heightened obligations under GDPR.
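The output of a mapping exercise can be as simple as a structured inventory. The sketch below assumes one record per data field, with invented system names, and flags the fields that fall into the sensitive categories listed above. The category labels and class names are my own shorthand, not terms from the regulation.

```python
from dataclasses import dataclass, field
from typing import List

# Shorthand labels for the sensitive categories listed in the text above.
SPECIAL_CATEGORIES = {
    "racial_or_ethnic_origin", "political_opinions", "sexual_orientation",
    "religious_or_philosophical_beliefs", "trade_union_membership",
    "genetic", "biometric", "health",
}


@dataclass
class DataFlow:
    """One row of a data flow map: a field, where it enters, and where it goes."""
    field_name: str
    category: str                      # e.g. "contact" or "health"
    collected_by: str                  # hypothetical system name
    stored_in: List[str] = field(default_factory=list)
    exported_to: List[str] = field(default_factory=list)

    @property
    def is_special_category(self) -> bool:
        return self.category in SPECIAL_CATEGORIES


def special_category_flows(flows: List[DataFlow]) -> List[DataFlow]:
    """Return the flows that carry heightened obligations."""
    return [flow for flow in flows if flow.is_special_category]
```

Even a list like this makes duplication visible: a field that appears under several `stored_in` entries is a candidate for consolidation.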
Data Flow Mapping can provide a good starting point for illustrating where you need to begin your focus on implementing Data Protection by Design. Developers are well-placed to be involved in this process as they should be intimately aware of how the data flows through the business. Ideally, the audit itself is carried out by an external auditor (performing a Data Protection Audit) as this can avoid a conflict of interest that can arise when auditing your own systems.
Having an honest broker perform the audit means that it’s more likely that the hard questions will be asked, and you’ll end up with a better result. It is also preferable from an accountability standpoint if in the future your company is inspected by the Data Protection Authority, to demonstrate that you made best efforts to have a rigorous, non-partisan review of your systems.
Data Flow Mapping may be carried out as part of a wider Data Protection Audit. There are many templates available to guide you in carrying out one of these. There are also innovative services such as Eurocomply - http://eurocomply.com/ that help you streamline the processes of conducting a Data Protection Audit.
It is not developers or technical teams directly who should carry out or lead these audit exercises, but they should be closely involved in them, and if they are not happening, they should be asking questions as to why not.
Data Storage – Not just for databases
Consider your data storage in these areas:
- File Storage
- Log Storage
- Backup Storage
You should take a risk-based approach where you classify your data and protect it accordingly. You need to actively consider which data to keep or, even better, whether you need to store it in the first place.
Are your databases configured to protect the personal information that they contain? Is the information encrypted in transit and at rest? Is there data duplication that could be reined in? Do you store more data than you need? Data Minimisation is a core principle of Data Protection.
Databases aren’t the only places that personal information may be stored. If personal information makes its way through your organisation through email for example, you should rethink how this is done and give employees alternative ways to move that type of data around.
Other questions you should be asking:
- Do your employees use spreadsheets or other uncontrolled means to store personal information?
- Do your systems log personal information to application logs? (Pro tip: don't!)
- How are those logs secured and does your data retention policy cover them?
- Do you limit who has access within your organisation to personal information?
- Do you have backups?
- Are those backups encrypted?
- Does your Data Retention policy trigger their deletion?
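On the question of personal information in application logs, one safety net is a logging filter that masks obvious identifiers before a record is written. The sketch below handles email addresses only and uses a deliberately simple pattern of my own; it is a backstop for mistakes, not a substitute for keeping personal data out of log calls in the first place.

```python
import logging
import re

# Simple pattern for illustration; it will not catch every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmailFilter(logging.Filter):
    """Mask anything that looks like an email address before it is written."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # merge args into the final string first
        record.msg = EMAIL_RE.sub("[redacted-email]", message)
        record.args = ()  # already formatted, so drop the raw arguments
        return True
```

A filter like this would typically be attached to each handler, for example with `handler.addFilter(RedactEmailFilter())`, so that it applies regardless of which logger produced the record.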
Data Retention - use it or lose it
How long do you keep personal information and what is your process for retiring & deleting the data?
You should only retain your customers’ data for as long as it is required for you to provide them a service or to meet your obligations under the law.
If consent is the legal basis on which you collected the personal information and you don’t periodically contact your customers (while giving them the option to opt out of future contact), you need to consider that your data retention policy may need to be applied to data that you have not demonstrably needed to use over a reasonable period. You may be able to make a business case for retaining that data for, on average, 12 months, but beyond that it could be difficult. If you have collected and retained information on a legal basis other than consent, these periods can differ. A good rule of thumb: data that you no longer have a business use for, and for which consent has not been obtained (or another legal basis used) within a reasonably recent timeframe, should be deleted.
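A retention policy only helps if something enforces it. Here is a minimal sketch of the selection step, assuming each record carries a `last_used` timestamp and an optional `legal_hold` flag for data kept under another legal basis. The 12-month period is taken from the example above and is an assumption, not a figure from the regulation.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List

RETENTION_PERIOD = timedelta(days=365)  # assumed 12-month policy


def is_due_for_deletion(last_used: datetime, now: datetime,
                        legal_hold: bool = False,
                        retention: timedelta = RETENTION_PERIOD) -> bool:
    """True when the data has gone unused for longer than the retention period."""
    if legal_hold:  # another legal basis requires us to keep the record
        return False
    return now - last_used > retention


def select_for_deletion(records: Iterable[dict], now: datetime) -> List[str]:
    """Return the ids of records the retention policy says should be deleted."""
    return [
        r["id"] for r in records
        if is_due_for_deletion(r["last_used"], now, r.get("legal_hold", False))
    ]
```

A scheduled job could run this selection and then delete the matching rows, along with their copies in logs, exports and backups.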
Anonymisation & Pseudonymisation
Anonymised data is not personal information and as such the data protection principles do not need to be applied to it. If you can achieve the same results with the anonymised data, it is certainly preferable to do so. Remember though: until you delete the original source data, the anonymised data is not considered truly anonymised if individuals could be identified in it by combining both data sets.
Pseudonymisation involves de-coupling identifying data from the dataset, usually by means of identifier key references. This should be a key part of your Privacy by Design strategy, enabling you to lower the risk to that data and the individuals you collected it from. Pseudonymised data is still personal information and needs to be treated as such. With the pseudonymisation process, however, you have lowered the risk to that data.
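As a sketch of the identifier key reference approach, the example below splits a record into an identity part and a working part linked only by a pseudonymous id. I have assumed a keyed hash (HMAC-SHA256) of the email address as the id; the field names and functions are hypothetical, and the key must be stored separately from the working data for the split to be worth anything.

```python
import hashlib
import hmac
from typing import Dict, Sequence, Tuple


def pseudonym(identifier: str, key: bytes) -> str:
    """Derive a stable pseudonymous id from an identifier using a secret key."""
    return hmac.new(key, identifier.lower().encode("utf-8"),
                    hashlib.sha256).hexdigest()


def split_record(record: Dict[str, str], key: bytes,
                 identifying: Sequence[str] = ("name", "email")
                 ) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Separate identifying fields from the rest, linked only by the pseudonym."""
    pid = pseudonym(record["email"], key)
    identity = {"pid": pid, **{k: record[k] for k in identifying if k in record}}
    working = {"pid": pid,
               **{k: v for k, v in record.items() if k not in identifying}}
    return identity, working
```

The working set can then be used for analytics or testing with lower risk, while the identity set and the key sit behind tighter access controls. As the text notes, the result is still personal information.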
Adding New Features – Privacy by Default!
The second part of Privacy by Design is Privacy by Default. And what that means when implementing new features into your application is that you should not automatically opt your users into them. They should be informed and given the option to enable them. This should get teams thinking about how to best inform and encourage users to flip the switch and choose to use these new features for the benefits that they could bring them. Big players like Facebook and Google have been moving towards this model for a while now and you’ve likely seen the notifications pop up telling you about these fancy new features that you can start using now or perhaps “remind me later” after you’ve had the chance to think about it. Crucially, they also allow the user to opt out of using them.
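In code, Privacy by Default often comes down to what a brand new user's settings look like. The sketch below assumes a per-user preferences object in which every optional feature starts disabled and changes only through an explicit opt-in. The feature names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Set

# Hypothetical optional features; none is enabled for a new user.
OPTIONAL_FEATURES = ("location_sharing", "personalised_recommendations")


@dataclass
class FeaturePreferences:
    """Per-user feature switches that start off and change only on request."""
    enabled: Set[str] = field(default_factory=set)

    def is_enabled(self, feature: str) -> bool:
        return feature in self.enabled

    def opt_in(self, feature: str) -> None:
        if feature not in OPTIONAL_FEATURES:
            raise ValueError(f"Unknown feature: {feature}")
        self.enabled.add(feature)

    def opt_out(self, feature: str) -> None:
        # Opting out is always allowed and never raises.
        self.enabled.discard(feature)
```

Shipping a new feature then means adding it to the list and prompting users, not flipping it on for everyone.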
Data Protection Impact Assessments (DPIAs)
With the introduction of new features and the evolution of your platforms, you should be carrying out Data Protection Impact Assessments if the features that you are adding or the initiatives you are undertaking are “likely to result in a high risk to the rights and freedoms of natural persons”. A DPIA should assess each new processing operation, or set of similar processing operations, that meets that threshold. The GDPR does not mandate that DPIAs be carried out if there may be a risk, but rather if there is likely to be one. It is also worth noting that if you are unsure whether you should carry one out, the guidance is to go ahead and perform one anyway as it is a useful tool to comply with Data Protection Law.
One such tool to help you do this is the AvePoint Privacy Impact Assessment tool.
Privacy by Design and GDPR accountability as a whole can be challenging for small businesses. With smaller budgets they have to be nimble to innovate and compete. This can result in a patch-work of cloud services strung together to realise their applications. Each uncontrolled or not fully understood entity that is a part of your application that processes personal information is a potential risk to you achieving the security that you are striving for.
Do you expose any of the personal information that you hold via an API to be consumed by others? Make sure that you have a legal basis to share that information and that whoever you are sharing it with will treat the information in a GDPR-compliant fashion.
Review any third-parties that you use to process the data that you collect. Do not assume that they are compliant. Where are their servers located? What is their Data Protection Policy? You should have a clear contractual relationship with your processors in how they can use the data that you provide them. Security & Data Retention both need to be considered from this angle as well. Should you need to delete data, you have an obligation to inform any third-party processors that you have passed that data on to, to delete it also.
The challenge and opportunity of the Cloud
While the cloud has enabled small businesses to achieve much by using dedicated cloud-based expert services to perform processing as part of their wider application, as noted with third-party processors, it does become a challenge when you are trying to rein in how the data you manage is used, to be accountable under GDPR. Cloud providers are rising to this challenge and so it is important for you to take a look at how the big providers may be able to give you some uniformity and confidence in data management and security if you were to re-organise how some of your processing is currently done.
Providers like Microsoft Azure have matured at lightspeed over the last few years and it is now possible to achieve much within the array of services that they offer that in years past you may have had to achieve through combining multiple other providers. By bringing your data under the central control of one provider, you are making it easier for yourself to have confidence in how that data is treated. You have one contract to maintain and you have consistency and certainty about a crucial surface of exposure for your product, your customers and their personal information.
Cloud providers have been preparing for GDPR for years and are making available suites of tools to help their customers become accountable under the regulation. Having your data managed by a provider that layers on these expert tools gives small business a strong foothold in taking ownership of the GDPR challenge and demonstrating accountability. Take some time to assess a cloud provider that you may be using currently, and see if there is scope for your business to bring more of its processing and operations under that umbrella.
Some tools offered by Microsoft around GDPR compliance are:
- Microsoft Compliance Manager
- Azure Information Protection Scanner
- Azure Advisor
- Azure Security Center
- MS Threat Detection Tool
- MS GDPR Assessment Tool
Your Network and Infrastructure
Your system's security is only one link in the chain. It is crucial to have a secure and robust network and hardware infrastructure in place, with policies and ACLs enforced to ensure that only the right (and a limited number of) people have access to personal information.
Data Loss Prevention (DLP) software can play a part in securing Personally Identifiable Information (PII). It is essential for your network and hardware admins to perform regular security audits.
Companies can be overwhelmed by what are perceived as the onerous obligations and potentially spiraling costs of reaching what may be considered compliance. The GDPR itself, however, states that you should take into account the state of the art, the costs of implementation as well as the nature and scope of the processing that you are performing in deciding what protective measures to implement. It is in effect a balancing act between the resources that you have available and the risk to your customers should their personal information be compromised. In this sense, the GDPR is quite practical. You should assess the risk and mitigate it to the best of your abilities, documenting your rationale as part of the process. You don’t have to achieve 100% of what you can do in a big-bang rollout, but you should be making steady progress towards those goals.
Bring your own Device (BYOD)
Often companies, either through lack of planning or lack of resources, rely on their employees to use their own devices as part of performing their job. This Bring Your Own Device policy is a big risk to meeting obligations under the GDPR. It is a Wild West approach to obligations, and should a data breach happen as a result of a BYOD policy that did not have the proper safeguards in place to mitigate risk, it is not likely to be looked on favourably by the Data Protection Authorities when they are levying fines in response. Devices used to process personal information in the course of employees performing their jobs should be secured and maintained to the best of your abilities.
Hosting - Location, location, location.
What’s been a mantra in the physical world is becoming ever more repeated in the digital one. Where your servers are located is important under GDPR. Are they located in the EU? If not, is where they are located covered by an adequacy or other contractual agreement that is acceptable under GDPR? The aim is to protect citizens’ data. We’ve seen instances of governments demanding access to records and personal information held by hosting companies with increasing regularity. Where the data is when at rest is an important part of ensuring Privacy by Design.
Not all Data Centres are equal
Some Data Centres (such as Azure with their German Data Centre) are providing their customers with an extra level of security where a separate data trustee company controls access to the physical servers within the Data Centre.
Automated Decisions and Profiling
GDPR is technology agnostic and recognises that technology can outpace regulation that gets lost in implementation specifics. It makes efforts to ensure that automated decision-making and profiling are not used in ways that can have an unjustified impact on individuals’ rights and requires:
- specific transparency and fairness requirements
- greater accountability obligations
- specified legal bases for the processing
- rights for individuals to oppose profiling and specifically profiling for marketing
- if certain conditions are met, a need to carry out a data protection impact assessment.
Broadly speaking, profiling is the collection and analysis of personal information to place an individual into a certain category or make predictions about them.
Automated decision-making is the ability to make decisions by technological means, without human involvement. Automated decisions can be made with or without profiling, and profiling can take place without automated decisions; the two overlap but neither requires the other.
The GDPR makes a distinction between solely automated decision-making including profiling and decision-making based on profiling that includes a human component to the decision that may take into account other mitigating factors. The human element has to be meaningful, and not simply there to ‘rubber-stamp’ the automated decision.
If your systems produce automated decisions that may have a legal or similarly significant effect on your customers, then you should have as part of that process the ability to include human intervention in the decision-making process. Instead of your customer having to actively object to you carrying out such processing, you should only carry it out under the following circumstances:
- It is necessary for entering into, or performance of, a contract between your customer and your company.
- It is authorised by Union or Member State law to which your company is subject and which also lays down suitable measures to safeguard your customers' rights and freedoms and legitimate interests (this will usually apply if you have a legal obligation to perform such processing).
- It is based on your customers' explicit consent.
Where a decision is made that your customer objects to, you should facilitate, at a minimum, their right to contest the decision and express their point of view.
You are further limited in producing automated decisions that are based on the processing of special categories of data.
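The rules above can be expressed as a small routing step in front of any automated decision. This is a simplified sketch of my own reading of the three circumstances, with invented names; it does not cover the additional limits on special categories of data and is no substitute for legal advice on a specific processing activity.

```python
from enum import Enum
from typing import Optional


class LegalGround(Enum):
    """The three circumstances listed above."""
    CONTRACT = "necessary for a contract"
    LAW = "authorised by Union or Member State law"
    EXPLICIT_CONSENT = "explicit consent"


def decide_route(significant_effect: bool,
                 ground: Optional[LegalGround],
                 contested: bool = False) -> str:
    """Return "automated" or "human_review" for a pending decision."""
    if not significant_effect:
        return "automated"
    if ground is None:
        # None of the three circumstances applies, so a person must decide.
        return "human_review"
    if contested:
        # The customer has objected: give them a meaningful human review.
        return "human_review"
    return "automated"
```

The human review at the end of this route has to be meaningful, as noted earlier; a reviewer who simply confirms the system's output does not change the nature of the decision.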
In a world of increasing availability and accessibility to tools and services around AI and Machine Learning that enable us to make automated decisions and perform profiling at scale, it is crucial that organisations take their obligations under the GDPR seriously in how they conduct themselves regarding automated decision-making. Ambitions need to be balanced against the rights of the people that we are serving and we shouldn't lose sight of our obligations to them.
I’ve talked about the need to collect and record consent in my previous post. If consent is the legal basis on which you are collecting personal information, it should be gained for each purpose for which you are processing that information. It should also be as easy to withdraw as it was to give.
You should be keeping a register of the purposes for which you process data, which you can tie to the consent that you collect when personal information is submitted to you. Having a managed register makes it easier to be transparent and accountable, and to keep track of the reasons that you are collecting this information.
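A register of this kind can start out very simply. The sketch below assumes one consent record per data subject and purpose, held in memory, with withdrawal as a single call so that it is as easy as giving consent. The class and purpose names are hypothetical, and a real register would persist its records and keep their history.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass
class ConsentRecord:
    subject_id: str
    purpose: str
    given_at: datetime
    withdrawn_at: Optional[datetime] = None


class ConsentRegister:
    """In-memory register keyed by (subject, purpose)."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], ConsentRecord] = {}

    def give(self, subject_id: str, purpose: str) -> None:
        self._records[(subject_id, purpose)] = ConsentRecord(
            subject_id, purpose, given_at=datetime.now(timezone.utc))

    def withdraw(self, subject_id: str, purpose: str) -> None:
        record = self._records.get((subject_id, purpose))
        if record is not None and record.withdrawn_at is None:
            record.withdrawn_at = datetime.now(timezone.utc)

    def has_consent(self, subject_id: str, purpose: str) -> bool:
        record = self._records.get((subject_id, purpose))
        return record is not None and record.withdrawn_at is None
```

Because consent is checked per purpose, consent to a newsletter says nothing about consent to profiling, which is the behaviour the text calls for.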
There are products available that offer consent management as a service that are worth looking at:
Can I treat non-EU citizen data differently?
The GDPR applies to EU based companies and companies that collect data of EU citizens, regardless of a physical presence in the EU. For example, if you’re a US-based company that supplies goods or services to EU citizens and US citizens, you could potentially apply a different treatment to the processing of personal information depending on which category your customer fell into. But my bet is that it is cheaper to apply the same policies across the board than to introduce and manage different data regimes for different categories of customers. In effect, non-EU based citizens may also benefit from the introduction of the GDPR.
GDPR is not Y2K
You might encounter an attitude in your company that compares the upcoming GDPR enforcement deadline to the Y2K countdown. The comparison holds in so far as some media and businesses are capitalising on the fear of those who are unprepared. It does not hold in so far as the millennium bug was just that, a bug, whereas the GDPR will be enforceable law as of May 25th 2018. You may also encounter the belief that there will be a grace period. There is one, and this is it: May 25th 2018 is the enforcement date for GDPR, and companies should be preparing for it before then.
I expect Data Protection Authorities to be practical and proportional when investigating breaches of GDPR compliance. If your company is making honest efforts towards an accountability roadmap, this will go a long way towards demonstrating the organisation's commitment to its goals. One thing I can say for certain though is that to stand still, keep the head down and hope for the best is absolutely not the responsible thing to do for your customers or your business.
Technical teams' most important actions now
I’ve discussed several measures in this post to start assessing your technical risk. To sum them up:
- Conduct a Data Flow Mapping exercise and classify the types of data you are collecting. This is a good first step towards understanding at a high level what your company is doing now and highlighting the gaps that need closing.
- Apply the 7 Principles of Privacy by Design to how you engineer software.
- Measure your systems against the OWASP Top 10.
- Look at how you manage your code repos and deployment practices.
- Secure your data at rest and in transit.
- Ensure that you have appropriate access controls for Personal Information.
- Ensure you have a Data Retention Policy and that it is enforced.
- Implement reviews and privacy impact assessments in response to changes to how you process personal information.
- Anonymise and Pseudonymise data where you can to mitigate risk.
- Review your third-party processors.
- Take a look at how employees are using their own devices to process personal information for work purposes.
- Review your data hosting arrangements.
- Look at how you currently implement automated decision-making and profiling.
- Assess the basis for which you are currently processing personal information. If it is based on consent, but you have not gained consent for the specific processing purposes, you may need to initiate a consent renewal project.
Data Protection by Design needs to be part of company culture
Data Protection by Design needs to be fostered through training, mentorship and certification. Maintaining those skills as technologies and best practices evolve and effectively onboarding new employees to build on that foundation is crucial. GDPR accountability is achieved through the three pillars of people, processes & technology and technical teams are central to realising the challenge and helping their organisations succeed at implementing it. It’s going to be an interesting year ahead!
General GDPR Resources
Privacy & Security Management Software
- Microsoft Compliance Manager
- Azure Information Protection Scanner
- Azure Advisor
- Azure Security Center
Frameworks & Tools to help with GDPR Compliance
- IBM Security GDPR Framework
- Nymity Risk and Control Checklists
- HP Accountability Model Tool
- AvePoint Privacy Impact Assessment Tool
- MS Threat Detection Tool
- MS GDPR Assessment Tool
Consent Management Tools
Relevant ISO Standards for Privacy by Design
- ISO/IEC 27001 - Defines the mandatory requirements for an Information Security Management System (ISMS).
- ISO/IEC 27002 - Standard of good practice for information security.
- ISO/IEC 27003 - Provides guidance for those implementing the ISO27k standards
- ISO/IEC 27010 - Information security management for inter-sector and inter-organisational communications
- ISO/IEC 27018 - Code of practice for protection of personally identifiable information (PII) in public clouds acting as PII processors.
- ISO/IEC 29151 - Code of practice for personally identifiable information protection
- ISO/IEC 19770 - Standards for IT asset management