GDPR Compliance: Technical Requirements for Linux Systems
What is GDPR?
GDPR, or General Data Protection Regulation, is a regulation that protects the personal data of citizens of the European Union (EU). It covers the handling of stored data at any given time, from its initial creation until its final deletion. One of the important parts is the right to ‘know’: individuals can ask what data is stored about them. They may also request that this data be deleted. You may know this as the “right to be forgotten”, which has already applied to Google for some years. The GDPR applies to all companies that store personal data from EU citizens. So even if you are based in the US, a happy shopper from the EU will get you in scope.
The challenge with regulations like GDPR is working out the steps to take on a technical level. While most of the policy makes sense, its translation into actionable technical implementation steps is nowhere to be found. We created this blog post to get you started with some best practices for Linux systems.
To whom does GDPR apply?
If you store personal information about citizens from the European Union, GDPR applies to your organization. Organizations located in the EU typically hold at least personal information about their personnel. If you provide services or products to individuals, then most likely you will also have EU citizens in your database.
In that case, you have to take additional measures to protect sensitive personal data. Here are some examples of parties that usually will have to be compliant with GDPR:
- Associations with memberships
- Dating sites
- E-commerce
- Forums
- Hosting companies
- Marketplaces
- Sports clubs
- Web shops
As you can see, both commercial and non-commercial entities will have to comply.
Technical requirements for GDPR
All data starts with the point of entry (creation) and ends with its deletion. In between, there is the transportation, processing, and storage of it. With this regulation, it is not that easy to mandate specific technical controls. The regulation itself deals with safeguarding personal data. Unfortunately, it does not explain how to do this. For that reason, it is good to use the security pillars: Confidentiality, Integrity, and Availability. By applying technical measures to ensure these three principles, we can get closer to complying with the rationale of the GDPR.
Security scanning
As there are so many actions that you can take, the first step is getting to know your systems. Even if you already applied some system hardening in the past, there is still a lot to learn by taking regular measurements. These measurements can be done with Lynis, the open source security scanner that Michael Boelen created in 2007. Despite its age, the project is still maintained and light on its requirements to run. Lynis will measure the security defenses on a system and propose room for improvement. Taking regular measurements (daily!) has several benefits. One is that you will learn about the things that can be improved, even if your network or IT landscape changes. The second benefit is that you can show proof that you do regular testing. If you still have a breach, then at least you can’t be blamed for not performing these regular tests.
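To track these measurements over time, the results of each scan can be reduced to a few numbers. The sketch below assumes that the Lynis report file (commonly /var/log/lynis-report.dat) consists of key=value lines, with a hardening_index entry and repeated warning[] and suggestion[] entries. The function name and the sample data are made up for this example, so check them against a report from your own systems.

```python
def summarize_lynis_report(report_text):
    """Extract the hardening index and count warnings and suggestions."""
    summary = {"hardening_index": None, "warnings": 0, "suggestions": 0}
    for line in report_text.splitlines():
        line = line.strip()
        # Skip empty lines, comments, and anything that is not key=value
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key == "hardening_index" and value.isdigit():
            summary["hardening_index"] = int(value)
        elif key == "warning[]":
            summary["warnings"] += 1
        elif key == "suggestion[]":
            summary["suggestions"] += 1
    return summary


# Sample data, invented for illustration (not output of a real scan)
SAMPLE_REPORT = """\
# Lynis report (sample)
report_version_major=1
hardening_index=67
warning[]=EXAMPLE-0001|Example warning text|-|-|
suggestion[]=EXAMPLE-0002|First example suggestion|-|-|
suggestion[]=EXAMPLE-0003|Second example suggestion|-|-|
"""

if __name__ == "__main__":
    print(summarize_lynis_report(SAMPLE_REPORT))
```

Storing one such summary per system per day gives a simple trend line and doubles as evidence that regular testing takes place.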
Auditing and Events
One of the important topics in GDPR is dealing with breaches. Systems are as safe as their weakest link, and most likely there are multiple weak links in each network. No matter how well you try to protect your systems, you should assume that a breach may happen eventually. To detect a possible breach, logging should be configured in the first place. Most Linux services have this enabled by default, but tuning might be needed. Important areas include failed login attempts: on the console, via SSH, and in applications that offer authentication.
Besides logging, the need for proper auditing has increased over the years. When an investigation is needed, you will want full details of what exactly happened on each system. This can be achieved with the Linux audit framework.
Implementation tips for Linux (auditing and logging)
- Implement the Linux Audit framework and monitor for suspicious events
- Set up remote logging, to ensure log files are available and can’t be erased by attackers
- Use a central management interface to collect logging and apply the first level of automatic filtering
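As an illustration of a first level of automatic filtering, the following sketch counts failed SSH password attempts per source address. It assumes the usual OpenSSH message format ("Failed password for ... from ... port ..."). The function name, the threshold, and the sample lines are assumptions for this example, and your log format may differ.

```python
import re
from collections import Counter

# Matches OpenSSH messages for failed password attempts (assumed format)
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\S+) port \d+"
)


def failed_logins_by_source(lines, threshold=3):
    """Return source addresses with at least `threshold` failed logins."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(2)] += 1
    return {source: n for source, n in counts.items() if n >= threshold}


# Sample lines, invented for illustration (documentation IP ranges)
SAMPLE_LOG = [
    "Jan 10 10:00:01 host sshd[1234]: Failed password for invalid user admin from 203.0.113.5 port 52244 ssh2",
    "Jan 10 10:00:03 host sshd[1234]: Failed password for root from 203.0.113.5 port 52250 ssh2",
    "Jan 10 10:00:05 host sshd[1234]: Failed password for root from 203.0.113.5 port 52251 ssh2",
    "Jan 10 10:01:00 host sshd[1240]: Failed password for alice from 198.51.100.7 port 40000 ssh2",
    "Jan 10 10:02:00 host sshd[1250]: Accepted password for alice from 198.51.100.7 port 40002 ssh2",
]

if __name__ == "__main__":
    print(failed_logins_by_source(SAMPLE_LOG))
```

In practice, such a filter would run on the central log server, so that an attacker who controls a single system cannot tamper with the input.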
Availability and Backups
When we think of the availability of data, the first thing that comes to mind might be high-availability (HA) software solutions. While those help with service uptime, they do not do much to protect the data itself. From a technical point of view, backups are more interesting. It starts with creating the backups (safely) and protecting them as well as your original data. Your (next) backup solution might need a cryptographic library to encrypt the data. The backup data should only be readable by those holding the decryption key.
One aspect of backups is often skipped: the restore. And as we know, your backup is only as good as your restore. If you can’t restore data, your backup is worthless. You can only know how good your backup is by doing regular restores. Consider this a requirement for your backup solution as well, such as having the option to perform automatic restores.
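A restore test can be as simple as restoring to a scratch directory and comparing checksums with the original data. The sketch below is one possible way to do that with SHA-256. The function names are made up for this example, and it reads each file fully into memory, which is only suitable for modest file sizes.

```python
import hashlib
from pathlib import Path


def tree_checksums(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    checksums = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            checksums[path.relative_to(root).as_posix()] = digest
    return checksums


def verify_restore(original_dir, restored_dir):
    """Return (missing, changed) lists of relative file paths."""
    original = tree_checksums(original_dir)
    restored = tree_checksums(restored_dir)
    # Files present in the original but absent after the restore
    missing = sorted(set(original) - set(restored))
    # Files present in both, but with different content
    changed = sorted(
        name for name in original
        if name in restored and original[name] != restored[name]
    )
    return missing, changed
```

Two empty lists mean that every original file came back unchanged. Data that changed between the backup and the test will show up as a difference, so compare against a snapshot taken at backup time where possible.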
Network filtering and firewalls
Data should only flow to places where it really needs to be. Most companies already use network firewalls, yet they don’t filter traffic between systems in the same network segment. This is a serious risk, as the intrusion of a single system can lead to more systems being breached.
The deployment of iptables on Linux systems can be a simple solution to contain data streams to a bare minimum. Depending on the role of the system, allow the protocols related to the services that should be reachable. On top of that, open up the generic management protocols (port 22 for SSH, the ports for monitoring, etc).
Best practices for network filtering and firewalls
- Use “default deny”
- Keep the firewall updated
- Log sensitive data streams
- Perform regular audits of firewall configurations
- Mark exceptions properly, with an end date or review date
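To make “default deny” more concrete, the sketch below generates a minimal set of iptables commands for inbound TCP traffic. It only builds the command strings and does not execute anything. The function name is made up for this example, and the rule set is an assumption about a typical server, not a complete policy (it ignores IPv6, UDP, ICMP, and logging).

```python
def default_deny_rules(allowed_tcp_ports):
    """Build iptables commands: allow listed TCP ports, drop the rest."""
    rules = [
        # Always allow loopback traffic and replies to existing connections
        "iptables -A INPUT -i lo -j ACCEPT",
        "iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
    ]
    for port in sorted(set(allowed_tcp_ports)):
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid TCP port: {port}")
        rules.append(f"iptables -A INPUT -p tcp --dport {port} -j ACCEPT")
    # Set the default policies last, so an active SSH session
    # is not cut off before its accept rule exists
    rules.append("iptables -P INPUT DROP")
    rules.append("iptables -P FORWARD DROP")
    return rules


if __name__ == "__main__":
    # Example: a web server that is managed via SSH
    for rule in default_deny_rules([22, 443]):
        print(rule)
```

Review the generated commands before applying them, and test on a system where you have console access in case a rule locks you out.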
Software patch management
Almost every software package on this planet has flaws. Fortunately, most of these so-called bugs do not have a huge impact. A small percentage of bugs result in a security issue that can be misused. These are the ones that we know as software vulnerabilities. Almost any Linux distribution has a way to provide software and patches.
The first advice is to have a process in place to test and deploy security patches. Where possible use central solutions that help with deployment and automation. A good example is Red Hat Satellite for RHEL, or Canonical Landscape for Ubuntu systems. If you don’t use these, then at least script the deployment of security patches, or leverage a tool like unattended-upgrades.
Best practices for software patching
- Use staging for testing software
- Deploy software on a regular basis
- Apply security patches as quickly as possible with automation
General GDPR principles and tips
The “data as cold coffee” principle
Most people don’t like a pot of old (and cold) coffee. When it comes to data, we tend to be on the cautious side and keep storing it for years. Like you shouldn’t heat up cold coffee, you should not keep data too long. Reduce the risks of storing sensitive data where you can. For example, delete data when there is no longer a real need to keep it.
The period to keep data differs and should be based on the underlying business purpose. For example, data that you need to keep for regulatory reasons, like accounting data and financials, may have to be retained for multiple years. For the purpose of forensics, some data, such as log files and events coming from auditd, might only be useful for months. So depending on the data, define a clear point after which it can be safely deleted, and throw data away when possible.
Also keep hardware and storage in mind, as they can contain old data. Apply proper decommissioning steps. One of them is the secure wiping of data from removable disks and storage media.
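A retention period only helps when something enforces it. As a minimal sketch, the function below lists the files in a directory whose modification time is older than a given number of days. The function name and the 90-day period in the usage note are assumptions for this example. It only reports candidates and leaves the deletion to you.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def expired_files(directory, max_age_days, now=None):
    """Return files in `directory` last modified over `max_age_days` ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        path for path in Path(directory).iterdir()
        if path.is_file() and path.stat().st_mtime < cutoff
    )
```

For example, expired_files("/var/log/audit", 90) would list audit logs untouched for more than 90 days, assuming that directory exists and is readable. For regular log files, a tool like logrotate is usually the better place to configure this.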
Passwords, passwords, and passwords
We all know that strong passwords are better than “Welcome01”. Still, most systems and software allow you to choose weak passwords. Use a module like pam_cracklib or pam_pwquality to enforce the usage of strong passwords.
Besides strong passwords, consider using two-factor authentication. This means that you need two different forms of authentication to prove your identity. The first one is the combination of your username and password. The second one could be a token generated on your mobile phone. You can use a project like Google Authenticator PAM. This pluggable authentication module uses the common Google Authenticator app. It can be used together with SSH and other forms of authentication.
For Linux systems, it is also a good idea to lock out accounts after a few failed attempts. This limits the risk of brute-force attacks, which continuously try to log in. Another benefit is a lower number of events to deal with. Finally, a lockout can be a good reason to watch for other types of attacks.
Accountability
Don’t use functional accounts for system administration. Instead, give each administrator their own account. This helps with accounting and keeps everyone accessing the system a little bit more honest. People tend to be more careful when making changes under their own name.
Secure protocols only
The usage of telnet and other plain-text protocols should be avoided. Use safer alternatives like SSH. Add encryption to those services that support it. One of them is SMTP, the protocol used for sending emails. Even if not all mail servers use encryption at this moment, the big hosting providers have already turned it on. It helps against snooping and possibly also against leaking sensitive information.
Do you have other tips or questions related to GDPR and technical requirements on Linux systems? Use the comments below.