GDPR Compliance: Technical Requirements for Linux Systems
What is GDPR?
The General Data Protection Regulation (GDPR) protects personal data stored about individuals in the European Union. It covers the handling of data at any given moment, from entry to deletion. An important part is that individuals have the right to request the data stored about them and the right to have that data erased. You may know this as the “right to be forgotten”, which has applied to Google for some years now when EU citizens request it. The GDPR applies to all companies that store personal data of EU citizens. So even if you are based in the US, a happy shopper from the EU will bring you in scope.
The challenge with regulations like GDPR is translating them into steps you can take on a technical level. While most of the policy makes sense, concrete technical implementation steps are nowhere to be found. We created this blog post to get you started with best practices that may apply to Linux systems.
Who does GDPR apply to?
If you store personal information about citizens of the European Union, GDPR applies to your organization. Typically, all organizations located in the EU store at least information about their personnel. If you provide services or products to individuals, you will most likely have EU citizens in your database. Here are some examples of companies that usually have to be aware of GDPR and take additional measures to secure personal data:
- Dating sites
- Hosting companies
- Market places
- Web shops
Technical requirements for GDPR
Protecting data starts at the point of entry, continues with the storage devices that hold it, and ends with the confirmation that all copies have finally been deleted. The security pillars Confidentiality, Integrity, and Availability are useful to translate the GDPR policy into technical implementation steps.
Auditing and Events
One of the important topics in GDPR is dealing with breaches. Systems are as safe as their weakest link, and most likely there are multiple weak links in each network. So however well you try to protect your systems, you should assume that a breach may happen eventually. To detect a possible breach, logging should be configured in the first place. Most Linux processes have this enabled by default, but tuning might be needed. Important areas include failed login attempts: on the console, via SSH, and in applications that offer authentication.
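To give an idea of what to look for, here is a small sketch that counts failed SSH logins in log output. The log lines below are a made-up sample; on a live system you would feed in the output of journalctl or /var/log/auth.log instead.

```shell
# Hypothetical sample of auth log lines, to show what failed logins look like;
# on a live system, read the journal (journalctl -u ssh) or /var/log/auth.log.
sample='Jan 10 10:00:01 srv sshd[123]: Failed password for invalid user admin from 203.0.113.5 port 4242 ssh2
Jan 10 10:00:05 srv sshd[124]: Accepted password for alice from 198.51.100.7 port 5151 ssh2'

# Count failed attempts; the same grep works on real auth log output.
failed=$(printf '%s\n' "$sample" | grep -c "Failed password")
echo "failed attempts: $failed"
```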
Besides logging, the need for proper auditing has increased over the years. In the event that an investigation is needed, you might want full details on what exactly happened on your systems. This can be achieved with the Linux audit framework.
Implementation tips for Linux (auditing and logging)
- Implement the Linux Audit framework and monitor for suspicious events
- Set up remote logging, to ensure log files are available and can’t be erased by attackers
- Use a central management interface to collect logging and apply a first level of automatic filtering
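As a starting point for the audit framework, a few example watch rules (the watched files are standard system files, but this rule set and the key names are illustrative; place them in a file under /etc/audit/rules.d/ and tune for your environment):

```
# Watch identity files for writes and attribute changes
-w /etc/passwd -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/group -p wa -k identity

# Watch privileged configuration
-w /etc/sudoers -p wa -k privileged
```

Events matching these rules can then be searched with ausearch -k identity.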
Availability and Backups
When we think of availability of data, the first thing that comes to mind might be high-availability (HA) software solutions. While that helps with service uptime, it does little to protect the data itself. From a technical point of view, backups are more interesting. It starts with creating the backups (safely) and protecting them as well as your original data. One of the requirements for your (next) backup solution might be the presence of a cryptographic library to encrypt the data. While it may be harder to have your live database fully encrypted, the backup data should be readable only by those holding the key to unlock it.
One aspect of backups is often skipped: the restore. As we know, your backup is only as good as your restore. If you can’t restore data, your backup is worthless, and you can only know how good your backup is by doing regular restores. Consider this a requirement for your backup solution as well, such as having the option to perform automatic restores.
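As an illustration of encrypted backups, here is a minimal sketch using tar and openssl. The function names and the passphrase-file approach are our own; a production setup would use proper key management, for example GPG with asymmetric keys.

```shell
# encrypt_backup SRC OUT PASSFILE: tar, compress, and encrypt a directory.
encrypt_backup() {
    src=$1 out=$2 passfile=$3
    tar -czf - "$src" | openssl enc -aes-256-cbc -pbkdf2 -salt \
        -pass "file:$passfile" -out "$out"
}

# decrypt_backup IN PASSFILE: decrypt and unpack into the current directory.
decrypt_backup() {
    in=$1 passfile=$2
    openssl enc -d -aes-256-cbc -pbkdf2 -pass "file:$passfile" -in "$in" | tar -xzf -
}

# Example usage (hypothetical paths):
# encrypt_backup /var/lib/app backup.tar.gz.enc /root/backup.pass
```

Note that restoring is part of the sketch as well: a scheduled decrypt_backup into a scratch directory is a simple way to automate restore testing.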
Network filtering and firewalls
Data should only flow to places where it really needs to be. Most companies already use network firewalls, yet they don’t filter traffic between systems in the same network segment. This is a serious risk, as the intrusion of one single system can result in more systems being breached.
The deployment of iptables on Linux systems can be a simple way to contain data streams to a bare minimum. Depending on the role of the system, allow the protocols related to the services that should be reachable. On top of that, open up the generic management protocols (port 22 for SSH, the ports for monitoring, etc.).
Best practices for network filtering and firewalls
- Use “default deny”
- Keep the firewall updated
- Perform regular audits of firewall configurations
- Mark exceptions properly, with an end date or review date
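A minimal “default deny” sketch with iptables for a web server role might look like this (the allowed ports are examples for this role; run as root, and test from a console first so you don’t lock yourself out of a remote system):

```shell
# Default deny for incoming traffic, but keep established sessions working
iptables -P INPUT DROP
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -i lo -j ACCEPT

# Allow the services this system should offer (example: SSH and HTTPS)
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT
```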
Software patching
Almost every software package on this planet has flaws. Fortunately, most of these so-called bugs do not have a huge impact. A small percentage of bugs result in a security issue that can be misused. These are the ones we know as software vulnerabilities. Almost every Linux distribution has a way to provide patches for such security issues in the form of updated software packages.
The first advice is to have a process in place to test and deploy security patches. Where possible, use central solutions that help with deployment and automation. Good examples are Red Hat Satellite for RHEL and Canonical Landscape for Ubuntu systems. If you don’t use these, then at least script the deployment of security patches, or leverage a tool like unattended-upgrades.
Best practices for software patching
- Use staging for testing software
- Deploy software on a regular basis
- Apply security patches as quickly as possible, with automation
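On Debian and Ubuntu, for example, unattended-upgrades can apply security patches automatically. A common minimal configuration looks like this (install the unattended-upgrades package first):

```
# /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```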
General GDPR principles and tips
The “data as cold coffee” principle
Most people don’t like a pot of old (and cold) coffee. When it comes to data, however, we tend to be on the cautious side and keep storing it for years. Like you shouldn’t heat up cold coffee, you shouldn’t keep data too long. So reduce the risks of storing sensitive data by deleting it when there is no longer a real need to keep it.
The period to keep data differs and should be based on the underlying business purpose. For example, if you need to keep data for regulatory reasons, like accounting and financial records, the period could be multiple years. For the purpose of forensics, data like log files and events coming from auditd might be useful up to a year. So depending on the data, define a clear point from which it can be safely deleted, and throw it away when possible.
Also keep hardware and storage in mind, as they can contain old data. Proper decommissioning steps should be applied, one of them being the secure wiping of data from disks and removable storage media.
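The “delete when no longer needed” step can be automated. A minimal sketch with find (the helper name and the example path are our own; set the retention period per data type and test with -print before adding -delete):

```shell
# purge_older_than DIR DAYS: delete regular files in DIR older than DAYS days
purge_older_than() {
    dir=$1
    days=$2
    find "$dir" -type f -mtime +"$days" -print -delete
}

# Example: purge application logs after one year (hypothetical path)
# purge_older_than /var/log/myapp 365
```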
Passwords, passwords, and passwords
We all know that strong passwords are better than “Welcome01”, yet most systems and software still allow you to choose weak passwords. Use a PAM module like pam_cracklib or pam_pwquality to enforce the usage of strong passwords.
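With pam_pwquality, password requirements can be set in /etc/security/pwquality.conf. The values below are example choices to illustrate the options, not official recommendations:

```
# /etc/security/pwquality.conf
minlen = 12        # minimum password length
minclass = 3       # require characters from at least 3 classes
dcredit = -1       # require at least one digit
```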
Besides strong passwords, consider the usage of two-factor authentication. This means you need two different forms of authentication to prove your identity. The first is the combination of your username and password; the second could be a token generated on your mobile phone. You can use a project like the Google Authenticator PAM module. This pluggable authentication module works with the common Google Authenticator app and can be used together with SSH and other forms of authentication.
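After installing the module (package libpam-google-authenticator on Debian/Ubuntu) and running google-authenticator per user, SSH can be told to ask for a verification code. This is a common minimal setup; details vary per distribution:

```
# /etc/pam.d/sshd — add at the end
auth required pam_google_authenticator.so

# /etc/ssh/sshd_config
ChallengeResponseAuthentication yes
```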
For Linux systems, it is also a good idea to lock accounts after a few failed login attempts. This limits the risk of brute-force attacks that continuously try to log in, reduces log noise, and can be a good trigger to watch for other types of attacks.
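Account lockout can be configured with pam_faillock (older systems use pam_tally2 instead). The values below are examples; adjust them to your policy:

```
# /etc/security/faillock.conf
deny = 5               # lock the account after 5 failed attempts
unlock_time = 900      # automatically unlock after 15 minutes
```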
Make every person unique
Don’t use shared functional accounts for system administration. Instead, give each administrator their own account. This helps with accountability and keeps everyone accessing the system a little more honest, as they apply changes under their own name.
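In practice that means one personal account per administrator plus sudo, instead of direct logins on a shared root account. For example (run as root; the user name is hypothetical, and the group name wheel is RHEL-style, Debian/Ubuntu use the sudo group):

```shell
# Create a personal account for an administrator and grant sudo rights
useradd -m -c "Alice Admin" alice
passwd alice
usermod -aG wheel alice   # use the "sudo" group on Debian/Ubuntu
```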
Secure protocols only
In this day and age, the usage of telnet and other plain-text protocols should be avoided. Use safer alternatives, like SSH. Where possible, add encryption to each service that is exposed. One of them is SMTP, the protocol used for sending email. Even if not all mail servers use encryption at this moment, the big hosting providers have already turned it on. It helps against snooping and possibly leaking sensitive information.
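Whether a mail server offers encryption can be verified with openssl (replace mail.example.com with the server you want to test; this requires network access to the server):

```shell
# Check if an SMTP server supports STARTTLS and inspect its certificate
openssl s_client -starttls smtp -connect mail.example.com:25
```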
Got other tips or questions related to GDPR and technical requirements on Linux systems? Use the comments below.