Data Loss Prevention (DLP) is one of those terms that is often mentioned but less often defined. The term can be as ambiguous as its scope, which can range from narrow to very broad. So what is DLP and why does it matter?
Data Loss Prevention (DLP) is an effort to reduce the risk of sensitive data being exposed to unauthorized persons. Data is extremely valuable to organizations. Just think of trade secrets, financial information, research data, health information, personal information, source code, or credit card numbers and you begin to understand both the value this data holds for the organization and the threat its unauthorized disclosure would pose to a company. Data loss prevention addresses this threat by enacting controls that limit access to data and its distribution. DLP still establishes controls to restrict outsiders, but it has a major focus on controlling the usage of data within the organization.
Information security efforts have historically been focused on preventing attacks from outside the organization. Controls such as firewalls, network segmentation, and extensive physical controls try to keep the bad guys out, but this is only part of an information security framework. Numerous studies (see further reading below) have identified the weakest information security link as human error or insider threats.
One method DLP uses is content filtering. Content filtering blocks sensitive communication from leaving the organization by filtering instant messages, emails, file transfers, web pages, and many other data transfer methods. DLP programs need to be able to work with many different data types and transmission methods. For example, a user may email a sensitive Word document, store it on an unencrypted flash drive, or download it to a mobile phone. Each of these scenarios, and thousands more, needs to be handled by DLP.
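As a rough illustration of content filtering, the sketch below scans an outbound message for text that looks like a credit card number and flags it for blocking. The function names and the pattern are hypothetical and are not taken from any particular DLP product; a real filter would cover many more data types and file formats. The Luhn checksum is used only to reduce false positives on arbitrary digit runs.

```python
import re
from typing import List

# Assumed pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> List[str]:
    """Return digit strings in the text that look like valid card numbers."""
    found = []
    for candidate in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", candidate.group())
        if luhn_valid(digits):
            found.append(digits)
    return found


def should_block(message: str) -> bool:
    """Hypothetical policy: block any outbound message with a likely card number."""
    return bool(find_card_numbers(message))
```

A mail gateway or chat proxy could call should_block on each outbound message body and quarantine anything that returns True. Pattern matching of this kind only sees plain text, so attachments and encrypted content would need separate handling.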
The first step is to determine what data needs to be protected. Above we mentioned trade secrets, financial information, research data, health information, personal information, source code, and credit card numbers. These are just some examples of the data an organization holds. Organizations need to decide what to protect, and to what extent, by weighing how critical each type of information is to the business against the loss the organization would incur if the data were disclosed to unauthorized entities.
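One simple way to make this decision concrete is to score each data type and rank the results. The sketch below is only an assumption about how such a ranking might look: the 1 to 5 scales, the multiplication of the two scores, and the sample values are all invented for illustration and would need to come from the organization's own assessment.

```python
from dataclasses import dataclass


@dataclass
class DataType:
    name: str
    criticality: int      # assumed scale: 1 (low) to 5 (high) importance to the business
    disclosure_loss: int  # assumed scale: 1 (low) to 5 (high) harm if disclosed

    @property
    def priority(self) -> int:
        # Assumed scoring rule: higher product means protect first.
        return self.criticality * self.disclosure_loss


def rank_by_priority(data_types):
    """Return data types ordered from highest to lowest protection priority."""
    return sorted(data_types, key=lambda d: d.priority, reverse=True)


# Illustrative values only; these are not measurements.
inventory = [
    DataType("source code", criticality=5, disclosure_loss=3),
    DataType("credit card numbers", criticality=4, disclosure_loss=5),
    DataType("research data", criticality=3, disclosure_loss=2),
]

for item in rank_by_priority(inventory):
    print(item.name, item.priority)
```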
Once the organization understands what it needs to protect, data loss threats to this data can be identified along with effective controls to mitigate such threats. One way to more effectively identify threats is to consider the different states data can be in. These states are as follows:
Data at rest – data that is stored such as data in databases, file shares, backup tapes, laptops, or external storage devices. Data at rest is an important state because it is here that data spends most of its time.
Data in motion – data that is being transmitted from one location to another. As data changes state from being at rest to being in motion, it may become unencrypted or travel over an insecure network. This is why it is important to look at this state.
Data being accessed – data that is being used by a user such as an open Word document, a report being viewed in a conference room, or statistics displayed on a cell phone widget. Data being accessed has already passed many information security controls, so it is available to the authenticated user. It may be available to others as well. Shoulder surfing, unlocked and logged-in desktops, and printouts left on a desk are all potential ways data can be exposed.
Let’s consider a case study for one type of data so that data loss prevention becomes clearer. A small business determines that financial data needs to be protected. The financial data is stored in a database that is attached to a managerial portal on the company intranet. Accountants use a custom application to input financial data into the database. Each week, managers generate reports and store them on a shared drive. The database and the shared drive are backed up nightly to tapes that are stored in a vault at the company headquarters.
This case study already identified the financial data as something that needs to be protected from disclosure. The company further specifies that financial data should be available only to managers, accounting staff, executives, the IRS, and outside auditors.
First, we will look at the data at rest. The data is stored in the database, on the file server, and on backup tapes. Data loss prevention can protect the database by limiting the accounts that can directly access it and by assigning the minimum level of access to each account. Next, strict access controls would be established on the file server share and on the file server itself. We need to consider administrative access to the server because anyone who can log onto the server with administrative credentials will have access to the shares as well. Administrators will need to be restricted to members of the groups identified above as having access. Tapes could be encrypted and stored in an area separate from less sensitive data.
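The idea of limiting access to the named groups and granting each the minimum level it needs can be sketched as a default-deny lookup. The group names follow the case study, but the specific access level assigned to each group is an assumption made for this example, as are the names Access, FINANCIAL_DATA_ACCESS, and is_allowed.

```python
from enum import Enum


class Access(Enum):
    NONE = 0
    READ = 1
    WRITE = 2


# Groups come from the case study; the levels are assumed for illustration.
FINANCIAL_DATA_ACCESS = {
    "managers": Access.READ,
    "accounting": Access.WRITE,
    "executives": Access.READ,
    "irs": Access.READ,
    "outside_auditors": Access.READ,
}


def is_allowed(group: str, requested: Access) -> bool:
    """Default deny: unknown groups get no access, known groups get at most their level."""
    granted = FINANCIAL_DATA_ACCESS.get(group, Access.NONE)
    return Access.NONE.value < requested.value <= granted.value
```

Because any group missing from the table falls through to Access.NONE, a server administrator who is not also in one of the listed groups is refused, which mirrors the restriction described above.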
Next, we look at data in motion. The data is in motion when it is accessed through the intranet. Granular access controls could be established for intranet access, and the communication channel could be encrypted.
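For the encrypted channel, a client that talks to the intranet portal could be required to use TLS with certificate verification. The following is a minimal sketch using Python's standard ssl module; the function name and the choice of TLS 1.2 as the minimum version are assumptions, not requirements stated in the case study.

```python
import ssl


def make_intranet_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context for connections to the intranet portal."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumed policy: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The resulting context can be passed to standard library clients such as http.client.HTTPSConnection through its context parameter. Encrypting the channel protects the data while it travels, but it does not replace the access controls on the portal itself.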
Lastly, data being accessed would include reports viewed through the intranet and accounting data updated by accountants. Client-side caching of data would need to be restricted as part of the data loss prevention system. The accountants also interface with the data through the custom program. This program would need to be evaluated for any information security holes, including developer access to financial data. Now, what would prevent managers from storing the financial reports on their local machines? With the information given, we do not know if this happens, but it would need to be addressed, possibly through a policy stating that the reports cannot be stored locally or by encrypting local hard drives.
This simple example addresses only a small part of data loss prevention. A true information security analysis would include much more than this, such as whether computers accessing the data contain malware or what to do if financial data is emailed or sent via instant messaging. Additionally, it is not enough to just say that data should be encrypted. A detailed design needs to be specified for the encryption if the data loss prevention controls are to be effective.
Bruce Schneier points out the importance of a well-architected data loss prevention design in his June 2010 article “Data at Rest vs. Data in Motion,” where he discusses encrypting credit card information for use in a website.
“If the database were encrypted, the website would need the key. But if the key were on the same network as the data, what would be the point of encrypting it? Access to the website equals access to the database in either case. Security is achieved by good access control on the website and database, not by encrypting the data.” (Bruce Schneier)
Those implementing data loss prevention need to have a good understanding of how to architect information security controls and to implement controls in layers so that if one control is compromised another control still prevents data loss. Remember, information security is only as effective as its weakest link.
Data loss prevention is a worthy goal and an excellent information security initiative, but it requires high-level decision making from the beginning and a comprehensive analysis of threats and controls. It also requires an understanding of the workflow surrounding organizational data and a detailed design for each control if that control is to be effective.