Consider the following:
- A single bank transaction may get replicated across 100 systems.
- Storage is so cheap that enterprises collect petabytes of data each year and keep almost all of it.
- Data is routinely propagated across the enterprise to support a wide variety of users and business initiatives.
Unfortunately, the massive growth in data collection and proliferation has not been matched by a comparable effort in data management and governance.
The consequences have been painful. Data breaches. Misuse of private data. Loss of consumer trust. In response, companies have poured resources into implementing security controls to block or restrict access to their data. But whereas Security is focused on who is using the data, Privacy is about how the data is being used and for what purpose.
Meanwhile, regulations like GDPR and CCPA are obligating companies to respect and respond to Data Subject Access Requests (DSARs) like the “right-to-be-forgotten”. But achieving basic compliance requires that companies understand what personal information they have, where it’s located, and its purpose. Up until now, the basic data inventory process has been a manual one consisting of application data owner surveys and spreadsheets.
DSARs push the manual process to its breaking point, not only in the people required to manually search those 100 systems in the bank example for each DSAR, but also in the accuracy and completeness needed to be defensible to regulators. It is a big data problem, and a new approach is required to process petabytes of data, extract key data points, and derive the relationships between them. Companies have been left scrambling to meet their obligations.
Five Critical DSAR Process and Fulfillment Capabilities
The five critical DSAR process and fulfillment capabilities are intake, verification, search, deletion, and response. DSAR fulfillment is critical to compliance with both the California Consumer Privacy Act (CCPA) and the General Data Protection Regulation (GDPR). While CCPA and GDPR each have their own take on DSAR processes, these five capabilities are a must for any data privacy and data management initiative.
During intake, a data subject makes a request via email, an online form, or other communiqué. The enterprise then needs to verify the requestor’s identity and existence within the data ecosystem and track the request through to resolution, all within the required timeline (30-45 days, depending on the regulation).
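As a rough illustration of tracking a request against its deadline, here is a minimal Python sketch. The class name, field names, and the mapping of 30 days to GDPR and 45 days to CCPA are assumptions made for the example; actual deadlines depend on the wording of each regulation and any permitted extensions.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed response windows in days (placeholders, not legal guidance).
RESPONSE_WINDOW_DAYS = {"GDPR": 30, "CCPA": 45}


@dataclass
class DsarRequest:
    """Hypothetical intake record for a single data subject request."""
    request_id: str
    regulation: str
    received_on: date

    def due_date(self) -> date:
        # Deadline is counted from the day the request was received.
        return self.received_on + timedelta(days=RESPONSE_WINDOW_DAYS[self.regulation])

    def days_remaining(self, today: date) -> int:
        # A negative value means the request is overdue.
        return (self.due_date() - today).days


request = DsarRequest("REQ-001", "CCPA", date(2020, 1, 1))
print(request.due_date())
print(request.days_remaining(date(2020, 1, 31)))
```

A tracker like this would typically feed the reporting dashboard so that requests approaching their deadline can be escalated.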
The next step is verification of the requestor’s identity. For companies that provide services online, this step may require customers to log in and verify their identity. For regulations like GDPR, which may include employees and vendors, this requires that the enterprise confirm the existence of the data subject anywhere in its ecosystem and then identify the corresponding information to include in the response.
In order to fulfill the request, the enterprise will need to locate a requestor’s personal data by searching across its data ecosystem. The type of information the enterprise will be searching for will differ based on data subject type. For example, is the data subject a current customer or a former employee? CCPA only applies to ‘California consumers’ whereas GDPR also includes employees and contractors (privacy by design would look to encompass current and potential future scenarios). The search process identifies relevant PI attributes, categories, and the company’s purpose for collecting and processing the subject’s information. The search then needs to identify the specific systems and locations that contain the data subject’s personal data.
For deletion requests, the enterprise will need to validate which systems the data can be deleted from, based on regulatory or business constraints. An example of a business constraint might be a warranty registration database that contains personal information. The enterprise cannot delete customer information from this database because doing so would impede its ability to fulfill a legal obligation to provide the customer with, say, an extended warranty on their purchase.
Next, the enterprise will need to initiate a DSAR process to delete or obfuscate the customer’s data from the relevant systems, as well as request the same from third-party data processors. Lastly, the enterprise will need to audit and confirm the deletions.
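The deletion checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SystemRecord` type, the `plan_deletion` helper, and the system names are hypothetical, and a real process would draw retention constraints from legal and business review rather than a hard-coded list.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SystemRecord:
    """Hypothetical entry: a system holding the subject's data."""
    system: str
    retention_reason: Optional[str] = None  # None means no known constraint


def plan_deletion(records: List[SystemRecord]) -> Tuple[List[str], Dict[str, str]]:
    """Split systems into those to delete from and those retained, with reasons."""
    to_delete = [r.system for r in records if not r.retention_reason]
    retained = {r.system: r.retention_reason for r in records if r.retention_reason}
    return to_delete, retained


records = [
    SystemRecord("crm"),
    SystemRecord("marketing_email"),
    SystemRecord("warranty_registration", "legal obligation: extended warranty"),
]
to_delete, retained = plan_deletion(records)
print("Delete from:", to_delete)
print("Retain (with reason):", retained)
```

Keeping the retention reason alongside each retained system gives the audit trail something concrete to point to when the response explains why some data was not deleted.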
Templates help ensure an efficient and consistent DSAR fulfillment process. All communications and activities should roll into a reporting dashboard and audit trail to demonstrate accountability, compliance, and progress towards resolving requests.
Which of these five DSAR capabilities is the most challenging?
For many organizations, the most complex, tedious, and resource-intensive step in the DSAR process is finding PI and tying it back to the data subject.
Why is identifying data subjects and their sensitive data so complex?
Not only has data proliferated, but it’s also mutated into derivative forms. Customer data is often collected across multiple channels without being linked to a master identifier. Also, when downstream systems aren’t updated there can be discrepancies between primary and secondary systems.
To make matters worse, both the regulatory environment and the definition of sensitive data are changing. CCPA defines personal information to include information that “could reasonably be linked, directly or indirectly, with a particular consumer or household.” The word “household” is not found in GDPR. It implies that personal information does not have to be tied to a specific name or individual (think home address, home devices, geolocation data, home network IP addresses, and the like).
Resolving identities across hundreds of sources is a data processing and data quality nightmare. The vast majority of companies simply do not have the tooling in place to access and monitor the volume, variety, and velocity of personal data flowing in, out, and across their organizations.
Master data management to the rescue? Not so much.
Many medium and large enterprises have implemented master data management systems (MDM) to resolve identities and create a golden record for interacting with a customer. MDM and customer data platforms hold the promise of delivering a 360-degree view of the customer to improve sales, service, and growth.
However, “customer” is often defined in different ways across an enterprise and that definition does not always equate to an individual. Also, data subjects can look different across data sources and business scenarios because of:
- Middle initials and suffixes
- Maiden names
- Different email addresses, phone numbers, or postal addresses
- Address changes
- Typos in addresses or abbreviations
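To make the matching problem concrete, here is a minimal Python sketch of normalizing and comparing two records. It is an illustration only, not how any particular MDM product works: the helper names, the record fields, the suffix list, and the 0.85 threshold are all assumptions, and the fuzzy comparison uses `difflib.SequenceMatcher` from the standard library.

```python
import re
from difflib import SequenceMatcher

# Assumed list of suffixes to ignore when comparing names.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def normalize_name(name: str) -> str:
    """Lowercase, strip punctuation, and drop suffixes and single-letter initials."""
    tokens = re.sub(r"[^a-z\s]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES and len(t) > 1)


def normalize_phone(phone: str) -> str:
    """Keep digits only, so '(206) 555-0100' and '206.555.0100' compare equal."""
    return re.sub(r"\D", "", phone)


def name_similarity(a: str, b: str) -> float:
    """Similarity ratio between 0 and 1 on the normalized names."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def likely_same_subject(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Match on identical email, else on a similar name plus the same phone number."""
    if rec_a["email"].strip().lower() == rec_b["email"].strip().lower():
        return True
    same_phone = normalize_phone(rec_a["phone"]) == normalize_phone(rec_b["phone"])
    return same_phone and name_similarity(rec_a["name"], rec_b["name"]) >= threshold


crm = {"name": "John Q. Smith Jr.", "email": "jsmith@example.com", "phone": "(206) 555-0100"}
support = {"name": "Jon Smith", "email": "jon.smith@example.org", "phone": "206.555.0100"}
print(likely_same_subject(crm, support))
print(name_similarity("Jane Doe", "Jane Smith"))
```

Normalization handles initials, suffixes, and small typos, but a maiden name such as "Jane Doe" versus "Jane Smith" still scores well below the threshold, which is one reason simple rules break down across many sources.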
Even when companies build master data management processes, they typically identify only a few trusted sources from which to draw inputs, so personal data held in other systems never reaches the golden record.
And of course, not all personal data is tied to a user ID. Even without an ID the individual can still be identified in a data set. By simply mapping IDs to pre-existing metadata, the enterprise can run the risk of creating a false sense of security about the data it has, which security parameters are being applied, and whether it is in compliance with regulatory mandates.
Finally, while CCPA applies to California consumers only, GDPR applies to all data subject types such as customers, employees, vendors, and partners.
Privacy’s universal data subject view
MDM and other customer platforms and processes are not tuned to a ‘universal data subject’ view which includes multiple data subject types like customers, employees, and vendors.
The result is that companies have to either enhance their mastery of ‘the customer’ to include other data subject types or build a new identity resolution process. The business case for embarking on such projects has not been compelling up until CCPA and GDPR. Outside of privacy, there are few compelling business use cases that require consolidating massive amounts of personal information on different types of data subjects.
Enterprises simply do not have the intent or willingness to spend valuable resources and embark on an expensive journey to create a master data subject view.
How to Tackle Common DSAR Process Challenges
Integris DSR overcomes these challenges by automating the discovery and classification of personal information across all data sources (cloud, on-prem, streaming) and then providing deep search functionality to locate data subjects across thousands of systems.
Integris Software’s DSAR solution is fast, accurate, and follows the principles of privacy by design. The system also includes many industry firsts, including:
- PI Surface Area Reduction: An inventory risk assessment and remediation process that isolates systems that contain PI, then maps attributes, categories, purposes, and sources back to each data subject to reduce a company’s data exposure.
- DSAR Lifecycle Management: Provides easy-to-use DSAR lifecycle management for request intake, workflow management, and response generation. It also offers API integration so companies can combine it with existing intake tools and ITSM systems.
- Data Subject Validation: When a company receives a DSAR, it can easily confirm that a data subject exists within its data ecosystem to begin the workflow, and also learn whether the subject is classified under different personas (e.g. customer, employee, or vendor) through just-in-time ID search.
- Data Subject Deep Search: The first and only search feature that quickly identifies a data subject’s relevant PI (personal information) across all data sources, tables, and files. It utilizes machine learning and contextual awareness to operate at the data element level to discover PI, even if it’s not tied to a user ID.
- Remediation and Validation: Provides additional metadata and event orchestration support for end-to-end workflows (e.g. deletion) to deliver an audit trail with validation that demonstrates compliance. Combining sampling scans and deep search capabilities also enables companies to achieve faster DSAR defensibility.