Finding the Needle in the IT Haystack: Locating Data Across Disparate Systems
Jerisaliant
Author
The Data Fragmentation Problem
Modern enterprises store personal data across dozens or even hundreds of systems: CRMs, ERPs, HRIS platforms, email servers, file shares, cloud storage, SaaS applications, data warehouses, backup tapes, and more. When a data subject access request (DSAR) arrives, you must search every system that is in scope. Missing a single one could mean an incomplete response, and with it a compliance violation.
Research indicates that the average enterprise uses over 130 SaaS applications, and that personal data can reside in both structured databases and unstructured formats (emails, documents, chat logs, images). The cost and time required to search all of these sources are among the primary drivers of DSAR processing expense.
Step 1: Data Mapping
You cannot search what you have not mapped. A data map is the foundation of DSAR fulfillment, and it overlaps heavily with the Record of Processing Activities required by GDPR Article 30. At a minimum, it should capture:
- System inventory: List every system that stores or processes personal data.
- Data categories per system: What types of personal data does each system hold?
- Data subject types: Does the system hold customer data, employee data, vendor contact data, etc.?
- Search capabilities: Can the system be searched by individual? What identifiers are available (email, name, account ID)?
- Data stewards: Who is responsible for extracting data from each system?
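The inventory above can be kept as structured data so that it can be queried when a request arrives. The following is a minimal sketch, not a prescribed schema: the `SystemRecord` fields, the helper functions, and the sample entries are all hypothetical names chosen to mirror the five bullet points.

```python
from dataclasses import dataclass


@dataclass
class SystemRecord:
    """One entry in the data map. All field names are illustrative."""
    name: str
    data_categories: list[str]         # what personal data the system holds
    subject_types: list[str]           # whose data it holds
    searchable_identifiers: list[str]  # empty list means manual search only
    steward: str                       # who extracts data from this system


def systems_in_scope(data_map, subject_type):
    """Return the systems that hold data about this type of data subject."""
    return [s for s in data_map if subject_type in s.subject_types]


def systems_missing_identifier(data_map, identifier):
    """Return the systems that cannot be searched by the given identifier."""
    return [s for s in data_map if identifier not in s.searchable_identifiers]


# Sample entries (made up for illustration).
DATA_MAP = [
    SystemRecord("crm", ["contact details", "purchase history"],
                 ["customer"], ["email", "account_id"], "sales-ops"),
    SystemRecord("hris", ["payroll", "performance reviews"],
                 ["employee"], ["email", "employee_id"], "hr-systems"),
    SystemRecord("file_share", ["documents"],
                 ["customer", "employee"], [], "it-infrastructure"),
]

for system in systems_in_scope(DATA_MAP, "customer"):
    print(system.name, "->", system.steward)
```

Listing the systems that lack a usable identifier (here, `systems_missing_identifier(DATA_MAP, "email")`) is a quick way to see where a manual search will have to be arranged with the data steward.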
Step 2: Prioritize Systems by Risk
Not every system search is equally important. Prioritize based on:
- Systems holding the most personal data about the individual
- Systems holding sensitive data categories
- Systems likely to be specifically relevant to the request context
- Systems that support automated search, as opposed to those that can only be searched manually
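One way to turn these criteria into a search order is a simple weighted score. This is only a sketch: the weights, the 0 to 3 volume scale, and the choice to rank automated systems slightly higher (because they are quick to clear) are assumptions to adjust against your own risk assessment.

```python
# Hypothetical weights; tune them to your own risk assessment.
WEIGHTS = {"volume": 2, "sensitive": 3, "relevant": 2, "automated": 1}


def priority_score(system):
    """Score one system; a higher score means it is searched earlier."""
    score = WEIGHTS["volume"] * system["volume"]  # 0 (little data) to 3 (a lot)
    score += WEIGHTS["sensitive"] * int(system["sensitive"])
    score += WEIGHTS["relevant"] * int(system["relevant"])
    score += WEIGHTS["automated"] * int(system["automated"])
    return score


def search_order(systems):
    """Return the systems sorted from highest to lowest priority."""
    return sorted(systems, key=priority_score, reverse=True)


# Sample systems (made up for illustration).
systems = [
    {"name": "file_share", "volume": 1, "sensitive": False,
     "relevant": False, "automated": False},
    {"name": "hris", "volume": 2, "sensitive": True,
     "relevant": False, "automated": True},
    {"name": "crm", "volume": 3, "sensitive": False,
     "relevant": True, "automated": True},
]

for system in search_order(systems):
    print(system["name"], priority_score(system))
```

A low score only moves a system later in the queue. It does not remove the system from the search.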
Step 3: Execute the Search
Structured Systems
Databases, CRMs, ERPs, and SaaS platforms typically support querying by identifier (email, user ID, name). Export the relevant records and compile them.
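As an illustration of querying by identifier, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `customers` table, its columns, and the sample rows are assumptions; a real CRM or ERP would be reached through its own API or database driver.

```python
import sqlite3


def find_customer_records(conn, email):
    """Return all customer rows whose email matches, ignoring case."""
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        # Parameterized query: the identifier is never spliced into the SQL.
        "SELECT id, name, email FROM customers WHERE lower(email) = lower(?)",
        (email,),
    ).fetchall()
    return [dict(row) for row in rows]


# Stand-in for a real system: an in-memory table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO customers (name, email) VALUES (?, ?)",
    [("Ada Example", "Ada@example.com"), ("Bob Example", "bob@example.com")],
)

records = find_customer_records(conn, "ada@example.com")
print(records)
```

The comparison is case-insensitive because the same address is often stored with different capitalization in different systems.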
Unstructured Data
This is where complexity explodes. Email searches, file share searches, chat log searches, and document management system queries all require different tools and approaches. Key strategies:
- Use eDiscovery tools for email and document searches.
- Search by multiple identifiers (email, name variations, phone number).
- Include shared drives and personal drives if they are within scope.
- Do not forget archived and backup data, though proportionality may limit the obligation to search backups.
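Searching by multiple identifiers can be sketched with a single case-insensitive pattern built from every known variation. This is a simplified stand-in for what an eDiscovery tool does, not a replacement for one; the sample documents and identifiers are made up.

```python
import re


def find_matches(documents, identifiers):
    """Map each document ID to the identifiers found in its text."""
    if not identifiers:
        raise ValueError("at least one identifier is required")
    # Escape each identifier so "." and "+" match literally, and try longer
    # identifiers first so a short one cannot shadow a longer one.
    ordered = sorted(identifiers, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(i) for i in ordered), re.IGNORECASE)

    hits = {}
    for doc_id, text in documents.items():
        found = sorted({m.group(0).lower() for m in pattern.finditer(text)})
        if found:
            hits[doc_id] = found
    return hits


# Sample unstructured content (made up for illustration).
documents = {
    "email-001": "Hi Ada, your invoice was sent to ada@example.com.",
    "chat-017": "Call A. Example back on +1 555 0100 tomorrow.",
    "memo-042": "Quarterly planning notes, no personal data.",
}
identifiers = ["ada@example.com", "Ada Example", "A. Example", "+1 555 0100"]

for doc_id, found in find_matches(documents, identifiers).items():
    print(doc_id, found)
```

Exact matching like this misses misspellings and finds namesakes, so the hits still need human review before anything is disclosed.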
Step 4: Compile and Deduplicate
After collecting data from all sources, compile the results and remove duplicates. Present the data to the data subject in a clear, organized format, grouped by source or by data type.
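Deduplication can be sketched by fingerprinting each record after light normalization and merging records that share a fingerprint. The normalization rule here (trim and lowercase string values) and the `source` and `sources` field names are assumptions; real data usually needs field-specific rules.

```python
import hashlib
import json


def record_fingerprint(record):
    """Hash a record's content, ignoring which system it came from."""
    normalised = {
        key: value.strip().lower() if isinstance(value, str) else value
        for key, value in record.items()
        if key != "source"
    }
    payload = json.dumps(normalised, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Merge identical records, keeping a list of every source system."""
    merged = {}
    for record in records:
        fingerprint = record_fingerprint(record)
        if fingerprint in merged:
            merged[fingerprint]["sources"].append(record["source"])
        else:
            entry = {k: v for k, v in record.items() if k != "source"}
            entry["sources"] = [record["source"]]
            merged[fingerprint] = entry
    return list(merged.values())


# Sample search results (made up for illustration).
collected = [
    {"source": "crm", "email": "ada@example.com", "name": "Ada Example"},
    {"source": "billing", "email": "Ada@Example.com ", "name": "Ada Example"},
    {"source": "hris", "email": "ada@example.com", "name": "Ada Example",
     "role": "Engineer"},
]

for entry in deduplicate(collected):
    print(entry)
```

Keeping the list of sources on each merged record preserves the information needed to group the final response by system.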
Automated Data Discovery
For organizations receiving significant DSAR volumes, automated discovery tools connect to your data sources via APIs and run coordinated searches across all systems from a single interface. This dramatically reduces the time and cost per DSAR while improving consistency.
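The general shape of a coordinated search can be sketched as a fan-out over per-system connector functions. This is a generic illustration and not the API of any particular product: the connector functions are hypothetical stubs, and a real connector would call the system's own API.

```python
from concurrent.futures import ThreadPoolExecutor


def run_discovery(connectors, identifier, timeout_seconds=30):
    """Search every connector in parallel; return (results, failures)."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {
            name: pool.submit(search, identifier)
            for name, search in connectors.items()
        }
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout_seconds)
            except Exception as exc:
                # Record the failure instead of silently skipping the system.
                failures[name] = str(exc)
    return results, failures


# Hypothetical stub connectors standing in for real API integrations.
def search_crm(identifier):
    return [{"email": identifier, "plan": "pro"}]


def search_chat(identifier):
    raise TimeoutError("chat export API did not respond")


connectors = {"crm": search_crm, "chat": search_chat}
results, failures = run_discovery(connectors, "ada@example.com")
print("found:", results)
print("needs follow-up:", failures)
```

Reporting failures separately matters because a system that could not be searched is exactly the kind of gap that leads to an incomplete response.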
Jerisaliant's DSAR module integrates with common enterprise systems for automated personal data discovery, supporting both structured and unstructured data sources through configurable connectors.