BP Perspective Insights from a Business Partner

Why Data Discovery Should Be the Priority for Law Firm IT Staff

Anyone doubting whether data is the lifeblood of today’s law firms should try to lead a new system upgrade or conversion project. That will quickly reveal the importance of knowing the state of a firm’s data — and how haphazard data management practices can be. 

Bim Dave

Even if a firm is not contemplating any significant system changes, data discovery should be the No. 1 quality assurance task carried out each year by IT staff. Not only is it essential if a firm decides to migrate to other platforms and systems, but it can also offer unique business transformation opportunities.

Done correctly, data discovery allows a firm to see its data holistically and determine how current processes affect the bottom line. That permits leadership to make management decisions based on accurate information about current and future operations. Unfortunately, firms rarely take data discovery seriously as a routine part of IT operations.


Data discovery determines the current state of a firm’s data management — the good, the bad and the ugly. The mechanics can vary from laborious manual review to an automated process. Regardless, data discovery establishes a baseline for comparison between how the firm’s data is intended to be recorded, stored and used and how it is actually recorded, stored and used.

This benchmarking is crucial for firm management because leaders get a bird’s-eye view of the firm’s data and its quality. That lets them ask high-level questions such as:

  • Does current data usage align with expectations?
  • Is data usage supporting our business goals?
  • Are there unexpected or hidden roadblocks hindering daily functionality?
  • Are downstream and upstream systems being impacted negatively by poor-quality data feeds?

This “map” of a firm’s current data environment provides a trove of valuable information executives can leverage to create a relevant and future-focused technology infrastructure.


Data profiling, the process of examining, analyzing and summarizing data from an existing source, is the most crucial part of data discovery. It addresses the fact that data quality erodes over time, with each slight anomaly contributing to a larger web of inconsistencies. Left unchecked, this degrades data, lengthens processing times and bogs down systems.

Because data degradation often happens without a severe impact on workflow, a firm will keep operating as usual — but everything it does will be based on faulty data.

With data profiling, firms can spot quality issues quickly. A report will show, for example:

  • How data is distributed in core tables
  • Where bad dates, incomplete records or orphan records are hiding
  • Where and how data entry protocols differ
  • How setup tables are syncing to actual data

With this report in hand, decision-makers can determine next steps: Which data can be purged? Are all the data fields still pertinent? Do the setup tables (such as ledger codes) need adjusting? Not only do these decisions improve data quality, but they also boost data relevance, making everything from time entry to billing concise, efficient and accurate.
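To make the idea concrete, here is a minimal sketch of what a profiling pass might compute. Everything in it is an assumption for illustration: the matter records, the field names (`matter_id`, `client_id`, `ledger_code`, `opened`) and the `profile` helper are hypothetical and do not come from any particular practice management system.

```python
from collections import Counter
from datetime import date

# Hypothetical sample data; field names and values are illustrative only.
MATTERS = [
    {"matter_id": "M1", "client_id": "C1", "ledger_code": "FEE", "opened": "2021-03-15"},
    {"matter_id": "M2", "client_id": "C2", "ledger_code": "fee", "opened": "2021-02-30"},
    {"matter_id": "M3", "client_id": "C9", "ledger_code": "EXP", "opened": "2020-11-01"},
    {"matter_id": "M4", "client_id": "C1", "ledger_code": "", "opened": ""},
]
CLIENT_IDS = {"C1", "C2"}             # assumed contents of the client table
LEDGER_CODES = {"FEE", "EXP", "TAX"}  # assumed contents of a setup table


def is_valid_date(text):
    """Return True if text is a real calendar date in YYYY-MM-DD form."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


def profile(matters, client_ids, ledger_codes):
    """Summarize distribution, bad dates, gaps, orphans and setup-table drift."""
    required = ("matter_id", "client_id", "ledger_code", "opened")
    used_codes = {m["ledger_code"] for m in matters if m["ledger_code"]}
    return {
        # How data is distributed in a core table
        "ledger_code_distribution": Counter(m["ledger_code"] for m in matters),
        # Dates that are present but not valid
        "bad_dates": [m["matter_id"] for m in matters
                      if m["opened"] and not is_valid_date(m["opened"])],
        # Records with at least one empty required field
        "incomplete": [m["matter_id"] for m in matters
                       if any(not m[f] for f in required)],
        # Matters pointing at a client that does not exist
        "orphans": [m["matter_id"] for m in matters
                    if m["client_id"] not in client_ids],
        # How the setup table lines up with actual data
        "codes_not_in_setup": sorted(used_codes - ledger_codes),
        "setup_codes_unused": sorted(ledger_codes - used_codes),
    }


report = profile(MATTERS, CLIENT_IDS, LEDGER_CODES)
for name, finding in report.items():
    print(name, finding)
```

In this sample, the lowercase `fee` shows up as a code that is not in the setup table, which is the kind of data entry drift a profiling report is meant to surface. A real profiling tool would run equivalent checks against the firm's actual tables rather than an in-memory list.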


A functioning contact management system is crucial for client service, yet contact data is probably some of the most neglected data in a firm, rife with duplicates and incomplete records. Done well, data discovery makes cleaning address data a manageable, efficient process.

First, it uncovers common inconsistencies and reports them in an actionable format using data profiling. A typical report can show:

  • How many records have missing or invalid postal codes
  • Discrepancies in state indicators (spelled out versus abbreviated)
  • How many records have names in one field versus two fields
  • How many times the same surname is listed and whether the listings are duplicates
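The checks in that list could be sketched roughly as follows. The contact records, the field names (`first`, `last`, `state`, `postal`) and the `profile_contacts` helper are hypothetical, and the postal code pattern assumes US ZIP or ZIP+4 formats only.

```python
import re
from collections import Counter

# Hypothetical contact records; field names are assumptions for illustration.
CONTACTS = [
    {"first": "Ann", "last": "Lee", "state": "NY", "postal": "10001"},
    {"first": "", "last": "Ann Lee", "state": "New York", "postal": ""},
    {"first": "Raj", "last": "Patel", "state": "ca", "postal": "9O210"},  # letter O, not zero
    {"first": "Bo", "last": "Lee", "state": "TX", "postal": "73301-0001"},
]

# Assumption: US ZIP (5 digits) or ZIP+4 only.
US_ZIP = re.compile(r"\d{5}(-\d{4})?")


def profile_contacts(contacts):
    """Count the address-data inconsistencies described in the list above."""
    postals = [c["postal"].strip() for c in contacts]
    surnames = Counter(
        c["last"].strip().split()[-1].lower()
        for c in contacts if c["last"].strip()
    )
    return {
        "missing_postal": sum(1 for p in postals if not p),
        "invalid_postal": sum(1 for p in postals if p and not US_ZIP.fullmatch(p)),
        # Two-character values are treated as abbreviations, anything else as spelled out
        "state_styles": Counter(
            "abbreviated" if len(c["state"].strip()) == 2 else "spelled_out"
            for c in contacts
        ),
        # Full name typed into the surname field, first-name field left empty
        "names_in_one_field": sum(
            1 for c in contacts
            if not c["first"].strip() and " " in c["last"].strip()
        ),
        # Surnames seen more than once: candidates for duplicate review
        "repeated_surnames": {s: n for s, n in surnames.items() if n > 1},
    }


for name, finding in profile_contacts(CONTACTS).items():
    print(name, finding)
```

A repeated surname is only a candidate for review. Deciding whether two listings are true duplicates still takes a closer comparison of the full records.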

Armed with the data profiling report, tech teams can see what cleanup work needs to be done and where they need to do it. Their next step is to parse the data, which puts it into a consistent format that is easy to work with. Once parsed, irrelevant data can be purged, field structure changes can be implemented (for example, states always abbreviated) and data entry protocols can be established (such as always splitting names into two fields).
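As an illustration of the parsing step, the sketch below applies the two example protocols mentioned above: states always abbreviated and names always split into two fields. The lookup table, field names and helper functions are hypothetical, and the name split is deliberately naive.

```python
# Assumption: a tiny lookup for illustration; a real one would cover every state.
STATE_ABBREVIATIONS = {"new york": "NY", "california": "CA", "texas": "TX"}


def normalize_state(value):
    """Return a two-letter abbreviation where one can be determined."""
    cleaned = value.strip()
    if len(cleaned) == 2 and cleaned.isalpha():
        return cleaned.upper()
    # Unknown values are returned unchanged so a person can review them.
    return STATE_ABBREVIATIONS.get(cleaned.lower(), cleaned)


def split_name(first, last):
    """Move a full name typed into the surname field into two fields."""
    first, last = first.strip(), last.strip()
    if not first and " " in last:
        # Naive rule: everything before the final space is the first name.
        first, _, last = last.rpartition(" ")
    return first, last


def parse_contact(contact):
    """Return a copy of the contact in the consistent target format."""
    first, last = split_name(contact["first"], contact["last"])
    return {**contact, "first": first, "last": last,
            "state": normalize_state(contact["state"])}


raw = {"first": "", "last": "Ann Lee", "state": "New York", "postal": ""}
print(parse_contact(raw))
```

The split on the final space will mishandle multi-word surnames, so records changed by a rule like this are worth sampling by hand before the results are written back.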

Cleaning address data is one of many use cases for data discovery. While it can be a laborious process, its benefits extend across the firm. A regular data reality check adds value to a firm’s business, client relationships and, ultimately, the bottom line. It’s also one of the best ways to maximize return from technology investments made to enhance a firm’s competitive edge.