The real cost of poor data management

I have often stated my belief that a typical organization expends approximately 30% of all its resources on avoidable data-related activities. This figure comes from a number of sources and some estimation on my part.

There are quite a few studies, research surveys and papers on the impact of bad customer data. These often focus on the impact of losing revenue (e.g. the Data Services Insight Report from Royal Mail – which concludes bad customer data costs organisations on average 6% of annual revenue).

However, this is potentially the tip of the iceberg in terms of the cost of bad data.

The first question to ask is this: if – as the Royal Mail study found – 6% of revenue is lost because of bad customer data, how much time is spent dealing with the issues that bad data created? We can imagine customer service calls and actions that would not have been necessary had the customer data been right. If 6% of revenue was lost, then it is likely that considerably more than 6% of customer service time was spent dealing with these issues (and avoiding an even greater loss of revenue by retaining some customers).

Looking organization-wide, and beyond customer data, the cost of bad data is also a substantial issue. Neil Patel’s article quotes Gartner’s figure of $14M per year as the average cost of bad data for the organizations surveyed, and highlights how this cost is likely to escalate. This figure largely reflects the cost of identifying and correcting the bad data (which Gartner estimates at 20% of all data). IBM estimates that poor-quality data costs the US economy $3.1 trillion per year – that’s around 15% of GDP.

In our ‘generic organization’, these costs manifest as ‘data quality programmes’ or ‘data cleanses’. I would argue that the above evidence leads us to conclude that somewhere in the order of 15% of resources is spent on data quality.

However, data quality is just one of the issues (and one of six data management areas, based on the CMMI model). The operational impact of having bad data may be significant, but so is the inefficiency caused by not being able to rely on well-governed data, and therefore not identifying and/or implementing efficiencies that should be possible, and lacking the confidence to make evidence-based decisions.

That impacts the opportunity to make a good decision, or the right decision, and even the ability to know when a decision is to be made. Poor decisions by definition leave something on the table – timeliness, quality, money, satisfaction or experience to name a few. Each of these requires resources to mitigate.

Consider, for example, bad data about a lift warranty that leads a landlord to contract a repair that should have been carried out under the warranty. This would incur the cost of the repair, and potentially an additional cost or remedial action to retain the warranty. There would also be the cost of the management time spent on all of this, and of dealing with the likely complaints from occupants about the unnecessarily extended time the lift was out of service.

That’s a big deal.

But bigger is the process and governance framework that allowed the data to be bad in the first place. Fixing that – in this case for example rethinking the development and construction process to clarify the data lifecycles and adopting supporting tech like BIM effectively – would not only avoid this costly warranty error, but provide new capabilities that would have significant impacts on other operations.

For example, simple and intuitive ways for residents to interact with building data (as will be required by the new housing safety legislation), or the ability to respond much faster to the industry’s next building safety challenge. These costs are currently borne on an ad hoc, reactive basis and would not be affected at all by addressing data quality in isolation.

I estimate costs of this kind to be in the same ballpark as the data-quality-related costs, which brings the total to roughly the 30% I quoted at the outset. I may be underestimating these costs, but I’m cautious not to make too large a claim, as significant investments are also required to achieve the higher levels of data management maturity needed to reap these rewards.
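To make the arithmetic behind the 30% figure explicit, here is a minimal sketch in Python. The two shares come from the estimates above (around 15% for data quality, and a similar amount for governance and decision-making costs); the function name and the 10 million budget are purely hypothetical and only there to illustrate the split.

```python
# Rough shares taken from the article's own estimates; neither is a measured value.
DATA_QUALITY_SHARE = 0.15   # "somewhere in the order of 15%" spent on data quality
GOVERNANCE_SHARE = 0.15     # assumed equal, since the article says "same ballpark"


def avoidable_data_cost(total_resources: float,
                        quality_share: float = DATA_QUALITY_SHARE,
                        governance_share: float = GOVERNANCE_SHARE) -> dict:
    """Split a resource budget into the two avoidable cost categories."""
    quality = total_resources * quality_share
    governance = total_resources * governance_share
    return {
        "data_quality": quality,
        "governance_and_decisions": governance,
        "total_avoidable": quality + governance,
        "avoidable_share": quality_share + governance_share,
    }


# Hypothetical organization with 10 million (in any currency) of annual resources.
estimate = avoidable_data_cost(10_000_000)
print(estimate)
```

Changing either share shows how sensitive the headline figure is to the second, less well-evidenced estimate.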

I would be interested to hear your thoughts on this too – get in touch at