Measure Data Quality Impacts to Your Business

David Kuketz | February 06, 2020 | 4 min read

The four (4) key data quality (DQ) dimensions are completeness, consistency, conformity, and consolidation (uniqueness). Data quality can be improved via cleansing (normalization, standardization), classification/coding, creation (enrichment, enhancement, construction), and consolidation (deduplication). Once you get data clean, you need to keep it clean. Perfect data equals perfect business (assuming people, process, and technology are not potential sources of failure).

A common way to measure DQ is to assess the proportion of records in a data set that are “clean” on a given dimension, based on the standards used to define “clean.” The definition of “clean” may vary across industries, companies, and data types (finance, vendor, customer, material, assets, …).

The perfect data score is 100%. As the number of dirty records goes up, the score decreases. Bad data causes poor business process execution, so the lower the DQ score, the more likely bad data is to impact your business.
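
As a rough illustration of the scoring idea above, the sketch below computes one dimension's score as the share of records that pass a "clean" rule. The function name `dimension_score`, the `is_complete` rule, and the sample vendor records are all hypothetical; your own standards define what "clean" means.

```python
from typing import Callable, Iterable


def dimension_score(records: Iterable[dict], is_clean: Callable[[dict], bool]) -> float:
    """Return the fraction of records that pass the `is_clean` rule (1.0 = perfect)."""
    records = list(records)
    if not records:
        raise ValueError("cannot score an empty data set")
    clean = sum(1 for record in records if is_clean(record))
    return clean / len(records)


# Hypothetical completeness rule: a vendor record is "clean" on this
# dimension when both its name and country fields are filled in.
def is_complete(record: dict) -> bool:
    return bool(record.get("name")) and bool(record.get("country"))


# Made-up sample data for illustration only.
vendors = [
    {"name": "Acme", "country": "US"},
    {"name": "Globex", "country": ""},
    {"name": "", "country": "DE"},
    {"name": "Initech", "country": "US"},
]

completeness = dimension_score(vendors, is_complete)
print(f"Completeness: {completeness:.0%}")  # prints "Completeness: 50%"
```

Two of the four sample records are complete, so the score is 50%. Each additional dirty record lowers it further.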

The value of data that unlocks your business potential can be thought of using the following formula:

    Volume of Data x Data Quality x Velocity (Usage) x Average Transaction Value


  • The volume of data is the number of records across all your systems involved in transactions
  • Velocity is the usage of the data or the number of transactions powered by the data per year
  • Transaction value is the cost associated with the supply chain, end-to-end, driven by the data
  • Data quality is the score that expresses the likelihood the data is clean
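
The formula above can be sketched in a few lines of Python. This is only an illustration of the multiplication; the function name `data_value` and every input number are hypothetical, not figures from any real business.

```python
def data_value(volume: int, data_quality: float, velocity: float,
               avg_transaction_value: float) -> float:
    """Volume of Data x Data Quality x Velocity (Usage) x Average Transaction Value."""
    if not 0.0 <= data_quality <= 1.0:
        raise ValueError("data_quality must be a score between 0.0 and 1.0")
    return volume * data_quality * velocity * avg_transaction_value


# Hypothetical inputs: 100,000 records, a 90% DQ score, a velocity of 4,
# and an average transaction value of 25.
as_is = data_value(100_000, 0.90, 4, 25.0)
perfect = data_value(100_000, 1.00, 4, 25.0)

print(f"Value at 90% quality:  {as_is:,.0f}")
print(f"Value at 100% quality: {perfect:,.0f}")
print(f"Value left locked by bad data: {perfect - as_is:,.0f}")
```

Holding the other three factors fixed, every point of data quality lost removes the same proportion of value, which is why the score matters more as volume, velocity, or transaction value grows.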

Do you know your data quality dimensions? Are they measured consistently? How are they measured? Are they monitored over time? Do standards, rules, and workflows enforce them? What is your overall total score for each material type, or business object, or data domain in your business? Examples are Materials, Vendors, Customers, Equipment, BOMs, Work Centers, Article Master, Financials, and so on. How does data quality impact your business locally and globally? How does it affect your constituents (partners, suppliers, customers)? Is data an asset or a liability?

One interpretation of the overall data quality score is to multiply the scores of the four (4) key data quality dimensions together. Treating the dimensions as independent, the product is the probability that a business transaction using a randomly chosen record from that data set is not impacted by bad data. If all four dimensions are perfect (100%), then 1 x 1 x 1 x 1 = 100% not impacted by bad data. If you still have business execution problems, at least data is not a source of failure.

The reason to have this knowledge is to improve data quality and thereby reduce the chance of increasing your cost of doing business. Three examples of Overall Data Quality (ODQ) are:

  • Three (3) quality dimensions are 100% perfect. Duplicates are 10%, so the quality score for duplicates is 90%. The overall quality is 100% x 100% x 100% x 90% = 90%. There is a 90% chance a transaction will not be negatively impacted by a randomly chosen record:
    1. That means 10% of the time there is a chance your business could be impacted; 10% of your transactions may be reprocessed or produce waste, such as excess inventory.
  • Four (4) quality dimensions are 90%. The overall quality is 90% x 90% x 90% x 90% = 66%. There is a 66% chance a transaction will not be negatively impacted by a randomly chosen record:
    1. That means 34% of the time there is a chance your business could be impacted.
  • The four (4) quality dimensions score 90%, 98%, 56%, and 77%, which yields an overall score of 90% x 98% x 56% x 77% = 38%:
    1. That means 62% of the time there is a chance your business could be impacted!
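
The three examples above can be reproduced with a short Python sketch. The function name `overall_data_quality` is hypothetical, and the calculation assumes the four dimension scores can simply be multiplied, as the interpretation above does.

```python
from math import prod


def overall_data_quality(dimension_scores) -> float:
    """Multiply per-dimension scores (each between 0.0 and 1.0) into one ODQ score."""
    scores = list(dimension_scores)
    if any(not 0.0 <= score <= 1.0 for score in scores):
        raise ValueError("each dimension score must be between 0.0 and 1.0")
    return prod(scores)


# The three scenarios from the list above.
examples = {
    "three perfect, duplicates at 90%": (1.00, 1.00, 1.00, 0.90),
    "all four at 90%": (0.90, 0.90, 0.90, 0.90),
    "mixed scores": (0.90, 0.98, 0.56, 0.77),
}

for label, scores in examples.items():
    odq = overall_data_quality(scores)
    print(f"{label}: ODQ {odq:.0%}, chance of impact {1 - odq:.0%}")
# three perfect, duplicates at 90%: ODQ 90%, chance of impact 10%
# all four at 90%: ODQ 66%, chance of impact 34%
# mixed scores: ODQ 38%, chance of impact 62%
```

Because the scores multiply, one weak dimension drags the overall score down even when the others look healthy.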

If my business had a 62% chance of being impacted, randomly, I would never sleep at night. One never knows when the impact happens, how severe that impact might be, or how much it is going to cost. Sometimes, in worst-case scenarios, it is life and limb – we never want that to happen.

It would be good to know whether your data quality reports indicate how many records have all four error types, three, two, or just one, and, of course, how many records have ZERO errors (excellence in data, perfect data). You want excellence across all your data. Why not eliminate data as a potential source of business execution failures?
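
A report like the one described above could be sketched as follows. This is a minimal illustration, assuming each record has already been checked and tagged with the dimensions it fails; the function name `error_type_histogram` and the sample audit data are hypothetical.

```python
from collections import Counter

DIMENSIONS = ("completeness", "consistency", "conformity", "consolidation")


def error_type_histogram(record_errors) -> dict:
    """Map the number of failed dimensions (0 to 4) to how many records fail that many."""
    counts = Counter(len(set(errors) & set(DIMENSIONS)) for errors in record_errors)
    return {n: counts.get(n, 0) for n in range(len(DIMENSIONS) + 1)}


# Made-up audit result: one entry per record, listing the dimensions it fails.
audit = [
    set(),
    {"completeness"},
    {"completeness", "conformity"},
    set(),
    {"consolidation"},
    {"completeness", "consistency", "conformity", "consolidation"},
]

histogram = error_type_histogram(audit)
for error_count, record_count in histogram.items():
    print(f"{error_count} error type(s): {record_count} record(s)")
```

The bucket for zero error types is the count of perfect records; the other buckets show where remediation effort is concentrated.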

It would be good to have an ODQ program with governance to help assure data excellence so that you could improve data quality and have a way to monitor and measure that improvement, as well as assess the reduction in business impacts over time. That means having a way to measure ROI of data as-is (now), and to-be (future), repeatedly, in a systematic way.

The point is, when a transaction is executed with bad data, one of the following may occur, costing you money and time. Depending on the severity of the incident, such as a health, safety, and environment (HSE) incident, the cost could be far more than just money and time:

  • Data Quality Error is Not Noticed
    1. Downstream business execution results in a big problem which eventually must be dealt with, like an injury incident, a regulatory non-compliance, wrong product to wrong customer at incorrect price at the wrong time to the wrong place, reshipping/restocking fees, rerun procurement process, inventory bloat, and so on; depending on how often this occurs, the overall cost impact is difficult to quantify
  • Error is Noticed, Fixed in Transaction
    1. The transaction cannot be automated, requires human intervention, which causes potentially costly delays and extra labor charges; hopefully, the original error record is repaired in the data system of record, so the problem doesn’t reoccur, but most often, the error is fixed within the transaction itself, not in the source data
    2. Side Effect: Non-automated transactions cannot take part in machine-to-machine communication and cannot be used for predictive maintenance or other highly automated “Intelligent Enterprise” styled S/4HANA-based initiatives; this means bad data is costing you money and preventing you from a successful transformation
  • Existing Data Cannot Be Found in System (a false negative: the data is there)
    1. Because the data is bad, a search for it returns no match and wastes time, so a duplicate record is likely created (to meet the immediate need to execute a transaction). That new record is itself a data error, which further reduces your data quality because it
      1. Creates a duplicate (consolidation error)
      2. Is most likely non-conforming, non-standardized, incomplete, etc., since there are probably no governance or data standards in place (which not only drives down your ODQ score further but also causes additional business impacts and higher IT costs for data storage and maintenance)
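
The false-negative search in the last case can be illustrated with a small sketch. The normalization rule in `match_key` (lowercase, strip punctuation, collapse whitespace) and the sample names are hypothetical; real matching standards are usually richer than this.

```python
import re


def match_key(name: str) -> str:
    """Hypothetical normalization: lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


# Made-up vendor names already stored in the system.
existing = ["ACME Corp.", "Globex  Inc"]
query = "Acme Corp"

# An exact search misses the record even though the vendor exists (false negative).
exact_hit = query in existing

# Comparing normalized keys finds it, so no duplicate needs to be created.
normalized_hit = match_key(query) in {match_key(name) for name in existing}

print(f"Exact match found: {exact_hit}")            # prints "Exact match found: False"
print(f"Normalized match found: {normalized_hit}")  # prints "Normalized match found: True"
```

Without an agreed standard for how names are stored and searched, the exact search fails and the user's fastest option is to create the duplicate.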

This is why Data Quality Monitoring and Remediation (uDQR), a Utopia add-on for SAP MDG, and a Data Health Assessment (DHA), a Utopia service, are important: they determine whether you have bad data and show bad data’s propensity to create costs, destroy value, and make work difficult.

Utopia initially combines DHA with Strategic Consulting to lay out a plan jointly with you to get the data clean and keep it clean, so you can start avoiding those costs. We want to help you create value and avoid costs, and we can help you calculate the value too. With perfect data, which is perfectly possible, you eliminate data as a potential source of business execution failure!

Look for another blog post about the cost of the hidden content factory in your company.

Contact the Utopia Strategic Value Management (SVM) team to learn more. Thank you.