Data Governance – Data quality: SEEOcta Project management data

SEEOcta Data: Data Governance – Data Quality

Rolf Holicki, Director Business Unit E-Invoicing/SAP&Web Process, SEEBURGER AG

For many companies, having a large pool of data to crunch is vital to their survival. This makes it all the more surprising that a number of companies have not developed organisational rules for handling it. Data governance is a suitable framework to use as the foundation for dealing with and managing this data for all involved. The greater and broader your data pool, the more important it is to have a secure, powerful and traceable way to make this information available. And to use this data efficiently, including as the basis for new, data-based business models, the data you feed into your system needs to be top quality.

Part of our SEEOcta series, this post looks at project management from a data perspective, more specifically, data governance. Further blog posts in this series from a data perspective look at seamless data exchange, big data and optical character recognition (OCR).

The SEEOcta blog series highlights the eight most important perspectives for successful project management. Discover all the areas you need to consider when planning digitalisation and integration projects in your company. Armed with the ideas and knowledge in the articles, you will have a solid foundation for planning your IT project and a guide to help you ensure that no one gets left behind.

Data Governance – Getting the flood of data under control

We live in a digital era. Data is becoming a commodity that can serve as the basis for a number of business models. The following examples illustrate just how important data has become in our everyday lives. The amount of data created worldwide is forecast to reach 175 zettabytes (1.75 × 10^23 bytes) by 2025[1]. Compared with the amount of data that existed in 2016, this is a tenfold increase within nine years, to the equivalent of 175 billion 1 TB hard drives. Should you try to save these 175 ZB of data onto DVDs, the discs would form a pile 23 times as high as the distance between the earth and the moon[2]. In 2020, the amount of data created, shared and consumed worldwide had a volume of around 40 zettabytes. This is 50 times higher than four years beforehand and is estimated to be 57 times higher than the number of grains of sand on all the beaches in the world. What is more, the worldwide data volume is expected to double every two years[3].

But where is all this data coming from, and how can we keep it under control? Due to the steadily increasing worldwide networking of people, machines and companies, a huge amount of data on customers, suppliers, products, processes, machines and staff is being generated and saved for various purposes. The issue is that only a few companies have a strong corporate data governance policy. This means that despite all the potential in this data, often only a fraction is used well. And if that weren’t bad enough, a large proportion of this data is of poor quality. This means that, at best, companies are merely wasting time having to correct a number of errors. At worst, there are dissatisfied customers, expensive inventory errors and failed IT projects.

  • Errors in the database cause errors in the reports they generate;
  • A lack of confidence in the data leads to bad decision making;
  • Opportunities are lost due to out-of-date or incomprehensible data.

In 2016, IBM estimated the annual cost of poor data quality at 3.1 trillion US dollars in the US alone.[4]

What is data governance?

You can counter the above with intelligent data governance. In the study “How To Rule Your Data World”, published by the Business Application Research Center (BARC) in November 2018[5], data governance is defined as follows:

  • “Data Governance refers to the individuals, processes and technology required to manage and protect enterprise data assets. Its goal is to ensure interpretability, correctness, completeness, trustworthiness, security, accessibility and traceability of enterprise data in an efficient and effective manner.”[6]

Figure 1: Errors made in corporate data governance (Source: Mit Data Governance Risiken minimieren – b.telligent)

The image above covers a number of issues currently found in companies. Only once data is clearly traceable from source to analytics (of whatever kind) with a full log tracking all changes and movements, can a company say with confidence whether their information is correct.

  • Data governance provides an organisational framework for dealing with data, defines roles and areas of responsibility and supports data use in an organisation, while also laying down the rules of play. [7]

Asked in the BARC study about the benefits of implementing data governance measures, the majority of respondents (53%) said this had led to enhanced decision-making support and a unified understanding of data. Furthermore, 47% said that establishing a data governance policy had helped create the conditions for data-driven work and for becoming a digital company.

Figure 2: Top 6 benefits respondents named when asked what implementing data governance measures had achieved (n=83) (Source: How To Rule your Data World, BARC, November 2018, Page 7)

Data Governance – Only clean data is good data

The goal of data governance is to improve the quality of data and to maintain it at a high level. In order to achieve this, we need to assess how accurate, complete and up-to-date the data is. Depending on how the data is used, it may also be necessary to look at aspects such as how easy the data is to process and access.
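
As a rough illustration of how two of these criteria, completeness and timeliness, might be measured, here is a minimal Python sketch. The records, field names and the one-year threshold are assumptions made up for this example. Accuracy is left out, as checking it requires a trusted reference source to compare against.

```python
from datetime import date

# Hypothetical customer records; None marks a value that was never entered.
records = [
    {"name": "Anna Schmidt", "email": "anna@example.com", "updated": date(2024, 5, 1)},
    {"name": "Ben Maier", "email": None, "updated": date(2021, 1, 15)},
    {"name": None, "email": "info@example.org", "updated": date(2024, 3, 20)},
]

def completeness(rows, fields):
    """Share of required field values that are actually filled in (0.0 to 1.0)."""
    total = len(rows) * len(fields)
    filled = sum(1 for row in rows for f in fields if row.get(f) not in (None, ""))
    return filled / total if total else 1.0

def stale_share(rows, today, max_age_days):
    """Share of records whose last update is older than max_age_days."""
    stale = sum(1 for row in rows if (today - row["updated"]).days > max_age_days)
    return stale / len(rows) if rows else 0.0

score = completeness(records, ["name", "email"])
stale = stale_share(records, today=date(2024, 6, 1), max_age_days=365)
print(f"completeness: {score:.2f}, stale: {stale:.2f}")
```

Tracking figures like these over time is one simple way to see whether data quality is being maintained or is slipping.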

A critical point for the quality of data is the data entry point, where data originates in an organisation. The better those working in data entry understand the bigger picture – the context in which the data is used – the more careful they are likely to be that the data entered is accurate and complete.

  • A high data quality in customer relationship management (CRM), in which customer details such as title, e-mail address, function, etc, are entered accurately, ensures that marketing campaigns run smoothly and the right customer or company is sent the right information.

However, data is not just entered manually. It is also captured digitally from external sources, and needs to be stored in the right place for the right purposes, so it can be processed correctly. Sources of errors therefore need to be recognised early and fixed. In practice, data quality can be defined by a number of characteristics. These are summarised in the figure below[8]:

Figure 3: Data Governance – Selecting and classifying characteristics determining data quality (Source: Detail – Gesellschaft für Informatik e.V. (gi.de))

The following three culprits are often behind poor data quality:

  • Incomplete data
  • Inconsistent data
  • Incorrect data

These issues can arise for a number of reasons. If we take a closer look at a typical database, there are many ways in which data can be entered incompletely, inconsistently or even incorrectly:

Figure 4: Data Governance – Bad data in a database[9]
  • Representation
  • Contradiction
  • Referential integrity
  • Non-unique value
  • Typo
  • Wrong value
  • Missing values (empty cells)
  • Duplicate value
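
Several of the defect types listed above can be detected mechanically. The following Python sketch shows one possible approach for missing values, non-unique keys, duplicate records and broken referential integrity. The two tables and all field names are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Hypothetical tables: customers keyed by "id", orders pointing to customers.
customers = [
    {"id": 1, "name": "Anna Schmidt", "city": "Berlin"},
    {"id": 2, "name": "Ben Maier", "city": None},        # missing value
    {"id": 2, "name": "Carla Weber", "city": "Hamburg"},  # non-unique id
    {"id": 3, "name": "Anna Schmidt", "city": "Berlin"},  # duplicate content
]
orders = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 9},  # refers to a customer that does not exist
]

def missing_values(rows):
    """Return (row index, field) pairs for empty cells."""
    return [(i, k) for i, row in enumerate(rows) for k, v in row.items() if v in (None, "")]

def non_unique_keys(rows, key):
    """Return key values that occur more than once."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)

def duplicate_rows(rows, ignore):
    """Return indexes of rows whose content (ignoring some fields) repeats an earlier row."""
    seen, dupes = set(), []
    for i, row in enumerate(rows):
        fingerprint = tuple(sorted((k, v) for k, v in row.items() if k not in ignore))
        if fingerprint in seen:
            dupes.append(i)
        seen.add(fingerprint)
    return dupes

def broken_references(children, parents, fk, pk):
    """Return child rows whose foreign key matches no parent row."""
    known = {row[pk] for row in parents}
    return [row for row in children if row[fk] not in known]

print(missing_values(customers))
print(non_unique_keys(customers, "id"))
print(duplicate_rows(customers, ignore={"id"}))
print(broken_references(orders, customers, "customer_id", "id"))
```

Typos, wrong values and contradictions are harder to catch this way, as they usually require reference data or business rules to compare against.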

Automate data governance measures to ensure and maintain your data quality

Good B2B communication platforms can safeguard the quality of your data by running automated validation checks on it as soon as it is received. These checks need to be repeated at every subsequent change so that the data keeps its integrity when used as a basis for decision making. By setting rules and constraints, you can require certain fields to contain certain types of value in specified formats. The system can then run a data-type validation check to ensure that a field requiring a date has a value in the format DD/MM/YYYY. It can run a range and constraint check to ensure that a field requiring a German postal code contains only values in the range 00000 to 99999, and it can cross-reference data such as postal code, telephone area code and town or city to pick up any obvious errors.

These validation rules and checks can significantly improve the quality of the data being entered. By checking whether data being entered matches the pattern set by the rule, bad, false or incomplete data can be filtered out right at the beginning.
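
The three kinds of check described above could look roughly like this in Python. This is a minimal sketch, not how any particular platform implements validation. The record fields (`invoice_date`, `postcode`, `city`) and the tiny lookup table are assumptions for illustration; a real system would use a complete postal code directory.

```python
import re
from datetime import datetime

# Hypothetical lookup table used for the cross-reference check.
POSTCODE_TO_CITY = {"10115": "Berlin", "20095": "Hamburg"}

def is_valid_date(value):
    """Data-type check: value must be a real calendar date in DD/MM/YYYY format."""
    if not re.fullmatch(r"[0-9]{2}/[0-9]{2}/[0-9]{4}", value):
        return False
    try:
        datetime.strptime(value, "%d/%m/%Y")
    except ValueError:
        return False  # e.g. 31/02/2024 has the right shape but is not a real date
    return True

def is_valid_postcode(value):
    """Range and constraint check: exactly five digits, i.e. 00000 to 99999."""
    return re.fullmatch(r"[0-9]{5}", value) is not None

def validate(record):
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    if not is_valid_date(record.get("invoice_date", "")):
        errors.append("invoice_date: expected DD/MM/YYYY")
    postcode = record.get("postcode", "")
    if not is_valid_postcode(postcode):
        errors.append("postcode: expected five digits")
    elif postcode in POSTCODE_TO_CITY and POSTCODE_TO_CITY[postcode] != record.get("city"):
        # Cross-reference check: postal code and city must belong together.
        errors.append("city: does not match postcode")
    return errors

good = {"invoice_date": "31/01/2024", "postcode": "10115", "city": "Berlin"}
bad = {"invoice_date": "31/02/2024", "postcode": "20095", "city": "Berlin"}
print(validate(good))
print(validate(bad))
```

Returning a list of violations rather than a single pass/fail flag lets the receiving system report every problem in a record at once, so that the sender can correct them in one go.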

Figure 5: Data Governance – Secure data quality by setting constraints

This clean data is then passed along the various points in the business process. If data needs to be sent to trade partners, you can also run validation checks at this point, just before the data leaves the company. Intelligent programming can even ensure that duplicate records, which arise when datasets are merged or combined to get a more complete or reliable picture, are removed.
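
Removing duplicates when datasets are combined can be sketched as follows. This is only one simple strategy, assuming that two records describe the same entity when their key fields match after normalising case and whitespace. The datasets and field names are hypothetical.

```python
def normalise(value):
    """Lower-case the value and collapse whitespace so near-identical entries compare equal."""
    return " ".join(value.lower().split())

def merge_without_duplicates(*datasets, key_fields):
    """Merge datasets, keeping the first record seen for each normalised key."""
    merged = {}
    for dataset in datasets:
        for record in dataset:
            key = tuple(normalise(record[f]) for f in key_fields)
            merged.setdefault(key, record)  # later duplicates are ignored
    return list(merged.values())

# Hypothetical extracts from two systems that overlap in one customer.
crm = [{"name": "Anna Schmidt", "email": "anna@example.com"}]
erp = [
    {"name": "anna  schmidt", "email": "ANNA@example.com"},
    {"name": "Ben Maier", "email": "ben@example.com"},
]

combined = merge_without_duplicates(crm, erp, key_fields=("name", "email"))
print(combined)
```

Keeping the first record seen is a deliberate simplification. In practice you might instead prefer the most recently updated record, or merge the fields of both.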

Conclusion

No organisation can afford to ignore the issue of data governance. However, it shouldn’t be seen as a single measure, or even a project. Rather, data governance is a genuine opportunity to increase the value you get from your data. By increasing the transparency over your data, you are able to use it to further develop business models. Ensuring that your data is high quality is central to the success of your organisation as, without being able to intelligently access and use good quality data, you are no longer in a position to meet the customer and market demands of today.

SEEBURGER’s Business Integration Suite is an extremely comprehensive platform which gives you the tools to stay ahead of digital change. It can also support you in your company’s data governance measures.

This post is part of the SEEOcta series. In the blog category SEEOcta you will find all of the collected posts of this series related to the introduction of a new IT project.

[1] https://de.statista.com/statistik/daten/studie/267974/umfrage/prognose-zum-weltweit-generierten-datenvolumen/

[2] https://blog.wiwo.de/look-at-it/2018/11/27/weltweite-datenmengen-sollen-bis-2025-auf-175-zetabyte-wachsen-8-mal-so-viel-wie-2017/

[3] https://www.welt.de/wirtschaft/webwelt/article118099520/Datenvolumen-verdoppelt-sich-alle-zwei-Jahre.html

[4] https://hbr.org/2016/09/bad-data-costs-the-u-s-3-trillion-per-year

[5] Download Now – Erreichung einer bisher ungeahnten Performance mit vernetzter Fertigung und Echtzeitfähigkeiten (sas.com)

[6] How To Rule Your Data World – Copyright CXP Group 2018, Page 3

[7] Mit Data Governance Risiken minimieren – b.telligent

[8] https://gi.de/informatiklexikon/datenqualitaet

[9] Figure 4 is based on the white paper „Fünf Schritte zu hochwertigeren Unternehmensdaten“, Eberle GmbH, 2013, p. 3

Written by:

Rolf Holicki, Director BU E-Invoicing, SAP&Web Process, is responsible for the SAP/WEB applications. He has more than 25 years of experience in e-invoicing, SAP, Workflow and business process automation. Rolf Holicki has been with SEEBURGER since 2005.