Managing the Data Lifecycle
Prevent data issues with a unified Business Information Model
A lack of uniform semantics could lead to misinterpretations of data. This does not mean that the quality of the data is an issue; it could still be accurate, complete, consistent and timely. To enable effective Data Quality Management focused on ‘pure’ data quality issues, it is essential to align business semantics through a unified model.
Misinterpretations by lack of uniform semantics
Imagine being the HR director at the start of the yearly forecasting and budgeting cycle. The overall recruiting budget falls within your responsibility, and to determine it accurately you first need the current number of employees within the organisation. The various departments are asked to deliver this data so that HR can consolidate it into one overall, reportable number. During data collection, significant discrepancies with last year’s numbers are found, leading the HR director to assume that the quality of the data is an issue. A subsequent root-cause analysis shows that the data is actually correct. However, the business term ‘number of employees’ was interpreted as headcount by some departments and as full-time equivalents (FTE) by others, which led to different numbers across the departments.
The example above illustrates that a lack of uniform semantics could lead to misinterpretations of data. This does not mean that the quality of the data is an issue; it could still be accurate, complete, consistent and timely.
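The gap between the two interpretations can be made concrete with a small calculation. The sketch below is illustrative only: the `Employee` class, the `contract_fraction` field and the sample department are assumptions made for this example, not data from the case described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    name: str
    contract_fraction: float  # assumed convention: 1.0 = full-time, 0.5 = half-time


def headcount(staff):
    """Interpretation 1: every person counts as one employee."""
    return len(staff)


def fte(staff):
    """Interpretation 2: each person counts for the fraction of a full-time contract."""
    return sum(person.contract_fraction for person in staff)


# Hypothetical department with two full-time and two half-time employees.
department = [
    Employee("A", 1.0),
    Employee("B", 0.5),
    Employee("C", 0.5),
    Employee("D", 1.0),
]

print(headcount(department))  # 4
print(fte(department))        # 3.0
```

Both numbers are correct for the same department, yet a consolidated report that mixes them would look like a data quality problem.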
Aligning semantics with Business Information Modelling
Within the Deloitte EDM framework, the alignment and standardisation of data definitions are part of the Enterprise Data Model, a subject-oriented and integrated model representing the data produced and consumed within the organisation.
The top layer of the Enterprise Data Model consists of subject areas (e.g. product), which are further detailed into significant business entities (e.g. type of product) with accompanying descriptive attributes (e.g. product label). This top-down approach starts with the business requirements: the data required by business functions within the organisation. The business function where the data originates is also the owner of that data.
To effectively manage this data it is vital to have a uniform definition per business term. The modelling of business data and assignment of uniform definitions is captured in a Business Information Model (BIM). This model is the top layer of the Enterprise Data Model, focusing solely on subject areas, business entities and descriptive attributes with their definitions (see figure 1). The BIM is created by business users, preferably in close cooperation with a data management competence centre.
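The three levels of the BIM (subject areas, business entities, descriptive attributes, each with a definition) can be sketched as a simple nested structure. This is a minimal illustration under stated assumptions: the class names and the `find_definitions` helper are hypothetical, the example names are taken from the text above, and the definitions are placeholders rather than real BIM content.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    definition: str


@dataclass
class BusinessEntity:
    name: str
    definition: str
    attributes: list = field(default_factory=list)  # list of Attribute


@dataclass
class SubjectArea:
    name: str
    entities: list = field(default_factory=list)  # list of BusinessEntity


def find_definitions(model, term):
    """Collect every definition recorded for a business term (case-insensitive)."""
    wanted = term.lower()
    found = []
    for area in model:
        for entity in area.entities:
            if entity.name.lower() == wanted:
                found.append(entity.definition)
            for attribute in entity.attributes:
                if attribute.name.lower() == wanted:
                    found.append(attribute.definition)
    return found


# Names come from the article's examples; the definitions are placeholders.
bim = [
    SubjectArea("product", [
        BusinessEntity(
            "type of product",
            "Placeholder definition of a product type.",
            [Attribute("product label", "Placeholder definition of a product label.")],
        ),
    ]),
]

print(find_definitions(bim, "Product label"))
```

In a uniform model, a lookup like this should return exactly one definition per business term; more than one result would point to a semantic conflict that still needs to be resolved.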
Having a BIM in place has several advantages. It enables:
- Top-down requirement analysis for application physical and logical data models
- Structured impact analysis for future requirements on data that are imposed by the external environment (e.g. governments, regulators)
- Availability of complete business context/attributes and relationships within the broader model
- Semantic uniformity: the harmonisation of different data dictionaries within the organisation into one uniform business information model, with one definition for one business occurrence throughout the organisation
- The retention of critical business knowledge by documenting the business language in an unambiguous and searchable manner
- Easier communication within the organisation through harmonised key business terms and a common view on business language
Data Quality Management to ensure fit data
To ensure that data is and stays fit for its intended purpose throughout its entire lifecycle, it needs to be effectively managed. Data Quality Management (DQM) is the discipline of effectively and continuously measuring, monitoring, improving and reporting the conformance of data to predefined DQ requirements set by its users. DQM allows companies to identify and resolve DQ incidents and issues when they occur, minimising or even preventing impact on critical business processes.
Data elements that are used within critical business processes need to be continuously measured and monitored against predefined DQ requirements set forth by the data users. Any arising DQ incidents or issues need to be resolved in an effective and timely manner.
In order to measure DQ, data users need to articulate DQ requirements and related business rules according to various DQ dimensions such as accuracy, completeness, consistency and uniqueness. An example of a completeness DQ business rule is: “The ‘number of employees’ field cannot be empty”. Note that although this rule will improve the DQ of the reported ‘number of employees’, it does not check, for example, the accuracy of the number. A DQ business rule to further improve the DQ is: “The ‘number of employees’ cannot exceed the total number of employees of the company”.
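The two example rules can be expressed as simple checks on a delivered record. The sketch below is one possible reading, not a prescribed implementation: the field name `number_of_employees`, the sample records and the company total of 500 are all assumptions made for illustration, and the second rule is applied to the number reported by a single department.

```python
COMPANY_TOTAL = 500  # hypothetical total number of employees of the company


def failed_rules(record, company_total):
    """Return the names of the DQ business rules that a record fails."""
    value = record.get("number_of_employees")
    if value is None:
        # Completeness: the 'number of employees' field cannot be empty.
        return ["completeness"]
    if value > company_total:
        # Upper bound: the reported number cannot exceed the company total.
        return ["upper_bound"]
    return []


# Hypothetical department deliveries.
records = [
    {"department": "Sales", "number_of_employees": 40},
    {"department": "Finance", "number_of_employees": None},
    {"department": "IT", "number_of_employees": 900},
]

incidents = [
    (record["department"], rule)
    for record in records
    for rule in failed_rules(record, COMPANY_TOTAL)
]
print(incidents)  # [('Finance', 'completeness'), ('IT', 'upper_bound')]
```

Note that neither rule can detect a department that reports FTE where headcount was intended: both values would pass, which is why semantic alignment has to be handled separately from these checks.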
These DQ business rules can be implemented in Data Quality tools for continuous measuring and monitoring. Records that fail the predefined business rules (DQ incidents) can be corrected either automatically by the tool or manually by those who are accountable for meeting the DQ requirements. Recurring DQ incidents tend to have a structural underlying root cause that needs to be identified and resolved in order to reduce the number of incidents. In that situation, we refer to issues rather than incidents.
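The step from incidents to issues can be sketched by counting how often the same combination of source and failed rule recurs. This is only an illustration: the incident log, the `(department, rule)` representation and the threshold of three occurrences are assumptions, and a real Data Quality tool will have its own way of flagging recurrence.

```python
from collections import Counter

RECURRENCE_THRESHOLD = 3  # hypothetical: three or more occurrences suggest a structural cause

# Hypothetical incident log collected over several reporting cycles.
incident_log = [
    ("Finance", "completeness"),
    ("IT", "upper_bound"),
    ("Finance", "completeness"),
    ("Finance", "completeness"),
    ("Sales", "completeness"),
]


def recurring_issues(log, threshold):
    """Return the (department, rule) pairs that occur at least `threshold` times."""
    counts = Counter(log)
    return sorted(pair for pair, count in counts.items() if count >= threshold)


print(recurring_issues(incident_log, RECURRENCE_THRESHOLD))
# [('Finance', 'completeness')]
```

Pairs flagged this way are candidates for a root-cause analysis rather than another round of record-by-record correction.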
The positive impact of semantic uniformity on DQM
Now that there is an understanding of the two disciplines, the impact of semantic uniformity on DQM can be discussed. As stated earlier, DQM focuses on DQ requirements and dimensions such as completeness and accuracy, and should explicitly not focus on semantic discussions. Figure 2 shows that having a BIM in place prevents perceived DQ issues caused by semantic misinterpretations. This enables effective DQM to focus solely on DQ requirements and dimensions.
Looking back at the example from the introduction, the absence of a unified BIM meant that the delivered data was not fit for its intended purpose, which impacted critical HR business processes. This issue could have been prevented if data producers and data consumers had all shared the same interpretation of, and context for, the commonly used business terms. A BIM resolves this by identifying and creating two separate business terms with unified definitions, FTE and Headcount, each of which holds a hierarchical relationship with number of employees. Future data requests by HR should explicitly refer to either FTE or Headcount as defined in the BIM, to align semantics and enable effective DQM.
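The hierarchical relationship between number of employees, FTE and Headcount, and the requirement that data requests name one of the two specific terms, can be sketched as follows. The dictionary layout, the `resolve_request` helper and the placeholder definitions are assumptions for illustration; an actual BIM would hold the definitions agreed by the business.

```python
# Hypothetical extract of a BIM: each term points to its parent term, if any.
BIM_TERMS = {
    "number of employees": {
        "parent": None,
        "definition": "Placeholder: umbrella term for workforce size.",
    },
    "Headcount": {
        "parent": "number of employees",
        "definition": "Placeholder: number of persons employed.",
    },
    "FTE": {
        "parent": "number of employees",
        "definition": "Placeholder: workforce expressed in full-time equivalents.",
    },
}


def resolve_request(term):
    """Return the definition for a requested term, rejecting ambiguous parent terms."""
    if term not in BIM_TERMS:
        raise KeyError(f"'{term}' is not defined in the BIM")
    children = sorted(
        name for name, entry in BIM_TERMS.items() if entry["parent"] == term
    )
    if children:
        raise ValueError(
            f"'{term}' is ambiguous; request one of: {', '.join(children)}"
        )
    return BIM_TERMS[term]["definition"]


print(resolve_request("FTE"))
try:
    resolve_request("number of employees")
except ValueError as error:
    print(error)  # 'number of employees' is ambiguous; request one of: FTE, Headcount
```

A request for the umbrella term is rejected up front, so the ambiguity surfaces when the data is requested rather than during consolidation.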