Trustworthy Data with the Data Lineage Approach
You may have heard it too – data is the new oil. But if data is to help you run your business and make decisions, it needs to be managed properly. You also need to know its full journey, from its sources to its consumption. With the data lineage approach, you can learn where your data originated, how it got to where it is now, and how it flows through the business.
Track your data – from its origin to its consumption!
The data lineage process can be described as mapping the data's journey. It consists of several phases, each of which must be completed to properly understand how data flows through the company’s systems.
We can identify four phases of the data lineage process:
- Key Data Elements – key business users and organizational units are identified in a series of workshops, in order to map the points that are critical to each business function.
- Data Flows Mapping – based on inputs from the workshops, the AS-IS data flows are mapped; each data flow needs to be described from its origin to its consumption.
- Consolidation of data information – after the initial mapping of each data flow, details about data sources, their forms, links, and other elements are added to the data elements.
- Data Lineage Master Map – to give the whole picture across the individual data flow maps, one master map is created with links to the detailed views.
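One way to picture the master map is as a directed graph in which every data flow is an edge from a source to a target. The sketch below is only an illustration under that assumption: the element names (`crm_db.customers`, `report.monthly_revenue`, and so on) and the helper functions are hypothetical and do not come from any particular data lineage tool. It traces a consumed data element back to its original sources.

```python
from collections import defaultdict

# Hypothetical data flows as (source, target) pairs; all names are illustrative.
FLOWS = [
    ("crm_db.customers", "staging.customers"),
    ("erp_db.invoices", "staging.invoices"),
    ("staging.customers", "dwh.revenue_by_customer"),
    ("staging.invoices", "dwh.revenue_by_customer"),
    ("dwh.revenue_by_customer", "report.monthly_revenue"),
]


def build_upstream_index(flows):
    """Map each target element to the set of elements that feed it directly."""
    upstream = defaultdict(set)
    for source, target in flows:
        upstream[target].add(source)
    return upstream


def trace_origins(element, upstream):
    """Return the root sources (elements with no upstream) that feed `element`."""
    origins = set()
    seen = set()
    stack = [element]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against cycles and repeated visits
        seen.add(node)
        parents = upstream.get(node, set())
        if not parents and node != element:
            origins.add(node)
        stack.extend(parents)
    return origins


upstream = build_upstream_index(FLOWS)
print(sorted(trace_origins("report.monthly_revenue", upstream)))
# prints ['crm_db.customers', 'erp_db.invoices']
```

A real master map would also carry the details gathered in the consolidation phase (forms, links, owners) as attributes on each node, which this sketch leaves out.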
Why is data lineage important?
- Data trustworthiness – to become a data-driven organization, you need to ensure quality data for all of the company’s departments, because they rely on that data to make business decisions.
- Changing data and requirements – consolidating big data can be really challenging when several source systems are already in use and more are planned, or when requirements for data and its consumption simply keep growing.
- Data Management & Data Governance – the data lineage process provides a good basis for regulatory compliance and risk management. Data management disciplines also ensure proper data description and master data management, so your people will not struggle to find data or to understand what each attribute or metric means, because it will be documented.
- Data flows map – by implementing this process, you get an overview of all data sources, and the further tracking of data flows can be automated with a data lineage tool.
Data lineage helps to make your data and its flows clearer. So when you think about your data, ask yourself: can it provide you with the right results? If the answer is no, change that so you can make data-driven decisions without manual inputs and adjustments.
Possible use cases/industries:
- Self-Service Data Management – enables your people to easily reuse and transform the company’s data. Without proper tracking of data flows, reporting results may differ, and it is hard to explain why.
- Cloud Migration – during a migration, cloud engineers try to divide the system into smaller chunks of objects. But once a large whole has been fragmented, the chunks may no longer work with each other. Proper data recording or mapping is therefore necessary before any migration – you need to know every piece of your data before any changes are started.
- Large banks, insurance companies, other financial organizations – these companies are subject to heavy regulatory reporting, so demonstrating data lineage or data provenance is a must-have for them, all the more so if they want to modernize their business, migrate to the cloud, use open-source technologies, etc.
- Healthcare – another example of a strictly regulated sector. By recording your data, you know where it came from, who viewed it, whether it was copied, received, etc. You can get full details about any person at any time.
- And many others…
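To make the cloud migration use case above more concrete, the following sketch shows how recorded data flows could answer the question "what is affected if this element changes or moves?" before a migration starts. It is a minimal illustration, not the behavior of any specific lineage product: the flow list, the element names, and the `downstream_of` helper are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical recorded data flows as (source, target) pairs; names are illustrative.
MIGRATION_FLOWS = [
    ("crm_db.customers", "staging.customers"),
    ("erp_db.invoices", "staging.invoices"),
    ("staging.customers", "dwh.revenue_by_customer"),
    ("staging.invoices", "dwh.revenue_by_customer"),
    ("dwh.revenue_by_customer", "report.monthly_revenue"),
]


def downstream_of(element, flows):
    """Return every element that directly or indirectly consumes `element`."""
    children = defaultdict(set)
    for source, target in flows:
        children[source].add(target)

    affected = set()
    queue = deque([element])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in affected:  # visit each consumer only once
                affected.add(child)
                queue.append(child)
    return affected


# Which objects must be checked before moving the invoices staging table?
print(sorted(downstream_of("staging.invoices", MIGRATION_FLOWS)))
# prints ['dwh.revenue_by_customer', 'report.monthly_revenue']
```

The same traversal, run in the opposite direction, gives the provenance view that regulated industries need to demonstrate.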