Augmented Data Management: Beyond the Hype

It is time to manage your data smarter with artificial intelligence

Augmented data management is gaining more and more traction. Reports by Gartner and Deloitte identify it as a technology trend and highlight the gains that can be achieved when combining artificial intelligence (AI) and data management. According to Gartner, machine learning and automation can reduce manual data management tasks by 45 percent.

Dico Defize, Kiean Bitaraf and Niko Vermeer

Dealing with data management work piling up

Data is increasingly recognized as an important business asset, and companies are embracing the notion that data management is pivotal to unlocking that data’s full value. For many companies, investing in a clear data strategy aligned with foundational data capabilities such as data governance, data quality and metadata management has led to an increase in data usage.

At the same time, the significant increase in data volume, variety and velocity, combined with the often excessive urge to collect as much data as possible, is making data management more complicated and time-consuming. It can be a struggle to stay in control of your data while scaling data management efforts. Consequently, you might lag behind in providing insight into the data you hold, be unable to give users sufficient access and have difficulty ensuring data quality.

As a result, users of data often take matters into their own hands when it comes to data management. One example is the many complaints we hear from data scientists who are forced to spend a significant portion of their time on low-value tasks such as data cleansing and processing. This is not only a waste of scarce and highly paid resources, it can also lead to dissatisfied and frustrated staff.

Hiring more data stewards or data engineers may seem the straightforward answer to data management work piling up. However, given that many companies already struggle to attract sufficient and suitable data talent, we believe a more feasible and cost-effective solution lies in augmented data management. Work smarter, not harder.

What is Augmented Data Management?

Augmented data management is the application of AI to enhance or automate data management tasks. It can support data talent, such as the above-mentioned data scientists, with time-consuming and data-intensive tasks that would normally be done manually. Examples are spotting anomalies in large datasets, resolving data quality issues and tracing specific data from a report back to its origin. AI models designed specifically for these data management tasks often take less time, make fewer errors and cost less in the long run. So why not leave the heavy data management lifting to AI and allow your data talent to focus on solving high-impact business problems?

You might even be able to leverage your existing data management platforms and tools to experiment with augmented data management. Its potential is being recognized by leading data management platform vendors such as Informatica and Collibra, who are increasingly adding augmented functionalities and enabling AI-assisted decision making. We also see new players in the market focusing specifically on automating data management tasks, such as Octopai.

Based on our experience, we see most potential in applying augmented data management to support and accelerate the following capabilities and tasks:

  • Data Quality: Identifying and resolving data quality issues. Suggesting data quality rules based on existing datasets and running them. Automating ongoing data quality checks and advanced data profiling. Recognizing patterns and anomalies. Suggesting actions for data cleansing, based on predicted values and manual data cleansing.
  • Metadata Management: Labelling, classifying and searching data. Deriving the metadata model and metadata rules from datasets. Automatically collecting, organizing, cataloging and merging technical and business metadata, both for structured data and unstructured data. Generating and analyzing end-to-end data lineage to identify system dependencies, data flows and anomalies.
  • Master Data Management: Identifying and evaluating potential master data. Automatically generating a master data model, mapping data entities and configuring an MDM hub. Suggesting actions for matching and merging to establish a single source of truth, based on usage patterns, trust scores and data steward input.
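
To make the Data Quality item above more concrete, here is a minimal sketch in Python of what "suggesting data quality rules based on existing datasets" and "recognizing anomalies" could look like at their simplest. It is an illustration under stated assumptions, not how any vendor tool works: the function names (suggest_rules, flag_anomalies), the sample records and the thresholds are all hypothetical, and it uses plain statistics rather than a trained model. The anomaly check uses the modified z-score based on the median absolute deviation, with the commonly used cutoff of 3.5.

```python
from statistics import median

# Hypothetical sample records; in practice these would come from a real dataset.
rows = [
    {"customer_id": 1, "country": "NL", "order_total": 120.0},
    {"customer_id": 2, "country": "NL", "order_total": 95.5},
    {"customer_id": 3, "country": "BE", "order_total": 110.0},
    {"customer_id": 4, "country": "DE", "order_total": 101.0},
    {"customer_id": 5, "country": "NL", "order_total": 9800.0},
    {"customer_id": 6, "country": "", "order_total": 105.0},
]


def is_number(value):
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def suggest_rules(records, max_categories=5):
    """Profile each column and propose simple candidate rules for a data steward to review."""
    columns = sorted({key for record in records for key in record})
    rules = {}
    for col in columns:
        values = [record.get(col) for record in records]
        present = [v for v in values if v is not None and v != ""]
        col_rules = []
        if len(present) == len(values):
            # No missing values observed, so completeness is a candidate rule.
            col_rules.append("not_null")
        if present and all(is_number(v) for v in present):
            # Observed range; any outliers in the data will widen it.
            col_rules.append(("range", min(present), max(present)))
        elif present and len(set(present)) <= max_categories:
            # Few distinct values, so the column looks like a code list.
            col_rules.append(("allowed_values", sorted(set(present))))
        rules[col] = col_rules
    return rules


def flag_anomalies(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # No spread around the median, nothing to compare against.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


if __name__ == "__main__":
    for column, column_rules in suggest_rules(rows).items():
        print(column, column_rules)
    print(flag_anomalies([record["order_total"] for record in rows]))
```

On the sample records, the sketch proposes a code list for country (which has one missing value and therefore no completeness rule) and flags the 9800.0 order as an outlier. The output is deliberately a list of suggestions: a data steward would still review the proposed rules and flagged values before anything is enforced or cleansed.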

An important note: AI will never eliminate the need to properly organize the management of data within your company. Despite the many useful applications of AI, data management remains a people business, and its activities will continue to need a strong focus on change management, cultural challenges and skills. Augmenting part of the work can make that journey easier, but you will still need a dedicated team spearheading it.

Getting started with Augmented Data Management

Companies are all at different stages of their data management journey. Yet we do not believe a high level of maturity is required to reap the benefits of augmented data management. Augmenting small tasks and experimenting with different use cases is actually a great way to improve and accelerate your data management further.

So get started today! Consider it a technical capability that needs to be developed. It’s better to let go of the idea that combining data management and AI is too difficult and simply start. We can provide you with pointers based on our reference experience, use cases and in-house maturity model for augmented data management.

An augmented data management pilot (roughly 8 weeks) could look like this:

  1. Engage with (senior) stakeholders and uncover key pain points and underlying data
  2. Quickly assess the current (augmented) data management capabilities and tooling
  3. Organize a workshop to select the most promising use case
  4. Build your first proof of concept: most often with tooling you already have
  5. Showcase your new augmented data management capability to create awareness
  6. Evaluate, improve and repeat…

Talk to us if you’re looking to bring your data management closer to task augmentation, or to get AI closer to your data management.
