Strengthening our AI foundations: Getting the data right

This article explores an often overlooked but vital foundation of Artificial Intelligence (AI): its costs. Most significantly, these are the costs of the people, processes, and tools needed to mitigate unforeseen risks, known collectively as safeguarding. It makes the case for connecting rigorous data governance practices with AI itself in order to justify your safeguarding expenses. To fully understand these ongoing investments, we must first understand the ramifications of overlooking data quality in our AI solutions.

What happens if we don’t get the data foundations right?

Technology in general, and the use cases for AI in particular, are developing at an unprecedented rate, and many organisations are becoming active adopters as they come to understand the business value. This follows a growing awareness of the significance of data over the last few years. However, many organisations have not kept pace with the change in their data and in how to derive valuable business insights from it; that is, they lack a clear data strategy and/or the ability to execute on it.

The examples below depict how bias can creep into AI solutions, leading to stereotypes, discrimination, and prejudiced outcomes.

Biases can stem from several sources, ranging from the type of AI/Machine Learning (ML) model chosen to researcher bias, but one of the most significant causes is the selection of underlying data. Data is the fundamental building block in every solution; if we overlook it, everything can fall apart.
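To make the data-selection risk more concrete, the sketch below compares how groups are represented in a training set against a reference population. It is only an illustration: the function name `representation_gaps`, the group labels, the reference shares and the 10-point tolerance are all assumptions made for this example, not figures from any real dataset or a recommended threshold.

```python
from collections import Counter

def representation_gaps(samples, reference_shares, tolerance=0.10):
    """Return groups whose share in `samples` differs from the
    reference share by more than `tolerance` (hypothetical threshold)."""
    counts = Counter(samples)
    total = len(samples)
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if abs(observed - expected) > tolerance:
            # Positive gap: over-represented; negative: under-represented.
            gaps[group] = round(observed - expected, 2)
    return gaps

# Hypothetical training data: one group label per record.
training_groups = ["A"] * 80 + ["B"] * 15 + ["C"] * 5
# Hypothetical shares of each group in the population the model will serve.
reference = {"A": 0.5, "B": 0.3, "C": 0.2}

gaps = representation_gaps(training_groups, reference)
print(gaps)
```

In this made-up case group A is over-represented and groups B and C are under-represented, which is the kind of signal a data governance team might review before the data is used to train a model.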

What regulatory measures are in place to help prevent bias-fuelled prejudice in our AI solutions?

While explicit, AI-specific regulation is still limited, existing laws and regulations concerning data protection, consumer rights and competition still apply in the context of the advantages offered by AI. It is widely expected that new legislation specifically regulating the commercial use of AI will be proposed in the coming months. Recent publications by European Union (EU) policymakers and lawmakers hint at the exact areas of AI that businesses can expect such new laws to govern (in particular ethics and internal controls/measures).

The European Commission (EC) has recently published a whitepaper that seeks to legally define what constitutes AI and explores the benefits and opportunities it can provide in various fields. In addition, the whitepaper looks to identify the potential risks and ramifications of misusing AI.

Where data regulations exist, data governance is often used to comply with them, but it is not necessary to wait for regulation before putting governance in place.

Bridging the gap between AI safeguarding and Data Governance

Data governance ensures that data is fit for purpose and of high quality, as measured against key data quality dimensions (accuracy, completeness, timeliness, consistency etc.). In addition, data governance gives organisations better transparency and visibility over where their data resides, which allows for greater connectivity, smoother integration of applications and future scalability. From an AI perspective, it gives us greater confidence in identifying data which is “fit for AI processing and techniques”. Further, the relationship between AI and data governance is both two-way and dynamic. Just as data governance supports better AI, AI can support better data governance by bridging human-led governance activities (ownership/accountability) and automating technical governance activities (classification/controls).
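As a rough illustration of measuring data against quality dimensions, the sketch below scores a small set of records for completeness and timeliness and then checks the scores against minimum thresholds. Everything in it is assumed for the example: the record layout, the helper names (`completeness`, `timeliness`, `fit_for_ai`), the 365-day freshness window and the 0.9 thresholds. A real governance programme would define its own rules and would typically cover accuracy and consistency as well.

```python
from datetime import date

# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "email": "a@example.com", "updated": date(2024, 5, 1)},
    {"id": 2, "email": None, "updated": date(2024, 4, 20)},
    {"id": 3, "email": "c@example.com", "updated": date(2022, 1, 15)},
    {"id": 4, "email": "d@example.com", "updated": date(2024, 5, 3)},
]

def completeness(rows, field):
    """Share of rows in which `field` is populated."""
    if not rows:
        return 0.0
    return sum(r.get(field) is not None for r in rows) / len(rows)

def timeliness(rows, field, as_of, max_age_days):
    """Share of rows whose `field` date is within `max_age_days` of `as_of`."""
    if not rows:
        return 0.0
    fresh = sum((as_of - r[field]).days <= max_age_days for r in rows)
    return fresh / len(rows)

def fit_for_ai(scores, thresholds):
    """True only if every dimension meets its (assumed) minimum score."""
    return all(scores[name] >= minimum for name, minimum in thresholds.items())

as_of = date(2024, 5, 10)
scores = {
    "completeness": completeness(records, "email"),
    "timeliness": timeliness(records, "updated", as_of, max_age_days=365),
}
thresholds = {"completeness": 0.9, "timeliness": 0.9}  # assumed minimums

print(scores, fit_for_ai(scores, thresholds))
```

Here one record has no email address and one has not been updated for over a year, so both dimensions score 0.75 and the dataset would not pass the assumed thresholds.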

The key message here is that AI can only offer sustainable, long-term value for organisations if they have strong data governance foundations. As the complexity and commercial applications of AI continue to grow, organisations will face unforeseen challenges concerning AI ethics.

Considering the importance of getting our data governance foundations right, what’s the best approach to achieving this?

  1. Establish a holistic business case.

    AI goes above and beyond to enhance the potential benefits offered by proper data governance. Safeguarding is an ongoing, operational cost of a business which should be clearly depicted in your business case, alongside any assumptions and tangible benefits which link back to your business objectives. An easy way to achieve this is by first defining the AI use cases, then articulating these applications in a business context. Examples of use cases include:

  2. Return on Investment – More than Monetary value

    AI will naturally incur further capital and operational expenditures, which must be mapped against returns on investment to quantify value.

    Return on investment (ROI) is not solely quantitative; we must also consider its value beyond monetary terms. From a quantitative perspective, one benefit of AI safeguarding is the reduced risk of regulatory fines. Organisations have a duty to ensure the safe and ethical use of AI, which will in turn impact future investments and growth opportunities. Thus, a qualitative dimension to consider is avoiding reputational damage, which could lead to a loss of market share and customers. Investing in mitigating data and AI risks will prove to be a benefit and a competitive advantage for organisations that are proactive in their approach to data and AI.


    The returns on investment from investing in AI safeguards are characterised by the qualitative/quantitative benefits as well as the very drivers that necessitate a data governance framework:

    Managing data-risk and understanding how data risk relates to other forms of risk.

    Data governance is a key component of risk management and business continuity planning as it helps an organisation to shield itself from risk and vulnerabilities across several areas (e.g. privacy, security, regulatory and legal compliance). The primary goal of risk management is to ensure that all possible risks are addressed, and preventative measures are in place.

    Data governance and AI allow an organisation to manage the risks it faces and avoid incurring further expenditure or damage, e.g. through regulatory fines, loss of actual/prospective clients, loss of profit or loss of market share.

    Reducing costs and managing mandatory cost centres.

    Mandatory costs can be decreased, and wasteful expenditure may even be eliminated entirely. In the short term, more safeguarding will increase costs; however, it will allow your AI to scale and become far more effective at reducing enterprise costs in the medium term. Data governance and AI work together to improve ongoing resource management and allocation, allowing an organisation to focus its efforts on process efficiencies and extracting the most value out of its operations.

    Increasing revenue and improving profit margins.

    This is a fundamental and natural business objective of any commercial enterprise. It is reasonable to assert that this cannot be achieved without significant investment. Though short-term revenue and profit are likely to be minimal, over the longer term the investment will begin to show key financial benefits. Managing data governance enhances the scale of opportunities AI can present, as it maintains a focus on continual data-driven business performance. This can be demonstrated or measured using KPIs relevant to the business, e.g. turnaround/lead/response time, engagement levels, net promoter feedback or customer retention/attrition rates.

  3. Tooling-enabled change for a wider impact

    Safeguarding can be made simpler by embedding tools into everyday work practices, easing the burden on people who use and manage growing volumes of data. Tooling has allowed organisations to realise the measurable benefits of data governance, with applications to AI safeguarding including:
    1. Central repositories – Leading organisations have invested in tools that can add engagement layers to visualise key systems, datasets, algorithms and bots in the enterprise.
    2. Automated workflow – Increase visibility in the data pipeline, for example by triggering an event when sensitive datasets are accessed, which can link back to the data owner.
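The automated workflow idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular governance tool works: the `Dataset` and `AccessMonitor` classes, the dataset names and the owner identifiers are all hypothetical, and a real deployment would hook into the platform's own audit logs and notification channels.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Dataset:
    name: str
    owner: str               # data owner accountable for the dataset
    sensitive: bool = False  # classification assigned by governance

@dataclass
class AccessMonitor:
    catalogue: dict          # dataset name -> Dataset
    notifications: list = field(default_factory=list)

    def record_access(self, dataset_name, user):
        """Raise an event for the data owner when a sensitive dataset is accessed."""
        dataset = self.catalogue[dataset_name]
        if not dataset.sensitive:
            return None
        event = {
            "dataset": dataset.name,
            "accessed_by": user,
            "notify": dataset.owner,
            "at": datetime.now(timezone.utc).isoformat(),
        }
        self.notifications.append(event)
        return event

# Hypothetical catalogue with one sensitive and one non-sensitive dataset.
catalogue = {
    "marketing_clicks": Dataset("marketing_clicks", owner="marketing-lead"),
    "patient_records": Dataset(
        "patient_records", owner="clinical-data-owner", sensitive=True
    ),
}

monitor = AccessMonitor(catalogue)
monitor.record_access("marketing_clicks", "analyst-1")   # no event raised
event = monitor.record_access("patient_records", "ml-pipeline")
print(event["notify"], len(monitor.notifications))
```

Only the access to the sensitive dataset produces an event, and that event carries the owner's identifier so the access can be traced back to the accountable person.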

Data governance tooling vendors such as Collibra and Informatica can help organisations implement a more automated, standardised, and centralised form of data governance. Such tooling builds in best practices, codifies data conduct and embraces AI in its approach, further connecting the two disciplines for enterprise-wide benefit.

Close the gap between your data governance teams and your AI. It is the dependency on good-quality data which enables AI to function and produce the amazing insights we see emerging in the industry today. Developing an AI solution without faults is not an easy or realistic task, but with smart investments in rigorous safeguarding practices, your team will have the resilience to fight back and strengthen your learning technologies. A robust, executable data strategy ensures that underlying data within an organisation is governed in line with best practices.
