Defensive data priorities may be more important than ever
Government CDOs are stewards of the public’s data. Therefore, the prime motives for defensive priorities may be both the expectations of citizens and the requirements of regulations. Citizens expect data to be usable and the private information they share with government to be kept secure and private. Regulations can play an important role in ensuring the security and privacy of data, but they can also be a complication for CDOs. There is no single, comprehensive data protection law in the United States. Instead, there are many federal and state laws that protect specific types of data, such as the California Consumer Privacy Act, which protects certain digital data of California residents, or the Gramm-Leach-Bliley Act, which protects certain types of financial information.2 This patchwork regulatory environment places the onus on a CDO to understand the various data laws constraining the use of specific types of data.
But protecting data from unauthorized disclosure is just the groundwork of defensive priorities. If data is to have mission value, it should be clean, usable, and available. Organizations that encounter difficulty with these tasks often have issues with metadata restricting the findability of data, or they lack the APIs and data standards necessary to share data widely across the organization. So, while defensive priorities start with security and privacy, they quickly move into data governance—that is, trying to answer questions such as “how can an organization link data from very different sources, with very different structures and metadata, without compromising data integrity?”
Challenges to useful data
Questions about how to combine data from different sources without compromising integrity inevitably reveal the significant challenges facing CDOs’ defensive priorities.
The heterogeneity and high volume of modern data likely mean that ensuring data quality can be quite difficult. As government organizations gather more data, especially data about the real world, that data could become increasingly heterogeneous. It may be incomplete, and it may lack structured, well-annotated metadata. As a result, ensuring data quality may not be a one-and-done exercise, but will likely take many repeated steps.
Issues with data integrity and privacy can be exacerbated by traits of an organization’s underlying technology. In many cases, government organizations are working with older, legacy technology that may store data in proprietary systems with no common access, or that may present unique vulnerabilities that are difficult to secure. Even for organizations with newer technology, if that technology is not properly configured or monitored, cyberattacks can pose significant risk of data loss or corruption. Ultimately, because technical systems may be the remit of chief technology officers (CTOs) or mission leaders, and outside the direct control of CDOs, they can pose significant difficulties in enabling data-sharing or reuse.
But merely modernizing systems and using cutting-edge tools may not be a guarantor of success. For example, even modern artificial intelligence (AI) and machine learning systems can present unique vulnerabilities that adversaries and criminals can exploit. More than focusing on the age of systems, CDOs should think in terms of implementing data governance across the life cycle of their data, from creation through preparation and from storage to use.
Because data is often used outside the CDO’s office, by mission groups, an organization’s culture may also present challenges for CDOs. Users may have certain expectations of usability that can clash with needs for data privacy, security, and quality. If business processes are too lax, poor-quality data or even data breaches can become likely. But if business processes are perceived as overly strict, users may resort to workarounds that create further problems. Finding the right balance for the organization is important to protecting data while also keeping it available enough to create mission value.
Ultimately, CDOs need the resources to address the challenges discussed so far. New technologies and highly skilled staff cost money. Without the funding to bring in the right tools and staff, even the best plans for defensive priorities may be difficult to implement.
Building data defenses
The above challenges are not just technical. Rather, they live at the intersection of technology and the organization. Therefore, overcoming these challenges and maintaining data that is findable, accessible, interoperable, and reusable could take both technical and organizational solutions3:
Technical solutions:
Modernize older technology. This does not necessarily mean ripping and replacing many data-handling systems. Rather, it means thinking in terms of a data platform. A data platform is the technology stack needed to discover, process, store, analyze, and secure data. In many organizations, many of the tools needed for a data platform already exist, but may not be managed together, creating potential problems for the security, accuracy, or availability of data (for more on data platforms, see the article, Organizing to drive change in this series).
Reduce risk with new analytical techniques. New techniques for analyzing data can help reduce the risk to security and privacy, lessening the CDO’s defensive challenges. For example, synthetic data and federated learning can reduce the need for protecting data by reducing its sensitivity or its exposure. Synthetic data creates new data sources that retain key information found in the original data but without any of the personal information that should be so highly protected.4 Federated learning takes a different approach: it keeps the accuracy of the original data but does not move it. Rather than moving data to train a central machine learning model, the model is moved to where the data is stored, removing many of the vulnerabilities that can come with moving sensitive data from location to location.5
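To make the federated learning idea concrete, the following is a minimal sketch of federated averaging in plain Python, not a production implementation. It assumes a toy one-parameter model (y = w * x) and two hypothetical sites with made-up records; the function names, learning rate, and number of rounds are illustrative assumptions. The point it shows is that each site trains on its own records and only the updated model weight travels back to the coordinator.

```python
from typing import List, Tuple

# One site's locally held (x, y) records; the data here is hypothetical.
Site = List[Tuple[float, float]]


def local_update(w: float, data: Site, lr: float = 0.01, steps: int = 5) -> float:
    """Run a few gradient-descent steps for y = w * x on one site's own data."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(sites: List[Site], rounds: int = 50) -> float:
    """Coordinate training: send the model out, average the weights that return."""
    w = 0.0
    total = sum(len(site) for site in sites)
    for _ in range(rounds):
        # Only the updated weight and a record count leave each site,
        # never the underlying records.
        updates = [(local_update(w, site), len(site)) for site in sites]
        # Weight each site's result by how many records it trained on.
        w = sum(w_i * n for w_i, n in updates) / total
    return w


# Two hypothetical sites whose records both follow y = 3x.
site_a = [(1.0, 3.0), (2.0, 6.0)]
site_b = [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)]
w = federated_average([site_a, site_b])
print(round(w, 4))
```

With this toy data the shared weight should settle very close to 3.0, the slope both sites' records follow. A real deployment would also need to consider what the exchanged model updates themselves might reveal, which this sketch does not address.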
Organizational solutions:
Data literacy. Organizational culture can be hard to change, especially from the position of a CDO. However, building data literacy can help the whole organization begin to see the world through the eyes of the CDO. By doing this, workers can not only see their own mission needs, but they can also begin to see the value in protecting data and the risks from not doing so. Therefore, data literacy programs can be an important tool in overcoming cultural challenges and striking the right balance between secure and available data.
Tie data governance to mission priorities. To address funding challenges, CDOs should encourage an organization to see data as a strategic asset. Continually asking “how do we maximize data value at this organization?” should become an ongoing focus. Helping the organization find and value its data encourages increased funding for robust data governance as an investment in the organization’s mission. CDOs should work to protect, structure, and utilize data correctly. Machine learning may be used to help efficiently structure and prep data for AI-readiness, including mitigating bias and strengthening trustworthiness. As an organization values data more, it can allocate more resources to maintain and improve that data.
Getting started
Defending the security and accuracy of an organization’s data can seem a daunting task. But it can be a critical one: Without defensive priorities it could be difficult to use data to create any real mission value. But CDOs do not need to do it all in a day. By breaking down the above recommendations into three categories, they can make iterative progress on each.
Find the right people. To begin, a CDO should consider building the right defensive team within their office. Building teams of data scientists, applied mathematicians, and data engineers who have experience in these domains is an important first step to protecting an organization’s data. CDOs should also liaise with other executives to confirm they have the right funding, tools, and authorities to operate. Developing data governance KPIs can help measure the strength of data security and quality and communicate the value of that work to other leaders. After all, the value of defensive measures, like that of an illness prevented by a doctor’s visit, can be hard to quantify.
Design the right tech. Next, CDOs should work with chief information officers and CTOs to examine the organization’s data architecture from the perspective of resilience. By understanding how a cascading failure in any digital network can flow from one system to another, tech leaders can work together to help create robust systems that can be resilient in the face of failures.
Execute the right process. Getting defensive priorities right is not one action, but a continuous series of actions. Making clean data available to those who need it, when they need it, takes implementing the right process controls across the data life cycle. This can include cleaning data upon acquisition, but also checking that algorithms are working as designed when data is in use, and finally, appropriately disposing of sensitive data once it is no longer needed.
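As one illustration of process controls at two points in the data life cycle, the sketch below cleans records on acquisition and flags records that are due for disposal. It is a simplified example under stated assumptions: the field names (`record_id`, `collected_on`), the seven-year retention period, and the sample records are all hypothetical, and an actual retention schedule would come from the applicable laws and agency policy. The middle stage, checking algorithms while data is in use, is not covered here.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical schema and retention policy, chosen only for illustration.
REQUIRED_FIELDS = ("record_id", "collected_on")
RETENTION = timedelta(days=365 * 7)


def clean_on_acquisition(records: List[Record]) -> List[Record]:
    """Trim whitespace, drop incomplete records, and drop duplicate IDs."""
    seen = set()
    cleaned = []
    for rec in records:
        rec = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        if any(not rec.get(field) for field in REQUIRED_FIELDS):
            continue  # a required field is missing or empty
        if rec["record_id"] in seen:
            continue  # keep only the first copy of each ID
        seen.add(rec["record_id"])
        cleaned.append(rec)
    return cleaned


def due_for_disposal(records: List[Record], today: date) -> List[str]:
    """Return IDs of records held longer than the retention period."""
    return [
        rec["record_id"]
        for rec in records
        if today - date.fromisoformat(rec["collected_on"]) > RETENTION
    ]


raw = [
    {"record_id": " A1 ", "collected_on": "2015-01-10"},
    {"record_id": "A1", "collected_on": "2015-01-10"},   # duplicate of A1
    {"record_id": "B2", "collected_on": None},            # missing date
    {"record_id": "C3", "collected_on": "2024-06-01"},
]
cleaned = clean_on_acquisition(raw)
expired = due_for_disposal(cleaned, today=date(2025, 1, 1))
print([r["record_id"] for r in cleaned], expired)
```

Running the same checks on every new batch, rather than once, reflects the idea that these controls are a continuous series of actions rather than a single cleanup.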
It’s often said that offense wins games while defense wins championships. That sports metaphor may be a bit stretched in the data world. But if government organizations want to be the champions of their mission space, they certainly should play data defense, and play it well.