Government is one of the biggest producers of data—and one of the few that deliver data to the public free of charge. Governments already regulate how organizations may use personal data and myriad other issues related to data. The question, then, isn’t really whether government should get involved in the new data marketplace, but rather how it should take part.
Google “data as a currency,” and you’ll get back search results in the millions. “What if Web Users Could Sell Their Own Data?” asks a blogger for the New York Times.1 A story in Information Management highlights “Big Data Analytics: The Currency of the 21st Century Enterprise.”2 You’ll find stories heralding big data as the new currency for science, stories on the personal data marketplace, and even stories on stolen data as a currency—not to mention prominent TED talks, World Economic Forum studies, and multiple books on the subject. The gist of the argument: Personal data has an economic value that can be bought, sold, and traded.
Remarkably, one area has gone largely unexplored: the role that government will—or should—play in establishing data as a currency. Given the problems governments face in maintaining stable monetary systems, many data enthusiasts would just as soon have government stay away from this emerging instrument of exchange.
Like it or not, that’s not going to happen. For one thing, government is one of the biggest producers of data—and one of the few major producers that deliver data to the public free of charge. At last count, more than 1 million data sets from governments around the world were available on the web.3
Second, governments already regulate how organizations may use personal data, what privacy rights individuals have, and myriad other issues involved with the new data marketplace. If anything, regulation is likely to increase in coming years as privacy advocates and consumers step up their demands.
Lastly, revelations of the use of private data by the US intelligence community have brought the commingling of public and private data to the forefront of public debate. While the politics are beyond the scope of this article, public consensus on the balance between privacy, security, and the flow of personal data will be critical to realizing the promise of the new data economy. Will government encourage and stimulate a vibrant exchange in this new currency, or will it just get in the way?
Government can play three principal roles in the emerging data economy: producer, consumer, and facilitator. We focus the bulk of our attention on the first two roles, with a brief take on how regulation and privacy may shape the market. But before we examine these roles, it’s important to gain a better understanding of the emerging data marketplace.
Ninety percent of the data in the world today was created in the last two years.4 Between now and 2020, the global volume of digital data is expected to multiply another 40 times or more. Much of that new information will consist of personal details: where people have been, what products they’ve bought, what movies they like, which candidates they support—the list is nearly endless.5
Companies are working hard to cash in on the market for personal data. They range from aggregator behemoths such as Rapleaf and Acxiom, which hold information on as many as 500 million consumers globally, to start-ups such as Personal.com, which helps individuals control and make use of their own personal data.6
Government is also an important player in the data economy, not just as a regulator but also as a significant provider and consumer of data.
Open data providers: Government agencies collect huge troves of data (of the non-controversial sort) in the course of doing business. Through its White House Open Data Initiatives and Challenge.gov projects, the US federal government has been releasing large government data sets to the public, free of charge. Companies and individuals use this data to create valuable products and services, doing it faster and more cheaply than government could on its own.
Data aggregators: Some marketing companies today build vast databases of consumer preferences and behaviors. If you have an email address, a firm such as Rapleaf probably knows something about you. Combining information from public records and consumer transactions, along with digital exhaust collected from social media, mobile transmissions, and other sources, these aggregators give advertisers new insights into target audiences.
Data for service: Nothing in life is free. When we use services such as Facebook, Twitter, or Google, we pay for the privilege by divulging personal information. The Facebook “nation”—now larger than many countries—grows in value with every “like,” “share,” and post.
Data protectors: To help address concerns related to privacy and personal data, the market now offers products to give individuals control over their own information. With a data locker from Personal.com, for example, you can store personal information, control access to that data, and exchange it according to your wishes. Other services, such as Reputation.com, tell you what information others are collecting about you, who’s collecting it, and how they’re using it. Several firms also provide sophisticated privacy services to keep personal data anonymous.
When people discuss currencies, they tend to think of paper notes—American dollars, Japanese yen, or euros. Printed money, however, is only one kind of currency. Throughout history, currencies have appeared in many forms, from the storied stone wheels of the Yap islanders to cowries, the mollusk shells that became a popular means of exchange in China more than three millennia ago.
Currencies have evolved over time from stones and seashells to the sophisticated forms of legal tender that enable today’s global financial transactions. The evolution of the notion of currency continues today, as new, alternative currencies grow in popularity, from bitcoin to the online game World of Warcraft’s holy dust.
To understand how data fits into this evolution, we must rethink our conception of currencies. Currency is how we create and exchange economic value across geography and through time. It is anything that can serve as a medium of exchange, something that can be “cashed out” for goods and services, or used to pay debt or to store value for future use.
Data has each of these essential characteristics. Because many business transactions involve buying and selling data, it can serve as a medium of exchange—as cellist Zoë Keating noted in suggesting that instead of sending her royalties, streaming music services should provide her with data about her listeners.7
The value of data also can be measured easily. And as many of today’s most successful companies have demonstrated, data appreciates in value when translated into meaningful information. For instance, according to the Aite Group, retailers could be paying major US banks $1.7 billion a year by 2015 to send targeted discount offers to customers, based on information on shopping habits gleaned from credit card records.8
It’s an early Thursday morning, and Todd Park takes the stage at the Washington Convention Center. Park, the hyperkinetic chief technology officer for the US Department of Health and Human Services (HHS)—who has since been appointed US CTO—is in character as the nation’s “entrepreneur in chief.” The goal, he says, “is to catalyze the development of an ecosystem—an ecosystem that leverages data to improve health.”9
Eleven universities are hosting viewing parties. (That’s right, college students are gathering—in the morning—to watch a bureaucrat speak.) People worldwide are streaming the video live. “America is giving you billions and billions of dollars of data for free,” Park tells them. He means government data, like the kind that launched a $90 billion global positioning system (GPS) industry. As he closes, the audience launches into a standing ovation. Data is the new currency.
Welcome to Health Datapalooza, a celebration of the most innovative uses of health information. Having recently released troves of data, HHS is using the event to debut some of the best health care-related web and smartphone apps driven by open government data.
One app, designed by Silicon Valley-based Palantir, matches patients to clinical trials. Another, from the University of Rochester, overlays disease incidence data from the Centers for Disease Control, plus related tweets, on a map in order to track the spread of illness. A similar solution traces the path of a recent salmonella outbreak. Maya Designs has used the US Department of Agriculture (USDA) Food Environment Atlas to highlight sources of cheap vegetables in America’s “food deserts,” areas lacking supermarkets or large grocery stores.
Each program, if successful, promises to save or improve lives. Health care data could add billions to the nation’s economy, says Park, and he wants to attract more innovators to use it. As founder of successful health care management start-ups, he knows an opportunity when he sees one.
Similar Datapalooza events have focused on energy and environmental innovation, demonstrating the potential value of free government data in those sectors as well.
A generation ago, mounds of government data sat in file cabinets, tucked away from all but a few officials. At best, governments produced prepackaged statistical reports—and charged user fees for special data runs.
Not all government data is digitized yet, but a growing movement seeks to change that. Just look at what happened in the 1980s, when the government released GIS (geographic information system) data. The release fueled an industry that now includes over 30 million monthly Google Maps users, as well as a GPS market that has grown by 26 percent annually in recent years.10 GIS data has transformed daily life for many citizens, simplifying travel and saving them the time they used to spend muddling through glove compartments for maps. And GIS can be joined with complementary and cross-sector data to groundbreaking effect.
When a 2010 earthquake wreaked havoc in Haiti, for instance, responders needed maps. Soon, a crowdsourced application developed by the NGOs Ushahidi and Humanitarian OpenStreetMap became the default tool for search and rescue teams. More than 600 volunteers traced roads and encampments from aerial images into a computer program. They mapped data from the World Bank, Yahoo!, and Japan’s space agency. In support, the US military released P3 and Global Hawk imagery.11
Search and rescue groups could read the resulting maps from handheld GPS units. In the evolving disaster area, crowdsourced markers identified resources such as refugee camps and cholera response centers. Multiple nations, NGOs, volunteers, and ordinary Haitian citizens came together in an unprecedented way, sharing information to save lives.
Enterprising citizens can build real-world solutions out of data. Data from sources as disparate as crime records, reports of power outages, and personal accounts of corruption tell a story to those who can translate it. The possible uses for government data far exceed what even the best government agencies can devise on their own. By making such data public, governments can tap the power of vast networks of capable groups and individuals to create public value.
Among the scores of start-ups built around the mountain of open government data is New York City-based Enigma. Originally, the company’s founders planned to build a currency trading platform. Toward that end, they started digging deeply into data from sources such as the World Bank, the Securities and Exchange Commission (SEC), the United States Agency for International Development (USAID), and the Export-Import Bank.
While masses of public data were available free of charge, they found that it took an incredible amount of time to acquire and manage that information. “I realized that the opportunity was no longer in trading but in providing services around the data itself,” explains Enigma co-founder Hicham Oudghiri.12
It didn’t take the founders long to drop the trading platform altogether in favor of something far more audacious. “Our goal is to become ‘the’ search and discovery platform for public data,” says CEO Jeremy Bronfman.13 A little over a year later, Enigma has brought more than 100,000 public data sets into its database. “We aim to get it all,” Bronfman says.
Enigma is just one of scores of new companies trying to convert government data into a successful business model. Energy.datamarket.com is transforming more than 10,000 open energy data sets, from sources such as the US Bureau of Transportation Statistics and the World Bank, into useful intelligence for energy companies. Hospital Register’s massive database provides access to hospital system data from 24 countries.
Helping to enable such business models are organizations such as the Sunlight Foundation and Transparency International, which have pushed governments to provide data online. At least 16 national governments have major open data initiatives. From Australia to Kenya, from Denmark to Canada, open data projects are under way at all levels of government.
To make government data more widely available in the United States, on his first day in office President Barack Obama signed the Memorandum on Transparency and Open Government. The memorandum ordered federal agencies to provide their mountains of data to the public through open application programming interfaces (APIs). An open API shares data in a format that any programmer can use and develop, paving the way for dynamic enterprises that organize public data for social good. “A new generation doesn’t see government as a problem of ossified institutions, but as a problem of collective action,” says Jennifer Pahlka, founder of Code for America.
Pahlka calls her organization a “Peace Corps for geeks.” It hires midcareer software developers and embeds them with city governments, where they use their creative skills in partnerships with city managers. One Code for America fellow in Boston noticed that homeowners shoveled snow from their sidewalks but left fire hydrants buried. This led to Boston’s “Adopt-a-Fire-Hydrant” app, which lets citizens commit to clearing snow from a particular hydrant so that it stays accessible to the fire department. Because Code for America’s programs are open source, other cities have adapted the app: Honolulu uses a version of “Adopt-a-Fire-Hydrant” to get citizens checking batteries on its tsunami warning system, Seattle uses one to get them clearing storm drains, and Chicago uses one to organize volunteer snow shoveling. At least five other cities are investigating uses for the app.
“This suggests how government could work better,” says Pahlka. “Not more like a private company, not more like a tech company, but more like the Internet itself. That means permission-less, open, and generative.”14
In 2009, the UK government made its first public sector information assets available as open data. At first, the initiative gave citizens and the media the chance to uncover poor performance and behavior. And while transparency remains a fundamental policy aim, open government data also has an important role to play in the British economy. It creates opportunities for entrepreneurs and data innovators to build new businesses and business models, and it also allows established businesses to add rich context to their existing proprietary data sets, strengthening decision making; uncovering cost savings; and enhancing profitability, customer experience, and consumer choice.
Deloitte UK is collaborating with Professor Nigel Shadbolt and the newly launched Open Data Institute on a program of research focused on business demand for open data. While many businesses have become hooked on “big data,” many are unaware of the potential impact and benefit of open data. Deloitte’s report, Open Data: Driving Growth, Ingenuity and Innovation, hypothesized that, despite the market being relatively immature, the quantity and quality of open data in the UK had reached the critical mass necessary to trigger a step-change in attitudes. Businesses in all industries can now find relevant open data and use it to improve their products and services. As a result, new business models are beginning to emerge: suppliers, aggregators, developers, enrichers, and enablers. And new businesses, like Placr, ELGIN, Locatable, and Mastodon C, are delivering new products and services predicated on the insight they derive from open data.
In Open Growth: Stimulating Demand for Open Data in the UK, we investigated the supply of and demand for open data as a first step in estimating its economic impact. Our research was based on statistics for more than 37,000 data sets from three of the largest official open data portals in the UK. The evidence suggested that consumer-driven sectors of the economy, such as real estate and retail, would benefit most from data of relevance to choices individuals make as part of their day-to-day lives. We also conducted the first ever UK-wide assessment of the market for public sector information, in conjunction with the Shakespeare Review, which is making recommendations to the UK government. Our research calculated the economic and wider social value of public sector information to the UK economy to be approximately £7.2 billion ($11 billion).
—Harvey Lewis and Haris Irshad, Deloitte United Kingdom
Ever wonder how someone breaks into television? Netflix certainly did. When company officials decided it was time for Netflix to start offering original content, they boosted their chance of success by using their most powerful tool: customer data. Combing that data to discover which producers, actors, and shows its viewers liked most, Netflix used the results to choose its first production: House of Cards.15 That series became the most popular show Netflix had ever offered. Netflix developed a successful new business strategy by using data to get into its customers’ heads. Governments are also starting to realize the transformative power data can have in better serving their citizen-customers.
The public sector is one of the largest and most diverse customer segments in the data economy. From traffic patterns to web search trends, from demographics to statistics on student achievement, governments need data of all kinds, and they spend a great deal of time and money collecting it. Unfortunately, those efforts are labor-intensive and involve massive duplication. They also tend to focus on taking data snapshots rather than tracking conditions as they evolve over time.
But now that the commercial market offers accurate consumer data in near-real time, and technology has emerged to perform sophisticated data analytics, more and more governments are likely to explore the benefits of outsourcing some of their data collection so they can concentrate instead on data analysis. Working with reliable third-party data services, government agencies may increasingly look to reduce the security and liability-related risks associated with collecting and storing data across multiple agencies.
Some government agencies are already moving in this direction. For example, the US Census Bureau buys commercial data for address verification. And to track local developments and monitor online gang activity, police departments increasingly subscribe to the Twitter firehose, gaining full access to all Twitter content as it’s published.
Transportation agencies are some of the biggest consumers of third-party data. The Virginia Department of Transportation (VDOT), for instance, uses traffic data obtained from TomTom—a vendor of GPS navigation systems for consumers—to predict traffic jams on the I-95/I-64 corridor.16 By accessing data from millions of cars in the region, VDOT quickly gets the information it needs without deploying roadside sensors.
Government becoming a bigger consumer of third-party data carries both potential risks and benefits. The risks are obvious: breaches of privacy and deep distrust among citizens about how governments may use commercial data. Complex and impenetrable privacy and user agreements, along with a history of data breaches in both the private and public sectors, have contributed to this uneasiness. Citizens lack a clear picture of what is being collected about them, by whom, or to what end. As privacy norms and practices are codified, government will be responsible for implementing a consensus privacy infrastructure, not just as a regulator, but also as a market participant.
With all the challenges, it’s easy to lose sight of the substantial benefits that also exist. One of the biggest advantages governments can gain from commercial data is having a better picture of trends among target populations. That’s what the US Department of Health and Human Services (HHS) gained in 2012 when it launched a competition to find an efficient way to monitor emerging health trends through social media.17 The winner, a start-up firm called Social Health Insights, LLC, produced a web-based tool called MappyHealth that predicts and monitors disease trends by analyzing tweets in real time.18
MappyHealth analyzes groups of 1,000 or more tweets on the same topic and in the same area—based on keyword matching and location data—in aggregate.19 The US Centers for Disease Control and Prevention (CDC) now incorporates MappyHealth data with other real-time health data, such as Google Flu Trends, to better track and predict the spread of disease.20
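To make the idea of aggregate analysis concrete, here is a minimal sketch of keyword-and-location grouping with a minimum group size. It is not MappyHealth's actual code: the keyword lists, the field names `text` and `area`, and the function names are all hypothetical, and the word matching is deliberately simplistic (it ignores punctuation and misspellings).

```python
from collections import Counter

# Hypothetical keyword lists; the real tool's vocabulary is not described here.
TOPIC_KEYWORDS = {
    "flu": {"flu", "influenza", "fever"},
    "allergy": {"allergy", "allergies", "pollen"},
}


def match_topics(text):
    """Return the topics whose keywords appear as whole words in the text."""
    words = set(text.lower().split())
    return {topic for topic, keywords in TOPIC_KEYWORDS.items() if words & keywords}


def aggregate(tweets, min_group_size=1000):
    """Count tweets per (topic, area) and keep only sufficiently large groups.

    Each tweet is assumed to be a dict with "text" and "area" keys.
    Groups below the threshold are dropped, so no small set of
    individuals is reported on its own.
    """
    counts = Counter()
    for tweet in tweets:
        for topic in match_topics(tweet["text"]):
            counts[(topic, tweet["area"])] += 1
    return {key: n for key, n in counts.items() if n >= min_group_size}
```

The threshold is the point of the design: only counts for large groups ever leave the function, never individual messages.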
It’s not hard to imagine dozens of similar uses of commercial data. For instance, one could combine census data with data on consumer shopping trends to analyze public health issues, such as nutrition and obesity, and then connect the results to other health and social issues, such as localized infant mortality rates or high school dropout rates. The trick will be to strip out the personally identifiable information (PII) in order to protect privacy. While private firms use personal data for marketing and other purposes, government is often prohibited from collecting and using such information—personal Twitter feeds, for example—especially if it’s deemed sensitive under state or federal law.
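As a rough illustration of what stripping out PII might look like before data sets are combined, the sketch below drops directly identifying fields and reports only area-level averages. The field names (`name`, `email`, `zip_code`, and so on) and the functions are hypothetical, and removing direct identifiers alone does not guarantee anonymity, since remaining fields can sometimes be combined to re-identify a person.

```python
# Hypothetical names of directly identifying fields.
PII_FIELDS = {"name", "email", "street_address", "phone"}


def strip_pii(record):
    """Return a copy of the record without directly identifying fields."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}


def average_by_area(records, value_field, area_field="zip_code"):
    """Average a numeric field per area, using only PII-stripped records."""
    totals = {}
    for record in records:
        clean = strip_pii(record)
        area = clean[area_field]
        total, count = totals.get(area, (0.0, 0))
        totals[area] = (total + clean[value_field], count + 1)
    return {area: total / count for area, (total, count) in totals.items()}
```

Area-level results of this kind could then be set beside census figures for the same areas without any individual shopper's details changing hands.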
Besides tapping commercial data to achieve new goals, governments might also use such data to augment—or even replace—some of their traditional data gathering activities. For instance, the US government spent $13 billion on the 2010 US Census.21 That included salaries paid to more than 565,000 temporary workers who conducted in-person interviews with millions of households. But data aggregators such as Acxiom, ChoicePoint, and Rapleaf already offer much of the information the Census Bureau needs, including demographic, lifestyle, financial, and other personal data on individual households, not to mention lists that match individuals to address, age, gender, household makeup, country of origin, and race.22
To be sure, enlisting a private aggregator to help with a census count could draw challenges focused on everything from privacy and legal constraints to issues of transparency, data accuracy, and reliability (not to mention some public uneasiness with the data aggregation industry itself). Nevertheless, as these databases become larger and more reliable (Acxiom’s database includes 126 million households and 190 million individuals in the United States alone23), it is worth exploring whether governments could conduct a more efficient and less costly census by tapping into third-party data. If nothing else, such data might be used to prefill census forms that citizens would edit as necessary.
Government agencies don’t always have to look to the private sector to save money on data. Much of the information that agencies need already resides on the servers of other government agencies. Better sharing among agencies could reduce costs by eliminating redundant data collection. A recent study by the London-based think tank Policy Exchange concluded that the United Kingdom could save more than $56 billion a year by making better use of personal data that citizens had already volunteered to various government agencies.24 By promoting better inter-agency information sharing, the UK could eliminate the national census, according to the study, creating approximately $800 in savings per citizen per year and troubling the population with far fewer data requests.25
By seeking out more opportunities to collect data from alternative sources, governments can use their prominent place in the market to stimulate innovation and to promote better privacy and security standards in the global data exchange.
Figure 1. Data aggregators could save the US government time and money
Data aggregators have already compiled public records, consumer transactions, and social media exhaust into databases of 200 million or more names. Most of this census form could be filled out from these databases.
Lastly, government acts as a facilitator of the data economy and does so in three distinct ways: by creating parameters, providing platforms and infrastructure for data exchange, and leading from the front.
Building infrastructure. Governments can also provide platforms to foster thriving data markets. The most audacious example of such an initiative is found in India, where the government has embarked on the largest identity management project in history. Known as Aadhaar—India’s Unique Identification (UID) program—it brings to life the concept of personal data as currency by creating a unique set of biometric and demographic data points for each one of India’s 1.21 billion living citizens. The government, and potentially businesses, will be able to use the resulting database in innumerable ways, from building lender confidence to extending microfinance to remote areas to introducing new personalized health care services.26
Leading from the front. The modern data economy often looks like a “Wild West” digital environment where commercial ingenuity, rapidly changing technology, and a dearth of regulation leave many uneasy about the future. Government can help bring order and direction to this market by leading from the front—providing an example to guide other actors in the data economy. Opening as much data as possible to the public is one way to provide leadership. Another is for governments and international organizations to make data the foundation for a new kind of data philanthropy—persuading private companies with large troves of big data to donate data sets for social good—a movement that has already begun.27
As a major producer and consumer of data, and as a key player in efforts to protect personal privacy, government already occupies a crucial role in the new data economy. Recent events highlight sensitivities about which types of private sector data are collected and how they are acquired, yet the opportunities for data as a currency exist well beyond the areas of controversy. Government initiatives will likely become even more important as the data marketplace continues to evolve. Government can use public data to help foster new commercial opportunities; use commercial data to perform its own work more efficiently and effectively; and combine public and commercial data to serve the public in ways still to be conceived. As a new form of currency, data offers the promise of new wealth for the private and public sectors alike.