Companies are increasingly seeking better insights by tapping into third-party data. Outside data can bring lots of opportunity, but using it effectively can be challenging.
Mastering data analytics is how companies avoid flying blind. Increasingly, this requires tapping into data from outside an organization's four walls. Growing numbers of companies are doing so in pursuit of an analytical edge. But using third-party data sources effectively can be very challenging. To boost the business value of their companies’ analytics efforts, leaders can adopt key practices to navigate the complexity of third-party data.
Companies know they can gain valuable insights by analyzing the data they generate from their operations. But internally generated information can leave gaps, and companies are increasingly moving to incorporate new, nontraditional, and external sources of data into their analyses. This data can include almost anything, from historical demographic and weather data to satellite imagery and private company information.
Companies increasingly operate as part of networks consisting of business partners such as suppliers, resellers, channel partners, regulators, and other stakeholders. These networks are often globally distributed and potentially affected by economic, political, and/or environmental factors. Analyzing external data can help companies see risks and opportunities that they would miss with inputs limited to data generated from internal operations, customers, and first-tier suppliers. It can also illuminate how factors such as shifting consumer behaviors, competitor initiatives, or geopolitical events can affect a business.
As most business and technology professionals know, the volume of data being created, shared, and stored is increasing at an exponential pace. According to one study, the data stored in data centers will nearly quintuple to reach 1.3 zettabytes globally by 2021.6 (One zettabyte is equivalent to one trillion gigabytes.) As the volume of available data grows, so does the potential value of analyzing it.
It’s not surprising, then, that companies on the leading edge of data and analytics are more likely to make use of external data. An MIT Sloan Management Review report published last year found that the companies making the most innovative use of data and analytics were more likely than others to leverage more external data sources, including social, mobile, and publicly available data.7 A different study found that faster-growing companies were more likely to be planning to expand their ability to source external data than companies with lower growth rates.8
External data sources are helping businesses personalize marketing offers, improve HR decisions, gain new revenue streams by launching new products or services, enhance risk visibility and mitigation, and better anticipate shifts in demand for their products and services. For instance, a major semiconductor manufacturer used third-party data to build models that could predict the best types of customers to target in marketing campaigns. This external data helped train the models to identify potential targets that fit similar profiles to the company’s most engaged customers. These “lookalike” models helped the organization optimize marketing spend and reduced a major campaign’s cost-per-engagement by 75 percent.9
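To make the "lookalike" idea concrete, here is a minimal sketch in Python. It is not the manufacturer's method, which the source does not describe. It assumes a much simpler technique: average the feature vectors of the most engaged customers into a single profile, then rank prospects by cosine similarity to that profile. The feature names, prospect labels, and values are hypothetical, and the features are assumed to be already scaled to a comparable range.

```python
import math


def centroid(rows):
    """Average each feature column across a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_lookalikes(engaged, prospects):
    """Return (name, score) pairs, most similar to the engaged profile first."""
    profile = centroid(engaged)
    scored = [(name, cosine_similarity(vec, profile))
              for name, vec in prospects.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical features: [company size, R&D intensity, web engagement]
engaged_customers = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6]]
prospects = {
    "prospect_a": [0.85, 0.8, 0.7],  # third-party data resembling engaged customers
    "prospect_b": [0.1, 0.2, 0.9],
    "prospect_c": [0.5, 0.5, 0.5],
}

ranking = rank_lookalikes(engaged_customers, prospects)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice a team would more likely train a classifier on many more attributes, but the ranking step illustrates why external data matters here: prospects can only be scored on attributes the company can actually observe about them.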
There are numerous other examples of analytics programs generating value with external data. Several startups monitor social networking data to try to predict patterns of external job-seeking behavior and retention risk; they claim their data is more predictive of an employee’s likelihood of leaving than any internal data available. Others use geolocation and weather data to predict crop yields, helping farmers optimize their use of fertilizer. Retailers are using economic data and forecasts, data from suppliers, and geolocation data to better predict demand and reduce stockouts. Some companies are using satellite imagery to estimate mall traffic and predict retail sales; others are using aerial imagery to estimate oil inventories to better underwrite loans to refiners. (For select examples, see figure 1.)
Access to external data is getting easier in some ways, but it can still be daunting. Organizations report a wide variety of business and technical challenges in deriving insights from external data.15 (Figure 2 summarizes some of these challenges.) Among the business challenges are the size and complexity of the data-provider market, which can make it hard to identify the right data sources and partners. Negotiating acquisition of data can be arduous, depending on factors such as whether ongoing access to data is needed for refreshing machine learning models, usage restrictions, whether the vendor wants a share of revenue gained from the data, and liability if the data proves to be inaccurate or tainted. This process can involve lengthy risk and legal reviews of vendor contracts and licensing agreements. The ongoing management of a growing roster of data-sharing relationships and partnerships can be taxing as well.
The technical challenges include fundamentals such as assessing data quality and accuracy: A variety of studies have demonstrated that third-party data can be riddled with inaccuracies.16 There can also be inconsistencies between external and internal data to resolve before performing an analysis. Preprocessing data, such as cleansing and formatting it for analysis, is time-consuming. Some estimates suggest that this can account for 80 percent of the effort in data analysis projects. And securely storing and cataloging data in an easily accessible manner can require updating information management processes and capabilities designed to handle only internal data. The longer it takes to work through these challenges, the less time is available to react to market trends and external events with agility.
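As an illustration of the quality and consistency checks described above, the following Python sketch computes two simple indicators for a vendor data set: the share of vendor records that match an internal account after name normalization, and the share with missing required fields. The field names, legal-suffix list, and sample records are assumptions made for the example, not taken from any particular provider, and real matching usually needs far more robust entity resolution.

```python
LEGAL_SUFFIXES = {"inc", "corp", "ltd", "llc", "co"}  # assumed, not exhaustive


def normalize_name(name):
    """Lowercase, drop punctuation and common legal suffixes for comparison."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def quality_report(internal, external, required_fields):
    """Summarize how well external records match and how complete they are."""
    internal_keys = {normalize_name(r["name"]) for r in internal}
    matched = 0
    incomplete = 0
    for record in external:
        if normalize_name(record["name"]) in internal_keys:
            matched += 1
        # Treat None and empty strings as missing values.
        if any(record.get(field) in (None, "") for field in required_fields):
            incomplete += 1
    total = len(external)
    return {
        "match_rate": matched / total if total else 0.0,
        "incomplete_rate": incomplete / total if total else 0.0,
    }


# Hypothetical internal accounts and vendor-supplied records
internal_accounts = [{"name": "Acme Corp."}, {"name": "Globex, Inc."}]
vendor_records = [
    {"name": "ACME CORP", "revenue": 120, "region": "EMEA"},
    {"name": "Globex Inc", "revenue": None, "region": "APAC"},
    {"name": "Initech LLC", "revenue": 45, "region": ""},
    {"name": "Umbrella Ltd", "revenue": 80, "region": "AMER"},
]

report = quality_report(internal_accounts, vendor_records, ["revenue", "region"])
print(report)
```

Running checks like these before a purchase or renewal gives reviewers a consistent, comparable basis for judging candidate data sources.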
Research suggests that most companies haven’t yet developed the capabilities necessary to use external data effectively. To close this gap, companies may find it helpful to think of themselves as participants in a data ecosystem, which some have defined as a network of actors that directly or indirectly consume, produce, or provide data and other related resources.17
To be good at using external data means being competent in identifying, evaluating, procuring, and preparing external data in a consistent and timely manner. Companies will need a continuous process for identifying, engaging with, and evaluating new external data sources and partners and, when appropriate, integrating these data sources into analytics processes or product offerings. Maximizing the value of external data often requires integrating it with internal data for a more insightful analysis.
Companies may find it valuable to form a cross-functional group that acts as the organization’s interface to the broader data ecosystem. This group could draw on competencies from multiple areas—including product management, business analysis, data science, legal, and procurement—to address both the organizational and technical challenges mentioned above. Some organizations have created special roles charged with scanning the third-party data market and matching business requests with relevant sources, a role Gartner has described as “data curator.”18 Curators can help companies quickly identify and assess data sources matched to business needs, while reviewing external data sets for quality and accuracy using consistent evaluation processes. Companies that have an effective tech-scouting capability may look at that group for inspiration.
Organizations looking to connect to a data ecosystem can turn to a wide and growing variety of data and insights providers. Gartner, for instance, categorizes data services by the level of insight they provide:19
Simple data services. Data brokers collect data from multiple sources and offer it in collected and conditioned form. The data is used as additional input to a decision process by a person, an application system, or a device in an IoT ecosystem.
Smart data services. Data is enhanced by applying analytical rules and calculations. The results often take the form of scores or the tagging of objects, as in services from marketing data providers and credit ratings agencies.
Adaptive data services. Customers submit data pertaining to specific analytical requests. Providers combine that data with data from other sources.
There are other ways to segment this dynamic marketplace as well. For instance, some providers specialize in serving industry sectors such as hedge funds or health care providers. In addition, consulting and systems integration services providers are reacting to client demand for new insights from external data by leveraging publicly available data or information from third-party data partners. These providers then integrate those sources with clients’ internal data and perform custom analysis.
Companies are making growing use of data from third parties. To get more value from their data analytics efforts, companies should consider enhancing their ability to identify, evaluate, and contract for new external data through a data ecosystem management program under a chief data office that links to business, IT, and legal teams. The pressure on companies to innovate and to improve the efficiency and effectiveness of their operations is unrelenting; they cannot afford to relent in their pursuit of insights to help them do so. For many companies, effective use of external data is a critical new frontier.