Posted: 30 Aug. 2022 13 min. read

Behind the crystal ball: Realizing value from predictive retention projects

Key challenges organizations face to build retention models

By Eric Lesser and Gary Parilis

Anyone who has had their flight delayed, a medical appointment postponed, or a long wait on hold for a customer service representative is aware of the growing talent shortages facing many industries today. These shortfalls are resulting in a host of important consequences, ranging from irritated loyal customers to impeded supply chains to reduced top-line revenue. Recently, we have seen many of our clients express significant interest in developing predictive models to anticipate and address employee attrition and look for ways to stay ahead of their competitors through more effective talent retention.

When we work with clients on these types of projects, among the typical questions they are looking to address are:

  • To what extent is my organization at risk for attrition within key segments of our workforce now and in the future?
  • What are the factors that are influencing attrition/retention?
  • How can I identify and prioritize interventions that could reduce the potential turnover within key workforce segments?
  • How can I determine which interventions will pay off and how much will I save?

Human resources (HR) executives are looking to predictive analytics and machine learning algorithms to address these and similar questions. Increasingly, companies are investing in technologies to bring together disparate sources of workforce-related data to create insights and develop predictive attrition risk models.

However, we often see these organizations encountering challenges in designing and assembling these models and using them to act within their organization. An analysis that is effective and leads to initiatives resulting in business improvement requires careful consideration of a variety of factors and adherence to a number of practices. 

These considerations involve activities ranging from selecting, extracting, transforming, cleansing, and analyzing data to building and interpreting predictive models to translating these results into tangible guidance that can be used by people managers.

The concerns that we see fall into three overall categories:

  • Sense: Identifying the internal and external trends that are most relevant to the business to explore in greater detail
  • Analyze: Using the data to formulate insights and build predictive models and determine attrition drivers
  • Act: Making the results come to life


Sense

A successful attrition analytics project begins with a clear approach and project structure and adherence to critical steps that must occur early in the project life cycle:

  • Use external market/industry trends, stakeholder insights, and voice of the employee to identify priority issues and opportunities to explore. Many predictive retention efforts start with a “one size fits all” approach, where a standard set of factors is identified and applied to the model. While these variables are usually necessary, the true insights often lie in data that is specific to the position, the organization, the overall industry, and the local labor market. As an example, for some positions, perceptions of the impact of one’s job and the organization’s sense of purpose play a critical role in an individual’s desire to stay with an organization; for others, access to advancement may be an important driver. The local economy and job market also play an important role, not just in the number of employees at risk, but also in the factors that lead them to leave. Organizations should invest significant time up-front reviewing existing external labor market trends and results from employee listening programs to identify those variables that are likely to have the most impact in the model.
  • Focus on data that is both actionable and relevant to addressing the problem, not just that which is easy to obtain. Organizations will often rely solely on engagement surveys or only use workforce data (such as demographics and compensation) that is easily accessed through a traditional human capital management platform. While these are fundamental inputs, the data that will truly provide insight often resides within other systems. For example, work-life balance and burnout insights can come from obtaining shift behavior and overtime patterns in labor scheduling systems, nights away from home from travel records, and after-hours activity from calendars. Quality and safety systems can indicate potential areas where employees may be under duress. Learning systems provide insights on development opportunities taken advantage of, and performance management and skills tracking systems can help to assess job fit. Further, including non-HR data, such as financial results, can help assess the impact of attrition and remediation on business performance. In any situation, retention projects should develop a set of up-front hypotheses, which then point to the data that should be collected to test its relevance in making predictions.
  • Secure stakeholder commitment from the top down. Retention analytics may seem to be a straightforward analytic exercise, but it requires commitment from senior leadership to address the resource and governance issues that often arise during the project. First, senior leaders need to agree on the scope of the project and the resources to support it. These resources often must come from different departments within HR (such as HR analytics and organizational development), but also require time and effort from individuals outside of HR. These areas include information technology (especially data owners), Legal and Risk, and people leaders responsible for the roles being examined. Ground rules for how the data will be obtained, applied, and safeguarded also must be agreed upon and communicated. Finally, there needs to be a commitment to act from multiple departments, as recommendations related to retention often cross traditional organizational boundaries.
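
As a rough illustration of the point about labor scheduling data, the Python sketch below derives two simple work-life indicators (overtime hours and night-shift counts) from shift records. The record layout, the 8-hour standard shift, and the 20:00 night-shift cutoff are all assumptions made for this example; a real scheduling system will have its own fields and rules.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical shift records: (employee_id, shift_start, shift_end).
SHIFTS = [
    ("e1", "2022-08-01T08:00", "2022-08-01T16:00"),
    ("e1", "2022-08-02T08:00", "2022-08-02T19:00"),
    ("e2", "2022-08-01T22:00", "2022-08-02T06:00"),
    ("e2", "2022-08-02T22:00", "2022-08-03T08:00"),
]

STANDARD_SHIFT_HOURS = 8.0  # assumed threshold beyond which hours count as overtime
NIGHT_START_HOUR = 20       # assumed: shifts starting at 20:00 or later are night shifts


def work_pattern_features(shifts):
    """Aggregate per-employee overtime hours and night-shift counts."""
    features = defaultdict(lambda: {"overtime_hours": 0.0, "night_shifts": 0})
    for employee_id, start_text, end_text in shifts:
        start = datetime.fromisoformat(start_text)
        end = datetime.fromisoformat(end_text)
        hours = (end - start).total_seconds() / 3600
        row = features[employee_id]
        row["overtime_hours"] += max(0.0, hours - STANDARD_SHIFT_HOURS)
        if start.hour >= NIGHT_START_HOUR:
            row["night_shifts"] += 1
    return dict(features)


features = work_pattern_features(SHIFTS)
for employee_id, row in features.items():
    print(employee_id, row)
```

Indicators like these only become model inputs once they are tied to an up-front hypothesis, for example that sustained overtime raises attrition risk in a given role.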


Analyze

A good deal of planning and strategic thinking is necessary before proceeding with the model building. Important practices we see at this step of the project include:

  • Provide adequate time in the schedule for up-front data acquisition activities. During retention projects, unexpected process and delivery challenges involving data acquisition and preparation are very common. Once the source systems and their owners have been identified, the data requirements must be communicated clearly to those providing the data. Instructions must include both the specific data elements and details of the formatting requirements. Miscommunicating these requirements often results in lost time and may lead to additional programming work. Time also needs to be built in after data delivery for the data science team to query the data stewards on nuances within the data. Up-front planning and diligence can help prevent these issues, but it is wise to plan conservatively on the project schedule, regardless.
  • Conduct a thorough data examination before moving on to modeling. Exploratory data analysis and visualization lead to a depth of understanding that is required to properly design a retention model. These tasks inform data engineering decisions and can reveal nonobvious relationships that might have a dramatic impact on model results. For example, there may be an optimal cadence of supervisory meetings to minimize turnover: too few meetings may indicate insufficient attention, and too many may reflect micromanagement. Applying a linear model to this variable would miss this “Goldilocks” relationship. Additionally, initial exploratory analysis may expose outliers or biases in the data that require additional transformations or, at the very least, additional communication with stakeholders. A collateral benefit of this initial data exploration is that it gives business leaders an early view of the results and can generate additional excitement among stakeholders.
  • Engage subject matter experts (SMEs) and stakeholders iteratively during the model development process. Data scientists are rarely domain experts and will typically communicate results at face value. Results may contradict what business stakeholders already know (or think they know). Periodic engagement with SMEs can stimulate deeper investigation to make sure there is not a modeling or data issue. A contradiction does not mean the model is wrong; in fact, we often hope the analysis will yield surprises. But when it does, those surprises must be validated before they are accepted as final. Also, interesting findings almost always lead SMEs to ask follow-up questions, leading to deeper, more meaningful analyses.
  • Maintain, refresh, and fine-tune models on an ongoing basis. A predictive model has great value at the outset: measuring risk, understanding drivers, and stimulating remedial action. But its utility diminishes over time, and models need a maintenance process to provide continuous impact on the business. This maintenance takes two main forms. First, current data should be ingested at regular intervals to refresh the output, both at an individual employee level and in aggregate. This will accommodate changes in each employee’s circumstances (e.g., schedule changes, pay raises, job changes, adjustments in colocation requirements), as well as external market factors. Second, the models themselves should be retrained periodically, as the relative importance of drivers is often not static (especially during a time of such social, economic, and political change).
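
To make the “Goldilocks” point about supervisory meetings concrete, here is a minimal Python sketch on synthetic, made-up data. It compares a single linear measure (the Pearson correlation between meeting cadence and attrition) with the attrition rate at each cadence. The numbers are invented purely to form a U-shape and do not come from any real workforce.

```python
from collections import defaultdict
from math import sqrt

# Synthetic data: 10 employees at each cadence (0 to 8 meetings per month);
# the count who left at each cadence is chosen to form a U-shape.
LEAVERS_PER_CADENCE = [6, 4, 2, 1, 1, 1, 2, 4, 6]
EMPLOYEES_PER_CADENCE = 10

records = []  # (meetings_per_month, left_company) pairs
for meetings, leavers in enumerate(LEAVERS_PER_CADENCE):
    for i in range(EMPLOYEES_PER_CADENCE):
        records.append((meetings, 1 if i < leavers else 0))


def pearson(pairs):
    """Pearson correlation: a measure of linear association only."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


def attrition_rate_by_value(pairs):
    """Attrition rate at each distinct value of the variable."""
    totals = defaultdict(lambda: [0, 0])  # value -> [leavers, headcount]
    for x, y in pairs:
        totals[x][0] += y
        totals[x][1] += 1
    return {x: left / count for x, (left, count) in sorted(totals.items())}


linear_signal = pearson(records)
rates = attrition_rate_by_value(records)

print(f"linear correlation: {linear_signal:.3f}")
for meetings, rate in rates.items():
    print(f"{meetings} meetings/month: {rate:.0%} attrition")
```

In this toy data the linear correlation is essentially zero even though attrition is six times higher at the extremes than in the middle. That is the kind of pattern that a per-value table or plot exposes during exploratory analysis and that a purely linear term would hide.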


Act

Once the predictive model is developed, additional work is needed to turn the insights into actions that can impact the business. Key steps at this stage include:

  • Develop stories that distill the key messages. Driving executives’ engagement and eagerness to act is a critical success factor. The best predictive model has no value unless it drives action, and action requires organizational buy-in, which typically starts at the top. This requires a focus on concise and persuasive messages based on credible results, validation from SMEs, and easily digested visualizations that make the findings come alive and suggest action.
  • Drive actions from business realities as well as the model output. While the quantified risks and drivers from the model are fundamental to deciding on actions, effective decision-making requires business judgment and analysis. Beyond measures of what factors affect retention, the feasibility and cost of initiatives must be considered. For example, improving managerial effectiveness may be a leading retention driver, but it may require costly interventions with uncertain outcomes. On the other hand, compensation may be a less influential driver, but is much simpler to address, and with more easily measurable costs. It is also important to consider the potential for unintended or adverse impacts of interventions on certain classes. For example, recommending a certain set of schools to recruit individuals from might conflict with an organization’s desire to attract a diverse pool of candidates.
  • Project investments and impacts of potential initiatives as an input to decision-making. One of the advantages of a predictive model is that it can make “what if” simulations possible, allowing decision-makers to project the reduction in attrition (and the corresponding cost savings) associated with a given combination of interventions. If the model indicates compensation, managerial turnover, and overtime hours are important drivers, simulations can be used to estimate the number of employees that would be saved with various degrees of intervention addressing combinations of these drivers. Consideration of the cost, feasibility, and effectiveness of those interventions will make decisions more effective.
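
The “what if” simulation described above can be sketched in a few lines of Python. Everything specific here is hypothetical: the logistic form, the coefficients for pay gap, overtime, and managerial turnover, the three example employees, and the replacement cost per departure are stand-ins for whatever a fitted model and the organization's own cost estimates would supply.

```python
from math import exp

# Hypothetical coefficients standing in for a fitted attrition model.
INTERCEPT = -2.0
COEFFICIENTS = {"pay_gap_pct": 0.08, "overtime_hours": 0.05, "manager_changes": 0.40}
REPLACEMENT_COST = 30_000  # assumed cost of one departure

# Hypothetical employees described by the three drivers.
EMPLOYEES = [
    {"pay_gap_pct": 10.0, "overtime_hours": 20.0, "manager_changes": 2},
    {"pay_gap_pct": 0.0, "overtime_hours": 5.0, "manager_changes": 0},
    {"pay_gap_pct": 5.0, "overtime_hours": 30.0, "manager_changes": 1},
]


def attrition_probability(employee):
    """Logistic score -> probability that the employee leaves."""
    score = INTERCEPT + sum(COEFFICIENTS[k] * employee[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + exp(-score))


def expected_departures(employees):
    """Expected number of leavers is the sum of individual probabilities."""
    return sum(attrition_probability(e) for e in employees)


def apply_intervention(employee, pay_gap_closed_pct=0.0, overtime_cut_hours=0.0):
    """Return a copy of the employee with the intervention applied."""
    changed = dict(employee)
    changed["pay_gap_pct"] = max(0.0, employee["pay_gap_pct"] - pay_gap_closed_pct)
    changed["overtime_hours"] = max(0.0, employee["overtime_hours"] - overtime_cut_hours)
    return changed


def simulate(employees, **intervention):
    """Compare expected departures before and after an intervention."""
    baseline = expected_departures(employees)
    scenario = expected_departures(
        [apply_intervention(e, **intervention) for e in employees]
    )
    avoided = baseline - scenario
    return {
        "baseline": baseline,
        "scenario": scenario,
        "departures_avoided": avoided,
        "gross_savings": avoided * REPLACEMENT_COST,
    }


result = simulate(EMPLOYEES, pay_gap_closed_pct=5.0, overtime_cut_hours=10.0)
for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

The savings figure here is gross: it leaves out the cost of the intervention itself, which, along with feasibility and effectiveness, still has to be weighed before a decision is made.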

Next Steps

Predictive modeling can be a powerful tool to provide intelligence that can help manage the risks of turnover and associated costs. However, undertaking these types of projects requires more than an understanding of the mechanics of artificial intelligence and machine learning. It requires consideration of the context of the roles being examined, the questions that need to be addressed, and the ability to translate the findings into actionable outcomes. If your organization is looking to undertake such efforts, consider the following questions:

  • What roles or segments of the workforce do I want to focus on and why is retention important to the organization’s strategic direction?
  • Do I have a clear set of hypotheses that are guiding my data collection and analysis?
  • Have I identified the most relevant data and addressed the concerns of the owners of the data?
  • Have I set out a process of cleaning and transforming my data once I have collected it?
  • Do I have a process for engaging SMEs throughout the model design, development, and interpretation?
  • Have I developed a clear change management strategy at the beginning of the project that addresses stakeholder management and communications to ensure follow-through and action?

Today’s employers face historic challenges, with firms across all industries scrambling to fill critical workforce shortages, while struggling to retain the talent that is already onboard. Even seemingly thoughtful approaches toward understanding attrition drivers can be fraught with problems. Attrition analytics projects require diligent planning and grounding in a solid understanding of the market dynamics surrounding the organization, a thoughtful analytic process that takes advantage of the right data and expert involvement, and engaging communication of results that emphasizes a quantification of the outcomes projected from recommended actions. These factors can be the difference between a retention strategy that fails to act upon the true drivers of attrition, or even worsens results, and one that drives retention, saves money, and minimizes disruption in the workforce.

