Effective Site Reliability Engineering (SRE) Requires an Observability Strategy
Site reliability is a growing concern in the marketplace. A key priority is to make sure you have a sound observability strategy, because observability empowers SREs and strengthens production resilience. Whether your IT infrastructure is hosted on-premises, in cloud, or hybrid, a robust observability capability that provides a 360-degree view of the health of your applications, databases, and infrastructure is key to avoiding major issues and making informed decisions. Your SRE journey must include a sound observability road map.
Cloud migration is forcing enterprises to reevaluate their site reliability practices and investments. Reliability and resilience are more difficult in cloud, and many enterprises have realized that their lift-and-shift cloud migration tactics have hampered their ability to sustain—much less improve—their total availability metrics and performance objectives.
Trends that are forcing organizations to enhance their site reliability include:
Why are monitoring and observability core to effective reliability and resilience engineering?
The traditional way of using software metrics and monitoring is not sufficient in today's computing environment. This approach is reactive; it may have served the industry well in the past, but modern systems demand a better method.
Observability tools were born out of sheer necessity when traditional tools and debugging methods could not identify what software did in production.
Observability is a measure of how well the internal states of a system can be inferred from its external outputs. For a software application to be observable, it must emit enough data and information for that inference to be possible.
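To make the idea of inferring internal state from external outputs concrete, here is a minimal Python sketch. It is an illustration only: the handler name `handle_request`, the event fields, and the in-memory `sink` list are hypothetical, and a real system would send events to a logging or telemetry backend rather than a list.

```python
import json
import time


def handle_request(order_id, fail=False, sink=None):
    """Hypothetical handler that emits one structured event per call."""
    start = time.perf_counter()
    status, error = "ok", None
    try:
        if fail:
            # Stand-in for a real failure inside the service.
            raise RuntimeError("simulated downstream failure")
    except RuntimeError as exc:
        status, error = "error", str(exc)

    # The event is the external output: it records what happened inside.
    event = {
        "event": "handle_request",
        "order_id": order_id,
        "status": status,
        "error": error,
        "duration_ms": round((time.perf_counter() - start) * 1000, 3),
    }
    if sink is not None:
        sink.append(json.dumps(event))
    return status


def error_rate(lines):
    """Infer the failure rate purely from the emitted events."""
    events = [json.loads(line) for line in lines]
    if not events:
        return 0.0
    errors = sum(1 for e in events if e["status"] == "error")
    return errors / len(events)
```

Calling `handle_request` a few times with a shared list as `sink` and then passing that list to `error_rate` shows the point: the failure rate and the failure reason are recovered from the outputs alone, without attaching a debugger to the running code.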
Observability requires the collection and analysis of the following data from your software:
What does observability deliver for SRE?
Effective observability can help to identify some of these issues:
We see our clients making observability the starting point in their SRE journey. The essential components of effective observability include:
Practicing observability and SRE together improves reliability. Your observability system can expose what is happening with software running in the environment and show your SREs whether they are meeting and improving overall service level objectives.
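One common way to connect observability data to a service level objective is an error budget. The Python sketch below assumes a simple request-based availability objective; the function name, parameters, and example numbers are hypothetical and are not taken from any specific tool or client engagement.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Return the unspent fraction of the error budget.

    slo_target is the availability objective as a fraction (e.g. 0.999).
    A negative result means the budget has been overspent.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # Failures the objective tolerates over this window.
    allowed_failures = (1 - slo_target) * total_requests
    return 1 - failed_requests / allowed_failures


def availability(total_requests, failed_requests):
    """Measured availability: the share of requests that succeeded."""
    return 1 - failed_requests / total_requests
```

For example, with a 99.9% objective, 1,000,000 requests, and 250 failures, roughly three quarters of the error budget remains. A team could watch how quickly that number falls to decide whether to prioritize reliability work over new releases.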
As the chief cloud strategy officer for Deloitte Consulting LLP, David is responsible for building innovative technologies that help clients operate more efficiently while delivering strategies that enable them to disrupt their markets. David is widely respected as a visionary in cloud computing—he was recently named the number one cloud influencer in a report by Apollo Research. For more than 20 years, he has inspired corporations and start-ups to innovate and use resources more productively. As the author of more than 13 books and 5,000 articles, David’s thought leadership has appeared in InfoWorld, Wall Street Journal, Forbes, NPR, Gigaom, and Lynda.com. Prior to joining Deloitte, David served as senior vice president at Cloud Technology Partners, where he grew the practice into a major force in the cloud computing market. Previously, he led Blue Mountain Labs, helping organizations find value in cloud and other emerging technologies. He is a graduate of George Mason University.