Site Reliability Engineering (SRE) is a growing practice essential for enterprises to ensure service delivery, reliability, and access for users. Many companies only choose to invest in SRE when they have a raging operational fire on their hands. As a result, SREs often start out as firefighters, desperately trying to keep the service online for one more day. For other companies, investments in SRE are well underway as engineering leaders seek to optimize operations and support digital transformation initiatives.
To establish a baseline on the SRE operating model and reveal what companies are doing with SRE, DevOps Institute surveyed professionals in key engineering roles. The results, presented as the industry’s only global view of SRE adoption, appear in a new report, “The Global SRE Pulse 2022: The State of SRE Adoption, Deployment and Automation.”
The evolution of SRE
SRE was developed to bridge the gap between DevOps and IT. It is now a force multiplier for business. At the same time, IT professionals are influencing the direction of SRE.
Companies are adopting SRE because it improves application and service reliability for customers and business partners. In the study, respondents said the main reasons for adopting SRE are to reduce the risk of service failure and to improve the ability to compete.
SRE is critical to the nature of digital-first organizations, and other enterprises are adopting it to speed digital transformation. SRE is less about technology than about culture change driven by committed and passionate people. Like DevOps, SRE is about continuous improvement: experimenting, being transparent, and sharing results. There’s not one right way to do it, and there’s not just one type of SRE professional.
Jayne Groll, CEO of DevOps Institute, explains, “There are lots of different ways to adopt SRE into your business model or your IT model. There are new skills and new roles associated with SRE, and what we want to do is get the enterprise, in particular, really stoked.”
Companies are embracing SRE
The survey found that 62% of organizations are applying SRE in some capacity, and it uncovered many of the varied ways in which they are adopting it. Businesses are creating different structures and operating models by taking SRE’s core principles and applying them to suit their organizational needs.
It’s worth noting that enterprise leaders say they’re applying SRE principles like observability and monitoring to a large degree. An impressive 62% of organizations said they’re implementing observability tools and techniques. Overall, they’re adopting SRE by implementing automation with observability, monitoring, and incident response/performance management tools. In fact, 70% have a well-designed incident management process.
Practices such as tracking and managing toil are emerging as part of SRE best practice. According to the survey, about half (49%) of responding organizations said SREs dedicate time to reducing toil in some teams, 28% in several teams, 12% everywhere, and 11% not at all. Additionally, a high rate of postmortems and feedback indicates that teams are learning, growing their skills, and focusing on continuous improvement.
Opportunities expand for employees with SRE skills
The biggest SRE challenge for IT companies, according to those surveyed, is the “lack of staff with necessary skill sets.” Of the survey respondents, 85% rated it very or somewhat challenging.
Professionals in numerous IT roles may be able to advance into SRE. DevOps Institute’s Global Upskilling IT 2022 report says that site reliability engineering is among the top five job titles for recent hires.
Where SRE is heading
SRE is an important initiative, and advanced automation supports it. However, the technology is there to serve what is essentially a human initiative. The direction of SRE will be guided by the culture within an organization.
“Looking at the data from the survey, it's really interesting that it is considered a key operating model by the respondents, which is great,” Groll says. “SRE is an essential operating model, but how it's adopted could be different from organization to organization.”
It’s an exciting time to be involved in SRE and a good time to transition into it. The structure and guiding principles are the groundwork, while the IT people in an organization can decide where to take it and how to make it work for their company.
Read the whole report to see how your organization compares.
Review the report highlights in the accompanying infographic.