Where site reliability engineering meets devops

Site reliability engineering brings agile methodology to operations. Clarify the responsibilities of the SRE and devops roles to keep things running smoothly.

Back in the days before cloud applications, devops practices, test automation, and site reliability engineers, we had developers, testers, and system administrators developing and supporting Web and mobile applications. Developers followed agile methodologies, whereas system administrators often adopted ITIL’s incident management and other practices.

We had fewer tools to automate testing, deployment, and infrastructure in those days, so getting from code-complete to production-ready involved a great deal of toil. Monitoring production infrastructure and applications and discovering the root causes of production issues required both craft and skill because operational data, monitoring tools, and support workflows did not easily integrate.

In many ways, developing, testing, and supporting applications is somewhat easier today, but the terminology, role definitions, and practical responsibilities are much harder to decipher and apply. Is site reliability engineering part of devops or a complementary service? Who is responsible for implementing CI/CD (continuous integration/continuous delivery) pipelines and infrastructure as code? When there is a production incident, what’s the most efficient process to resolve the issue, discover the root cause, and implement the optimal remediations?

Google’s culture and practices may not work at your organization

The answers cannot be universally applied because of differences in company size, scale, and complexity. What works for a startup with a few dozen engineers does not work for geographically dispersed enterprises operating in regulated industries. Similarly, the culture, practices, and technologies that work well for large-scale technology companies such as Google, Netflix, or Microsoft are often not achievable in other industries or businesses with more legacy systems and technical debt.
