TLA 2013: IT Management
"In October 2010, I walked out of what had turned into a particularly gruesome war room. We had just launched a project that we called Edmunds 2.0, which basically involved re-architecting our application around SOA principles, and we had found some problems in the production runway. We had been in the war room for days, and it was starting to smell. As I left the room that day I thought to myself, 'There has to be a better way,'" recalls Martin.
That better way is devops, which Martin spent the next three years building with the business leadership, development teams, and IT operations teams at the automobile information site Edmunds.com.
Before the devops approach was implemented, ops wasn't involved in a project until just before it was due to go live. "This meant that we often ended up with software that dev had spent months working on that wasn't going to fly in production. Ops would double the infrastructure and it would work OK, but that wasn't really good enough," Martin recalls.
"Worse, new features and apps from development often don't work as advertised, and our solution was typically to throw more infrastructure at the problem. This is not only expensive, but time-consuming," he says. There was no automation in place to help with the necessary testing and deployment work. As a result, dev and ops spent a lot of time fighting in war rooms rather than delivering working systems.
"What we needed was better alignment of dev and ops -- in short, devops. It wasn't difficult to justify this project to our leadership, but it was difficult to effect change in an organization where dev and ops are basically pitted against each other: Dev wants to release lots of features very quickly, and ops wants to minimize change to keep the app stable," Martin says.
Martin addressed that challenge by creating and leading an intermediary team to unify the tooling and processes that dev and ops use: Chef for configuration management and application deployment, AppDynamics and Splunk for application monitoring, and common QA tools across both groups.
Although the use of an intermediary wasn't an ideal approach, Martin didn't believe that his older organization could handle more drastic changes. "As usual, the technology was the simple part. Changing the culture turned out to be far more difficult. The only way to improve the performance of our site while meeting business requirements for new features was to start entering each other's worlds. Ops needs to be involved in new initiatives to provide guidance from a performance perspective, and development needs to be responsible for deploying and maintaining their code in production. Getting to this point would require everyone in the organization to start thinking differently about their roles, and to start taking on tasks that weren't in their job description," he says.
Martin decided he had to show it was a joint effort by making a developer his partner in the devops experiment, even giving him production-level access. That made Martin nervous, but he knew he had to walk the talk if he expected others to do the same. "The best way to effect cultural change is to find your champion on the other side and to bring them into your world, and this is what we've been doing -- with great results."
From a productivity perspective, Edmunds.com spends significantly less time finding and troubleshooting performance issues in production than it did before. It spends about two fewer hours per week fixing problems, and estimates it has saved about $1.2 million through improved uptime and increased productivity by aligning dev and ops and unifying its tool set.