Docker turned two last week. The press was invited to the company's small, no-frills offices in San Francisco for lunch and a birthday party, but what struck me was who else was in attendance: Senior cloud execs from IBM and Microsoft, who were there to remind everyone they supported Docker on their respective cloud platforms.
I remember the days when big companies would invite little ones to events as a sort of official blessing. But here, the big guys were basking in the glow of the hippest enterprise startup around, as if to demonstrate they knew what was happening.
What's happening is an unbelievable uptake among developers, with over 100 million downloads of Docker Engine, which enables developers to package applications and deploy them in Linux containers with ease (and Windows containers, whenever the next version of Windows Server ships). The portability is awesome, but as you may know, containers are also much more lightweight than VMs, enabling vastly improved hardware utilization.
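If you haven't tried it, the packaging workflow boils down to a short recipe file. Here's a minimal, hypothetical sketch; the app file and image name are made up for illustration:

```dockerfile
# Hypothetical example: package a small Python app as a container image.
FROM python:3-slim              # base image pulled from a registry
COPY app.py /app/app.py         # add the application code
CMD ["python", "/app/app.py"]   # command run when the container starts
```

Build it with `docker build -t myapp .` and launch it with `docker run myapp`, and the same image runs unchanged on a laptop or on any of the cloud platforms mentioned above.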
Docker sales reps have not blanketed the earth encouraging enterprise management to give the new technology a spin. This is viral open source adoption at lightning speed: Developers immediately saw the benefit and snapped it up. Now, not only IBM and Microsoft, but also Amazon, Google, Red Hat, Rackspace, and just about anyone else with a platform supports Docker.
A reporter at the Docker event asked the classic question: So what's all this good for, anyway? Solomon Hykes, Docker founder and CTO, answered the only way he could: A new generation of distributed, connected applications we haven't dreamed of -- though he added that large service providers (such as Google) have been delivering Internet services using containers since before Docker was born.
Why did Google choose containers instead of conventional virtualization? Miles Ward, global head of solutions for Google, offered an answer in a January 2015 blog post. Back in 2006, Google actually began developing the core Linux container technology on which Docker is based:
Why virtualize an entire machine when you only need a tiny part of one? Google confronted this problem early. Driven by the need to develop software faster, cheaper, and operate at a scale never seen before, we created a higher level of abstraction enabling finer grained control. We developed an addition to the Linux kernel called cgroups, which we used to build an isolated execution context called a container: a kind of virtualized, simplified OS which we used to power all of Google’s applications. Fast-forward a few years and now the fine folks at Docker have leveraged this technology to create an interoperable format for containerized applications.
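To make Ward's description a little more concrete, here's a rough sketch of what cgroups look like from a shell on a modern Linux box. It assumes root access and a cgroup v2 filesystem mounted at /sys/fs/cgroup, and the group name is made up; cgroups supply the resource-control half of a container, while kernel namespaces (which Docker also uses) supply the isolation:

```shell
# Assumes root and a cgroup v2 mount at /sys/fs/cgroup.
mkdir /sys/fs/cgroup/demo                           # create a new control group
echo "50000 100000" > /sys/fs/cgroup/demo/cpu.max   # cap the group at half a CPU
echo $$ > /sys/fs/cgroup/demo/cgroup.procs          # move this shell into the group
# Every process this shell starts from now on inherits the CPU cap --
# no hypervisor, no guest OS, just kernel bookkeeping around ordinary processes.
```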
A couple of years earlier than that, Sun's Solaris operating system added containers in the form of Zones. In a presentation at a Docker Seattle meetup, Joyent CTO Bryan Cantrill, who worked on the Solaris kernel at Sun, offered a colorful critique of conventional virtualization vs. containerization:
It is horrifying how humanity's precious energy has been diverted to making hardware virtualization actually perform, and the number of kittens that need to be slaughtered every time you perform I/O in the cloud -- it's disgusting, actually. It's horrifying. It's amazing that it works ever, at all, let alone once -- and then you are like, "but it doesn't perform very well." It doesn't perform very well! Are you kidding? It's amazing that it works. But … oh my God, you should not do it this way.
Note Cantrill's emphasis on I/O, which points to a potentially large performance advantage containers have over conventional VMs. Running on SmartOS -- Joyent's cloud adaptation of Solaris -- Cantrill says I/O-intensive Postgres operations run 14 times faster in containers than in VMs. At the Docker event, Angel Diaz, IBM's vice president of cloud architecture and technology, told me he was seeing similar multiples running Docker on bare metal in the lab.
Containers use a fraction of the resources VMs use. As Ward notes, a big part of the appeal of Docker to developers is that they can easily run multiple containers on a laptop as they code and test, which is not exactly practical with VMs. Also, application deployment is vastly easier with containers -- and applications boot instantly, as opposed to the minute or so it takes a VM to boot.
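As a rough illustration of the laptop workflow Ward describes -- the image names are arbitrary examples, and you'd need the Docker daemon running:

```shell
# Start three throwaway containers side by side; each comes up in about a second.
docker run -d --name web nginx
docker run -d --name cache redis
docker run -d --name db postgres
docker ps                        # all three show as running
docker stop web cache db         # tear down when done
docker rm web cache db
```

Try the equivalent with three full VMs on a laptop and the difference in boot time and memory footprint is hard to miss.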
Given all these advantages, you might wonder how long it will be before containers start supplanting VMs in production. The answer is that -- outside of giant service providers such as Google -- it's going to be a while, for reasons I noted in a previous post. Container security needs to be beefed up. Container management and orchestration need to go beyond what Kubernetes, Mesos, and Swarm offer today. Plus, a whole new generation of ops folks needs to emerge that understands how to manage this new infrastructure layer at scale.
Meanwhile, we'll be in an awkward interim phase, where Docker will be run on top of guest OSes inside of VMs, mainly for the portability benefits. Cantrill says "that is madness, because OS virtualization is that next step function. It allows us to eliminate this layer of fat that is buying us nothing."
I have little doubt that Cantrill is right: What he calls "OS virtualization," another name for containerization, really is the next step function. Yet there's a ton of sunk cost in conventional virtualization infrastructure, and Docker is just two years old. This revolution has already proven itself much quicker than others, but nothing in enterprise technology happens overnight.