Mesosphere, a startup that provides commercial support for the Apache Mesos cluster management system, has debuted a "data center operating system."
Mesosphere DCOS uses the Mesos project to gang together machines running Linux, whether hosted in any number of clouds (Amazon, Google, Microsoft) or running on nearly any kind of infrastructure (bare metal, OpenStack, VMware).
Widely deployed at scale by companies like Twitter and Airbnb, Mesos has a proven track record. Mesosphere DCOS builds on that foundation and is designed to manage not only the applications but also the systems they run on.
Mesosphere DCOS's core innovation is the ability to deploy and manage application workloads across multiple machines with no more than a few terse command-line statements. Hadoop or a Ruby on Rails app can be deployed automatically across nodes, then scaled up or down to meet demand and keep nodes from going underutilized.
Unlike the CoreOS model, DCOS doesn't consist of a Linux distribution built along custom lines to run containers. Rather, DCOS manages existing Linux installations, which might be more immediately appealing to architects of existing data centers. Another bonus: DCOS uses its own application initialization system (Marathon) and works with Kubernetes, for those who already have or are planning an investment in Google's orchestration framework.
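To give a rough sense of what deploying and scaling an app through Marathon involves, here is a minimal sketch in Python. It assumes Marathon's REST API (a `POST` to `/v2/apps` to create an app, a `PUT` to `/v2/apps/{id}` to change its instance count). The app name, command, and resource figures are hypothetical, and the sketch only builds the request payloads rather than contacting a cluster.

```python
import json


def app_definition(app_id, cmd, cpus, mem_mb, instances):
    """Build a Marathon-style app definition.

    Field names are assumed from Marathon's /v2/apps API; the values
    passed in below are illustrative only.
    """
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": cpus,
        "mem": mem_mb,
        "instances": instances,
    }


def scale_request(app_id, instances):
    """Return (method, path, body) for changing an app's instance count."""
    path = "/v2/apps/" + app_id.strip("/")
    return ("PUT", path, {"instances": instances})


# Hypothetical Rails app: three instances, half a CPU and 512 MB each.
rails = app_definition(
    "/rails-demo", "bundle exec rails server -p $PORT", 0.5, 512, 3
)

# Scaling up is a second, much smaller request against the same app id.
method, path, body = scale_request(rails["id"], 10)

print(json.dumps(rails, indent=2))
print(method, path, json.dumps(body))
```

The point of the sketch is that the operator describes what the app needs (CPU, memory, instance count) and leaves placement to the scheduler; no machine is ever named.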
Most applications launched by DCOS are deployed in a prepackaged format, with many common apps that run at scale available through a central repository. Mesosphere also provides an SDK for building apps that hook directly into DCOS, though some existing apps, such as HDFS, are already Mesos-aware and require no additional work to scale under DCOS.
In an O'Reilly Radar article entitled "Why the data center needs an operating system," Benjamin Hindman, co-creator of Apache Mesos and now chief architect at Mesosphere, laid out his rationale for what he described as "the POSIX for distributed computing," or an API for distributed systems running in the cloud or in a data center. "Machines," Hindman wrote, "are the wrong level of abstraction for building and running distributed applications."
In his view, resources go to waste because apps tend to be deployed one to a machine, resulting in underutilized hardware (8 to 15 percent efficiency, according to Hindman). Worse, bottlenecks spring up around the individual machines and the "armies of people" needed to maintain them.
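The arithmetic behind that complaint can be illustrated with a small Python sketch. The workload figures are hypothetical, chosen to fall in the 8 to 15 percent range Hindman cites, and the packing step uses a simple first-fit-decreasing heuristic as a stand-in; it is not how Mesos itself allocates resources.

```python
def one_per_machine(demands, capacity=100):
    """Give every app its own machine; return (machines, utilization)."""
    machines = len(demands)
    return machines, sum(demands) / (machines * capacity)


def first_fit_decreasing(demands, capacity=100):
    """Pack apps onto shared machines; return (machines, utilization).

    Assumes no single demand exceeds the capacity of one machine.
    """
    free = []  # remaining capacity on each machine in use
    for demand in sorted(demands, reverse=True):
        for i, room in enumerate(free):
            if demand <= room:
                free[i] -= demand
                break
        else:
            # No existing machine has room, so bring up another one.
            free.append(capacity - demand)
    machines = len(free)
    return machines, sum(demands) / (machines * capacity)


# Hypothetical: ten apps, each needing 8 to 15 percent of one machine.
demands = [8, 9, 10, 11, 12, 12, 13, 14, 15, 15]

dedicated = one_per_machine(demands)
shared = first_fit_decreasing(demands)

print("dedicated: %d machines at %.1f%% utilization" % (dedicated[0], dedicated[1] * 100))
print("shared:    %d machines at %.1f%% utilization" % (shared[0], shared[1] * 100))
```

With these made-up numbers, the same ten apps fit on two shared machines instead of ten dedicated ones, which is the kind of gap a cluster-level scheduler is meant to close.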