Bossie Awards 2016: The best open source datacenter and cloud software

InfoWorld's top picks of the year in open source platforms, infrastructure, management, and orchestration tools

The best open source datacenter and cloud software

Containers, microservices, distributed clusters ... as work progresses to topple the traditional application stack, pull apart the pieces, and tie them together again through APIs, everything seems to get smaller and larger at the same time. Welcome to the new, simpler, and more complex world of infrastructure as code. And when we say "code," of course we mean open source code.

Docker

The importance of Docker and the new generation of containerization tools that have followed in its wake is impossible to overstate. Docker is rapidly becoming a mature technology, with support in all of the major operating systems -- it's even built into Microsoft’s next server OS and VMware’s tools. It is also supported by public clouds, and it's becoming a key technology in upcoming hyperconverged infrastructure platforms.

Docker is composed of a group of interrelated open source projects. The core Docker Engine allows you to build and test images before deploying them in your datacenter or on the public cloud. Kitematic provides a UI for running containers. Other tools in the Docker stack help you deploy containers across virtual or bare-metal infrastructures: Machine handles provisioning, while Compose defines applications built from multiple containers. Finally, Swarm handles clustering and scheduling, automating the running of your applications.

Docker is a foundational technology for the modern datacenter, providing the tools, runtime, and APIs to create, host, and manage microservices. Using Docker as the endpoint for a build process, as part of a continuous development and continuous integration strategy, allows you to simply push out a new container when you make changes. Nothing simplifies deployment and testing like Docker.

-- Simon Bisson

Kubernetes

Containers allow rapid deployment of applications and services in isolated userspaces on host servers. If you want to build services quickly and run them at scale and at high density, look no further than containers. However, you’ll also need tools, like Kubernetes, to coordinate all of the moving parts, help manage service interactions, and handle scaling.

Kubernetes is perhaps best understood as a datacenter operating system. Just as Windows or Linux manages the allocation of resources in an individual machine, Kubernetes manages the lifecycle of containerized services and applications across a cluster of machines. It allows you to group services into logical applications and deploy them across a datacenter, ensuring that resources are made available as needed and workload utilization is kept as high as possible.

If you’re running containers at scale, you’re going to need something like Kubernetes. By managing IP addresses and DNS, Kubernetes also gives you a route to handling service discovery, as well as helping to balance the load across services as an application scales. Kubernetes will also help to heal applications, restarting services as necessary and moving workloads away from overloaded servers and hosts. You’ll find Kubernetes supported on commonly used cloud platforms like AWS and Azure, as well as Google Cloud. That makes it a useful tool for moving from one cloud provider to another, giving you a portable framework for your containers and services.

-- Simon Bisson

Mesos

Described as a “distributed systems kernel,” or a datacenter operating system, Mesos is designed to manage the resources needed to deploy cloud applications at scale, with support for tens of thousands of nodes. Mesos can work with any cloud provider and with any OS, so you can use it to manage both Linux and Windows compute instances.

With Mesos, you can set up policies that handle how resources are delivered to applications and services, with applications supported and monitored by Mesos agents that act as local schedulers. Agents report available resources on the host to the Mesos master, which offers those resources to applications. Applications can accept partial offers of resources or wait until all of the resources they need are available. One key feature of Mesos is its direct support for applications that run at scale, like Hadoop and Spark. Mesos makes it possible to build out massive analysis clusters that scale with query complexity, so you can use MapReduce and other algorithms across extremely large data sets. Because Mesos gives applications control over scheduling, it helps to ensure that processing tasks are sent to the hosts where the required data resides.

Thus Mesos’s application-centric approach to using compute, network, and storage resources allows big data clusters to operate more efficiently. Jobs can be started with few resources and grown as more become available, or they can be delayed until the precise resources they need become available, taking advantage of data locality rather than fetching data between hosts and adding to network traffic in a public or a private cloud.
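The offer cycle described above can be pictured with a toy simulation. This is only a sketch under stated assumptions: the Offer and Framework classes, the accept_partial flag, and the offer_round function are hypothetical names invented for illustration, not part of the Mesos API, and only CPUs are tracked.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Resources one agent has reported as free (CPUs only, for brevity)."""
    agent: str
    cpus: float


@dataclass
class Framework:
    """Hypothetical application-side scheduler that decides on each offer."""
    name: str
    cpus_needed: float
    accept_partial: bool
    cpus_held: float = 0.0
    accepted: list = field(default_factory=list)

    def consider(self, offer: Offer) -> float:
        """Return how many CPUs to take from this offer (0 means decline)."""
        remaining = self.cpus_needed - self.cpus_held
        if remaining <= 0:
            return 0.0
        if offer.cpus >= remaining:
            take = remaining
        elif self.accept_partial:
            take = offer.cpus  # start small and grow later
        else:
            return 0.0  # wait for an offer that covers the whole need
        self.cpus_held += take
        self.accepted.append((offer.agent, take))
        return take


def offer_round(agent_cpus: dict, frameworks: list) -> dict:
    """Master-side loop: offer each agent's free CPUs to frameworks in turn."""
    free = dict(agent_cpus)
    for agent in free:
        for fw in frameworks:
            if free[agent] <= 0:
                break
            free[agent] -= fw.consider(Offer(agent, free[agent]))
    return free


spark = Framework("spark", cpus_needed=6, accept_partial=True)
batch = Framework("batch", cpus_needed=8, accept_partial=False)
left_over = offer_round({"agent-1": 4, "agent-2": 8}, [spark, batch])
print(spark.accepted, batch.accepted, left_over)
```

Here the framework that accepts partial offers assembles its six CPUs from two agents, while the all-or-nothing framework declines and waits for a later round. A real scheduler would also weigh memory, disk, and data locality.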

-- Simon Bisson

CoreOS

Lightweight operating systems are an essential component in any cloud, public or private. They let you pack VMs onto hypervisors, as well as containers onto bare-metal hosts. That means a rethinking of what makes a server OS, cutting it to the minimum needed to support containers deployed from a central management platform.

CoreOS is one of the more popular options, a self-updating Linux OS that supports both Docker and Rkt containers. As a central part of the so-called GIFEE (Google Infrastructure for Everyone Else) stack, CoreOS is one of many tools inspired by Google's own infrastructure management. You can use it alongside tools like Kubernetes and Etcd to manage clusters of servers, along with software-defined networking tools like Flannel.

As an OS, CoreOS is significantly stripped down. You won’t find all of the tools and services that are packed into other Linuxes. Instead, there is only a basic userland with tools to allow remote management of container services. Everything else is deployed in containers as necessary -- including language runtimes. As all application features are decoupled from the OS, it’s possible to automate updates, handing loads off to other cluster members as hosts are updated directly from CoreOS’s own update servers.

With CoreOS, you can step back from managing your container hosts, and focus on the applications and services you’re running. Using tools like this means you’re able to concentrate on what’s important for your business, not on your hardware or your server OSes.

-- Simon Bisson

Etcd

Modern cloud applications are moving away from traditional monolithic stacks to much more scalable architectures, building applications from clusters of microservices. It’s an approach that brings a new set of problems: How do we manage all of those services, maintain their configurations, and enable the discovery of new service instances as they’re deployed?

That’s where Etcd comes into play. A distributed key-value store designed to scale with your application, Etcd becomes the foundation of consistency and service reliability in a cluster of machines. You can use Etcd to hold configuration values for your services, either providing information that can be used to quickly configure new service instances or allowing services to register their details to simplify discovery. Applications can store data in keys arranged in directories, then watch either a single key or an entire directory for changes.
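To make the key, directory, and watch model concrete, here is a minimal in-memory sketch. It is not Etcd and does not use Etcd's API: the TinyStore class and its put and watch methods are hypothetical stand-ins, and a real deployment would replicate every write across the cluster.

```python
class TinyStore:
    """In-memory stand-in for a hierarchical key-value store with watches."""

    def __init__(self):
        self.data = {}
        self.watchers = []  # entries of (path, recursive, callback)

    def watch(self, path, callback, recursive=False):
        """Watch one key, or a whole directory when recursive is True."""
        self.watchers.append((path.rstrip("/"), recursive, callback))

    def put(self, key, value):
        """Store a value and notify every watcher whose path covers the key."""
        self.data[key] = value
        for path, recursive, callback in self.watchers:
            exact = key == path
            inside = recursive and key.startswith(path + "/")
            if exact or inside:
                callback(key, value)


store = TinyStore()
events = []

# A load balancer might watch a directory for new service instances...
store.watch("/services/web", lambda k, v: events.append((k, v)), recursive=True)
# ...while an application watches a single configuration key.
store.watch("/config/db_url", lambda k, v: events.append((k, v)))

store.put("/services/web/10.0.0.5", "port=8080")        # seen by the directory watch
store.put("/config/db_url", "postgres://db:5432/app")   # seen by the key watch
store.put("/services/api/10.0.0.9", "port=9090")        # nobody is watching this
print(events)
```

The directory watch is what makes service discovery work: a new instance only has to write its own key, and every interested party hears about it.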

Etcd is fast, able to support thousands of writes per second per instance, using the Raft consensus algorithm to share logs between a leader instance and its followers. It has a simple HTTP API that can be addressed via common tools like Curl, so it’s easy to build into your own apps and services. You’ll also find Etcd used in popular projects like Kubernetes and Cloud Foundry.

-- Simon Bisson

Atomic Host

Traditional server operating systems tend to be a poor fit for containerized applications. Big OSes that are bloated with features and services are anathema to microservices, which call for a simple, secure, and rapidly deployable host. A new generation of OSes aims to fix the situation, delivering lightweight host OSes that are intended to serve as a stable and consistent foundation for containers.

One option is Atomic Host -- based on Red Hat Enterprise Linux, CentOS, or Fedora -- with a defined product lifecycle that gives you a (relatively) immutable host OS that can be deployed across all of your cloud servers. Atomic Host has been stripped to the bare minimum, with only the components needed to run containerized apps. Any features you need beyond these will have to be built into your containers and managed using your build and deployment tooling.

Updates to Atomic Host are downloaded and deployed in one step, and they can be rolled back if they affect your current container builds. The OS has the option of using “super-privileged” containers to give applications access to the host, allowing you to deploy system management tooling alongside your application containers. Kernel namespaces and other security tools also improve container isolation, making sure code in one container won’t affect another.

Alongside the Atomic Host OS, Red Hat provides a secure registry for container images, with role-based controls and a web-based console, and tools to help package and distribute multicontainer apps.

-- Simon Bisson

Consul

In a world where microservices run dynamically across variable infrastructure, finding out where any given service currently resides takes more than hard-coded IPs. Services are scheduled and rescheduled across pools of machines, but your application still needs to know where its dependencies are.

Consul is a distributed system that keeps track of all of your services and exposes them through a simple REST API. Each service registers itself with Consul on startup, and Consul takes care of the rest. You can even configure health checks on services, which Consul will execute on a schedule you specify. That way, you can be sure that Consul will give your application the locations of only healthy services when you ask for them.
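The register, check, and query cycle can be sketched as a toy registry. Everything here is an assumption made for illustration: the Registry class and its methods are hypothetical and do not mirror Consul's REST API, and the health checks are plain Python callables rather than scripts or HTTP probes run on a schedule.

```python
class Registry:
    """Toy service registry that only hands out healthy instances."""

    def __init__(self):
        self.instances = {}  # service name -> list of instance records

    def register(self, name, address, check=lambda: True):
        """Record an instance together with a callable health check."""
        record = {"address": address, "check": check, "healthy": True}
        self.instances.setdefault(name, []).append(record)

    def run_checks(self):
        """Run every health check once; a real system would do this on a timer."""
        for records in self.instances.values():
            for record in records:
                record["healthy"] = bool(record["check"]())

    def healthy(self, name):
        """Return the addresses of instances whose last check passed."""
        return [r["address"] for r in self.instances.get(name, []) if r["healthy"]]


registry = Registry()
registry.register("db", "10.0.0.1:5432", check=lambda: True)
registry.register("db", "10.0.0.2:5432", check=lambda: False)  # failing instance
registry.register("payments", "203.0.113.7:443")  # external, catalogued by hand

registry.run_checks()
print(registry.healthy("db"))
print(registry.healthy("payments"))
```

The caller never sees the failing database instance, and the hand-catalogued external service is looked up in exactly the same way as the self-registered ones.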

If you rely on external services that can’t self-register, you can simply use the REST API to catalog their known locations. This allows for a uniform application experience, regardless of who’s in control of the services in question. In addition to service discovery, Consul offers a flexible key-value store with optional locking and a mechanism for leader elections.

-- Jonathan Freeman

Vault

Vault, like Consul built by HashiCorp, is a tool for secrets management. Whether it’s API tokens, database credentials, or other sensitive info, Vault provides a simple mechanism for encrypting these secrets. While an encryption service for secrets is certainly important, Vault becomes even more interesting when you look at dynamic secrets. Dynamic secrets are secrets that don’t exist before they are used and are automatically expired, so they're substantially more secure than long-lived, easily shared passwords.

Vault can be configured to generate dynamic secrets for your applications on request, whether to provide access to a PostgreSQL instance or an S3 bucket. When the application requests credentials to a Postgres instance, Vault will create a brand-new user in the database and return those credentials to you. Each dynamic secret Vault creates is leased and expires automatically unless the lease is renewed.
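The lease idea is easy to model in a few lines. The following is a sketch only: LeaseManager, issue, renew, and is_valid are hypothetical names rather than Vault's API, no database user is actually created, and the clock is injected so the example can step through time without waiting.

```python
import secrets


class LeaseManager:
    """Toy model of leased credentials that expire unless renewed."""

    def __init__(self, clock):
        self.clock = clock   # callable returning the current time in seconds
        self.leases = {}     # lease id -> expiry time

    def issue(self, role, ttl):
        """Mint fresh credentials and a lease that lasts ttl seconds."""
        lease_id = f"{role}/{secrets.token_hex(4)}"
        credentials = {
            "username": f"v-{role}-{secrets.token_hex(4)}",
            "password": secrets.token_urlsafe(16),
        }
        self.leases[lease_id] = self.clock() + ttl
        return lease_id, credentials

    def is_valid(self, lease_id):
        """A lease is valid only while the clock is before its expiry."""
        expiry = self.leases.get(lease_id)
        return expiry is not None and self.clock() < expiry

    def renew(self, lease_id, ttl):
        """Extend a live lease; an expired one cannot be brought back."""
        if not self.is_valid(lease_id):
            raise KeyError(f"lease {lease_id} is unknown or expired")
        self.leases[lease_id] = self.clock() + ttl


now = [0]  # fake clock, advanced by hand
manager = LeaseManager(clock=lambda: now[0])
lease_id, credentials = manager.issue("postgres-readonly", ttl=60)

now[0] = 30
manager.renew(lease_id, ttl=60)                # expiry moves from 60 to 90
now[0] = 80
still_valid = manager.is_valid(lease_id)       # renewed in time
now[0] = 91
valid_after_expiry = manager.is_valid(lease_id)
print(still_valid, valid_after_expiry)
```

In a real system, expiry would also trigger revocation, for example by dropping the generated database user.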

Dynamic secrets paint a much prettier secure access picture than handing out static passwords to all of your applications. If passwords you must, Vault can be configured to persist encrypted static secrets to Consul or to disk. Or you can skip persistence entirely and expose Vault’s encryption APIs, giving your developers battle-hardened encryption as a service, so they don’t have to code it themselves.

-- Jonathan Freeman

Habitat

Dizzied by the Pandora’s box you’ve opened trying to deploy Docker to production? Tired of having to treat legacy software deployments completely differently from your internal software projects? It might be time to check out Habitat, a new application management tool from Chef. Taking aim at last-mile deployment problems like configuration and application topologies, Habitat helps developers manage app-level complexities long before hitting production.

Habitat packages up your application with all of its dependencies alongside a supervisor process. That supervisor process then takes care of application startup, configuration, and service discovery. Supervisors gossip application state across the cluster, so you can easily roll out configuration changes and even overlay leader-follower topologies on top of existing applications. Habitat will package your legacy software in the same way, giving you a standard interface through which to interact with all of your applications. Habitat can export your application as a tarball, a Docker container, or an Application Container Image for running on rkt, so you’re covered no matter how you choose to run it.

-- Jonathan Freeman

Fluentd

One of the big challenges of distributed systems is managing lots of log files. You could spend time collating log files from each and every microservice instance, every container, and every VM in your cloud infrastructure, or you could implement a unified logging layer to collect, collate, and store that data for later analysis.

Fluentd is a commonly used data logging layer with a large and growing community of developers, as well as support from key cloud service providers, including Amazon and Microsoft. For Microsoft it’s a key component of the cross-platform Operations Management Suite, and it’s supported by the newly open-sourced PowerShell.

Once in place in your environment, Fluentd uses plugins and support for common logging platforms to extract data, filter it, and route the results to an appropriate storage or analysis tool. Data is restructured in JSON format, so it can be processed by endpoint APIs.
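The extract, filter, and route flow can be sketched in plain Python. This is an illustration under assumptions, not Fluentd's plugin API: the log line format, the parse and route helpers, and the use of fnmatch for tag patterns are all invented here, and Fluentd's own tag matching rules differ in detail.

```python
import fnmatch
import json
import re

# Assumed log layout: "<timestamp> <LEVEL> <free-form message>"
LINE = re.compile(r"(?P<time>\S+) (?P<level>[A-Z]+) (?P<message>.*)")


def parse(tag, line):
    """Extract: turn a raw line into a structured record, or None if unparsable."""
    match = LINE.match(line)
    if match is None:
        return None
    return {"tag": tag, **match.groupdict()}


def route(record, routes):
    """Route: append the record as JSON to the first sink whose pattern matches."""
    for pattern, sink in routes:
        if fnmatch.fnmatch(record["tag"], pattern):
            sink.append(json.dumps(record, sort_keys=True))
            return True
    return False


app_sink, archive_sink = [], []
routes = [("app.*", app_sink), ("*", archive_sink)]

incoming = [
    ("app.web", "2016-09-26T10:00:00Z INFO user logged in"),
    ("app.web", "2016-09-26T10:00:01Z DEBUG cache miss"),
    ("sys.kernel", "2016-09-26T10:00:02Z ERROR disk full"),
]

for tag, line in incoming:
    record = parse(tag, line)
    if record is None or record["level"] == "DEBUG":  # filter step
        continue
    route(record, routes)

print(app_sink)
print(archive_sink)
```

Because every record leaves the pipeline as JSON, the storage or analysis tool at the far end never needs to know what the original log line looked like.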

The result is a fast and powerful tool that can help with day-to-day operations, as well as providing formatted data that is used by more specialized systems for monitoring or security analysis. One big user is Line, which drops log data directly into Hadoop clusters to do real-time analysis of the messaging service that’s currently accessed by some 600 million users.

-- Simon Bisson

Prometheus

Managing cloud systems at scale also means monitoring at scale, collecting time series data across multiple systems and services. Enter Prometheus, a systems monitoring and alerting service that is part of the Cloud Native Computing Foundation.

The heart of Prometheus is a time series data model that uses key/value pairs to store data from the applications and services being monitored, with a custom query language for exploring collected data. You can implement multiple Prometheus nodes to monitor different aspects of a cloud service, as each node runs independently, delivering results to a graphical dashboard. There’s also an experimental tool for managing and delivering alerts.
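In practice, each time series is identified by a metric name plus a set of key/value labels. The sketch below shows that idea with a hand-rolled counter. It is an assumption-laden illustration rather than the official client library: the Counter class and its inc and expose methods are written from scratch, the metric and label names are made up, and label values are not escaped as a real exporter would need to do.

```python
class Counter:
    """Toy counter where each distinct label set is its own time series."""

    def __init__(self, name):
        self.name = name
        self.series = {}  # sorted tuple of (label, value) pairs -> count

    def inc(self, amount=1, **labels):
        """Increment the series identified by this exact set of labels."""
        key = tuple(sorted(labels.items()))
        self.series[key] = self.series.get(key, 0) + amount

    def expose(self):
        """Render all series as text lines of the form name{label="value"} count."""
        lines = []
        for key in sorted(self.series):
            value = self.series[key]
            if key:
                pairs = ",".join(f'{k}="{v}"' for k, v in key)
                lines.append(f"{self.name}{{{pairs}}} {value}")
            else:
                lines.append(f"{self.name} {value}")
        return "\n".join(lines)


requests_total = Counter("http_requests_total")
requests_total.inc(method="GET", code="200")
requests_total.inc(method="GET", code="200")
requests_total.inc(method="POST", code="500")

exposition = requests_total.expose()
print(exposition)
```

A monitoring server that scrapes this text periodically can then slice the data by any label, such as all requests with code 500, regardless of which service emitted them.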

Prometheus is perhaps best suited to monitoring microservices environments, where its ability to find monitoring targets through service discovery becomes an essential requirement. Prometheus can scrape metrics from instrumented code or receive them via a push gateway that can collect data from short-lived jobs, such as those running on a serverless compute platform like AWS Lambda or Azure Functions. While Prometheus offers instrumentation libraries that can be added to your code, there are also tools for exporting data from common cloud platforms including AWS and Mesos.

Other Prometheus tools work directly with JVMs, giving you the ability to instrument services like Cassandra without additional coding. Other services natively expose Prometheus metrics. For instance, Kubernetes and Etcd work directly with Prometheus.

-- Simon Bisson

Flynn

For anyone running applications and services at scale, it’s hard to manage deployments and harder still to migrate applications from one provider to another. By blending deployment and operations tooling in a single code-driven framework, Flynn aims to reduce the cost and risk associated with building born-in-the-cloud applications.

Flynn is at bottom an open source PaaS that’s designed to host microservices and their data, whether that data is relational or NoSQL. Thanks to a modular, API-powered approach, it's easy to deploy and manage your applications, then integrate them into a build pipeline. Apps are built into containers, which are then deployed to host machines and VMs. Direct links to GitHub mean you’re able to quickly build new containers directly from code in Git repositories.

App components are backed by Flynn’s service discovery features, which deliver IP addresses to containers and use DNS for simplicity. Other elements of Flynn handle building and deploying high-availability services, as well as load-balancing connections across service clusters. Databases can be provisioned rapidly, and they are automatically backed up. It’s quick and easy to stand up a cluster for development and testing, or for running A/B testing on virtual infrastructures.

The team behind Flynn offers various support plans, though there’s always the option to take advantage of community support via GitHub and IRC. Flynn provides the ease developers crave with the scalability and reliability enterprises require. No wonder the community is growing.

-- Simon Bisson

Nginx

The heart of the modern internet is the web server. The venerable Apache still sits at the top of the charts, but it’s being challenged by a faster and more versatile alternative: Nginx.

Fast, secure, and easy to configure and manage, Nginx is designed to work both as a web server and as a reverse proxy that can provide a load-balancing front end to web services. Out of the box comes support for standards like HTTP/2 and IPv6, as well as built-in web page compression. Additional features can be added through modules, which are available from Nginx’s own developers and from third parties.

Improving website performance was a goal from the start, and Nginx truly is built for speed. The underlying event-driven architecture means the server can handle a significant number of connections (10,000 plus) while using only a few megabytes of memory. That performance is a key enabler of high-density cloud services, where multiple containerized web servers can run on a single virtual machine.
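The event-driven idea, one loop juggling many connections instead of one thread per connection, can be demonstrated with Python's asyncio. This is an analogy rather than a view into Nginx itself, which is written in C: the connections are simulated with short sleeps standing in for network waits, and handle_connection and serve_all are hypothetical names.

```python
import asyncio


async def handle_connection(conn_id: int, served: list) -> None:
    """Simulate one client: wait on (fake) I/O, then record the response."""
    await asyncio.sleep(0.01)  # while this connection waits, others make progress
    served.append(conn_id)


async def serve_all(connection_count: int) -> list:
    """Run every simulated connection concurrently on a single event loop."""
    served = []
    await asyncio.gather(
        *(handle_connection(i, served) for i in range(connection_count))
    )
    return served


served = asyncio.run(serve_all(10_000))
print(len(served), "connections handled on one thread")
```

Each waiting connection costs only a small coroutine object rather than a full thread stack, which is the same reason an event-driven server can hold thousands of connections open in very little memory.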

Other performance enhancements include built-in media streaming for common formats. A commercial version, Nginx Plus, adds support and more security features, including a web application firewall.

-- Simon Bisson

Neo4j

Neo4j 3.0, released earlier this year, brought gains in speed, flexibility, and efficiency. First, enterprise customers can opt for a new storage driver that removes the address space limit of 34 billion nodes. Chances are this doesn’t apply to you, but if you need to plan on massive growth and are questioning Neo4j as a viable option, this should calm some of your concerns.

You’ll also have the option to use the new Bolt binary protocol instead of the HTTP API to squeeze every last drop of efficiency out of your network. From an operations standpoint, you’ll be relieved to hear that you no longer have to juggle a handful of properties files. As of this release, all properties files have been consolidated into a single location to enable easier management of deployments, whether they’re in your datacenter, in the cloud, or running locally for development. There’s even more packed into this release, so check out the release notes and grab a copy to play with.

-- Jonathan Freeman

Ubuntu Server

Ubuntu is best known as a desktop Linux distribution, but of course it’s also available as a server, with a strong emphasis on building and running private clouds. To foster enterprise adoption, Ubuntu Server is available in a Long Term Support release and in versions for all major server architectures including ARM64. It is also certified as a guest OS on all the major public clouds.

If you’re thinking of building an OpenStack private cloud, Ubuntu Server is on the short list of OSes to consider, as it includes support for the LXD container hypervisor and the Juju release of OpenStack, which can automatically deploy in LXD containers. LXD is perhaps best thought of as an open version of Microsoft’s Hyper-V Containers, mixing container support with hypervisor-like isolation. You’re not limited to using LXC containers on LXD, as there’s also support for Docker.

Other key features in Ubuntu Server that buoy cloud deployments include newly added support for the reliable ZFS file system, as well as the Data Plane Development Kit (DPDK), a set of user-space networking libraries that lets applications bypass the kernel and work directly with the network data plane, accessing and manipulating packets themselves. While DPDK isn’t a complete networking stack, it’s able to simplify and accelerate key network functions, which is important when building high-density cloud services.

-- Simon Bisson

PowerShell

Microsoft’s PowerShell is the latest entrant on our winners' list, though it's been around for a long time. That’s because it recently made the transition from proprietary technology to open source, along with a set of tools for building and managing your PowerShell scripts.

Designed as a scripting language for system administration, PowerShell is a framework for automating actions against system-level services, wrapping service APIs and files in “cmdlets” that run from a familiar command line. Need to deploy a new Docker image for a web service? If you’ve built an appropriate PowerShell cmdlet, all you need is a remote endpoint, a few variables, and an image to deploy -- it'll be running across a set of front-end servers in an instant. That set of cmdlets can also be managed from tools like Chef and Jenkins, giving you a host for automated operations.

Don’t mistake PowerShell for an alternative to Bash or other Unix shells. It’s a powerful tool in its own right, but its focus is on operations tasks. We should expect PowerShell to continue to cater to Windows Server and Microsoft’s own platforms. But by extending PowerShell to Linux and Unix, and by adding SSH to PowerShell’s remoting capabilities, Microsoft has laid the groundwork for it to become a tool that can help manage heterogeneous applications at cloud scale.

Microsoft intends to use PowerShell alongside the Fluentd logging engine as the company grows its cloud-based service management platform over the next few years.

-- Simon Bisson

GitLab

You can’t run a modern development operation without a distributed version control system. Open source tools like Git and Mercurial partially fill the need, but by themselves these tools restrict you to command-line interactions. This is where hosted tools like GitHub and Bitbucket come in, but both are closed source and their road maps have been somewhat opaque in the past. GitLab is an open source alternative to GitHub and Bitbucket with compelling features, an aggressive development cycle, and an exciting road map.

GitLab isn’t merely an open source version that stops at standard features like browsing code, reviewing merge requests, and submitting issues. It has support for confidential issues, for when you want to submit a sensitive or security-related issue to an open source project. It allows you to subscribe to an issue label, so you’ll get a notification any time the label is added to an issue. Need a configurable Kanban-style issue tracker? GitLab has you covered. It even has continuous integration out of the box and can deploy directly to Kubernetes.

-- Jonathan Freeman