Farewell, Frodo: Why we need trustworthy metadata to define our infrastructure

Applying metadata allows you to set policy for areas like security and performance based on precise workload characteristics, even at the scale of the cloud

I have fond memories of when IT admins would give names to their servers, and even assign them unique personalities. It was a simpler time, before virtualization and the cloud made infrastructure exponentially more complex. My favorites were the Lord of the Rings fans who named their servers after hobbits and elves; anything related to Bilbo and Frodo might be a web server, while Arwen and Legolas were databases. It was a fun way for IT admins to express themselves, but you had to know your Tolkien to understand the role a server might play.

This model worked for small environments but doesn’t scale in the era of virtualization and the cloud. There are simply too many things to name, and you’re forced to pull out esoteric Tolkien references that only a supergeek would know.

With the cloud, and specifically IaaS, many of us have abandoned the notion of naming or even uniquely identifying servers. It’s impossible at this scale, where workloads are spun up and torn down by the hundreds. With a service like AWS, we use machine-generated IDs that have little significance for people.

Herding cattle

But in the process of giving up on names, we’ve lost some of our ability to manage and control our infrastructure. If servers were once treated like pets, managing this new environment is akin to herding cattle. We still need a way to classify servers, both physical and virtual, as well as the workloads and services that run on top of them. Only then can we manage our infrastructure efficiently and effectively.

To achieve this, enterprises should adopt a labelling system based on metadata. Labels allow you to abstract the underlying infrastructure and manage your workloads as classes, with classes defined by the characteristics you choose to apply. For example, a class might be defined by OS version, workload type, business unit or region in which it is hosted. These classes make it much easier to apply policies to workloads at scale, from provisioning to new ways of applying security.
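To make the idea of label-defined classes concrete, here is a minimal sketch in Python. The `Workload` type, the label keys (`os`, `role`, `business_unit`, `region`) and the machine IDs are all hypothetical, chosen only to mirror the characteristics mentioned above; a real environment would pull them from its own tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    """A workload identified by a machine-generated ID plus human-readable labels."""
    machine_id: str                      # hypothetical ID, not a real instance
    labels: dict = field(default_factory=dict)


def select(workloads, **criteria):
    """Return the workloads whose labels match every key/value in criteria."""
    return [
        w for w in workloads
        if all(w.labels.get(key) == value for key, value in criteria.items())
    ]


# Illustrative inventory; every value here is made up for the example.
inventory = [
    Workload("i-0a1", {"os": "rhel6", "role": "web", "business_unit": "retail", "region": "us-east"}),
    Workload("i-0b2", {"os": "rhel8", "role": "database", "business_unit": "retail", "region": "us-east"}),
    Workload("i-0c3", {"os": "rhel8", "role": "web", "business_unit": "finance", "region": "eu-west"}),
]

# A "class" is simply the set of workloads that share the chosen label values.
retail_web = select(inventory, role="web", business_unit="retail")
rhel8_hosts = select(inventory, os="rhel8")
```

The point of the sketch is that nothing in `select` refers to a hostname or an IP address: the class is defined entirely by characteristics you chose to record.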

Metadata is best suited for this because it frees you from the limitations of infrastructure constructs like IP addresses, which can change over time and become obsolete. And you can apply metadata labels that are understandable and meaningful to people.

Getting to metadata

This metadata can come from a source of truth like a ServiceNow or BMC CMDB, an orchestration tool like Puppet or Chef, or a spreadsheet. It can be derived from multiple sources and combined to define a policy in multiple dimensions.
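One way to picture combining sources is a simple merge with an explicit order of precedence. The sketch below is an assumption about how that might look, not a description of any particular CMDB or orchestration API: the three dictionaries stand in for data already exported from each source, and the rule that the CMDB wins on conflict is an illustrative choice.

```python
def merge_labels(*sources):
    """Merge label dictionaries; sources listed later override earlier ones."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged


# Hypothetical exports for a single workload from three different sources.
from_spreadsheet = {"region": "us-east", "env": "prod"}
from_orchestration = {"role": "database", "os": "rhel6"}
from_cmdb = {"app": "ecommerce", "env": "production", "business_unit": "retail"}

# Least trusted source first, most trusted last, so the CMDB value for
# "env" replaces the spreadsheet's informal "prod".
labels = merge_labels(from_spreadsheet, from_orchestration, from_cmdb)
```

Making the precedence explicit matters because the sources will disagree sooner or later, and the merged result is what policy will be written against.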

For example, you could apply one policy for workloads that are part of a production application running on an older OS, and a different policy for workloads in the same application but on a newer OS. Or, a database workload that’s part of an e-commerce application might be considered high-value due to the sensitivity of the data, so it would be assigned a highly restrictive policy.
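The examples above can be sketched as an ordered list of label selectors, where the first matching rule wins. The policy names, label keys and values below are hypothetical, and first-match ordering is just one reasonable design; a real policy engine may resolve overlaps differently.

```python
# Each rule pairs a label selector with a policy name. Rules are checked in
# order, so the most specific or most restrictive rules go first.
POLICY_RULES = [
    ({"app": "ecommerce", "role": "database"}, "highly-restrictive"),
    ({"env": "production", "os_generation": "older"}, "legacy-os-production"),
    ({"env": "production", "os_generation": "newer"}, "standard-production"),
]
DEFAULT_POLICY = "default"


def policy_for(labels):
    """Return the first policy whose selector is fully matched by the labels."""
    for selector, policy in POLICY_RULES:
        if all(labels.get(key) == value for key, value in selector.items()):
            return policy
    return DEFAULT_POLICY


# Two workloads in the same production application, differing only by OS.
old_web = {"app": "intranet", "env": "production", "os_generation": "older", "role": "web"}
new_web = {"app": "intranet", "env": "production", "os_generation": "newer", "role": "web"}
# A database in the e-commerce application holding sensitive data.
shop_db = {"app": "ecommerce", "env": "production", "os_generation": "newer", "role": "database"}

assigned = {
    "old_web": policy_for(old_web),
    "new_web": policy_for(new_web),
    "shop_db": policy_for(shop_db),
}
```

Because the e-commerce database rule is listed first, that workload receives the restrictive policy even though it would also match the general production rule.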

The more sources of data you use—both static and dynamic, from the infrastructure and the application itself—the richer the metadata namespace you can create, and the more expressive you can be in developing actions and policies.

The importance of trust

These metadata labels aren’t merely for identification; they define operational outcomes in critical areas like security, performance and stability. If you’re going to use this data, therefore, it needs to be trustworthy, accurate and current, and only trusted admins should have access to edit and maintain it. As infrastructure evolves over time, the metadata must be updated to reflect the current state; otherwise, policy will be applied on the basis of stale labels, with potentially serious consequences.

You must also be disciplined in how you create these metadata labels, or they will become overwhelming to those who use them. Success depends on a structured, ordered approach, and conveying that structure clearly to all stakeholders. In this way, you can create and maintain a labelling system that scales sufficiently for the cloud era, and that provides you with maximum security and control for your infrastructure. And that’s something a Lord of the Rings fan would be proud of.

This article is published as part of the IDG Contributor Network.