Keep on running

As our businesses become ever more digital, the cost of poor availability can be catastrophic

“Availability is part of digital transformation,” a friend of mine at Veeam told me about a year ago. My reaction: “Seriously? Backup and recovery play a part in your business’s move to becoming data-driven?”

Let me share a story with you.

A couple of weeks ago I was heading off to my local megacineplex to watch a film. I fired open the app on my phone to find no films were scheduled for that day. Odd. I tried the website—only to be greeted with the same message.

My wife and I decided we’d go to the cinema to see if it had been closed by some sort of zombie apocalypse. No, the cinema was open, but as we entered, we noticed that the ticket machines were not running, the signs over the screen doors were blank, and the tills were unavailable. It was cash-only, with staff writing out your ticket by hand on the back of till receipts.

The problem, we were told, was that routine maintenance on the computer systems had failed and the systems could not be brought back online. And although the manual system came to the rescue, the inability to book online no doubt cost that particular cinema a whole lotta money.

The situation at my local cinema that day (in fact, through that whole weekend) reminded me of the message I heard from my friend at Veeam. As the cinema had made its business more digital, this lack of system availability had impeded its ability to trade normally for an entire weekend. Luckily, it had a level of manual failover and, of course, its most critical system, the projection system, was unaffected, which meant it could continue to trade, even if at a reduced capacity.

As we make our businesses ever more reliant on technology and data, this clearly underlines that availability is indeed crucial. I think the technology industry has forgotten this, and continues to forget it.

However, even when we do understand the criticality of backing up our data, our thinking can often be flawed. As our businesses become more technology-dependent, thinking about backup and quick recovery is no longer sufficient. The focus has to be the constant availability of these systems. And it really doesn’t matter where these systems reside. Whether they sit in your datacenter or a public cloud, designing continual availability into all of them is a crucial part of a modern strategy.

How do you go about designing system availability?

You must start by fully understanding two things: the primary activity of your business and how your IT systems support that activity.

Assess

Once those activities are identified, it’s important to understand the systems that support them, and not just the obvious ones. What about the systems, often called Tier 0 services, that are necessary for your infrastructure to run, such as domain controllers, DNS servers, and time servers? I’ve seen many systems fail to recover because ancillary systems were never considered as part of the availability plan.
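One way to make those hidden dependencies visible is to write them down as a simple map and let a script work out what must come back first. The sketch below is only an illustration: the system names and their dependencies are hypothetical, and it assumes Python 3.9 or later for the standard library's graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical systems. Each one maps to the services it needs before it can start.
DEPENDS_ON = {
    "time-server": set(),
    "dns": {"time-server"},
    "domain-controller": {"dns", "time-server"},
    "database": {"domain-controller", "dns"},
    "ticketing-app": {"database", "dns"},
}


def recovery_order(depends_on):
    """Return the systems in an order where every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


def missing_from_plan(depends_on, plan):
    """Return direct and indirect dependencies of the planned systems that the plan omits."""
    needed = set()
    stack = list(plan)
    while stack:
        system = stack.pop()
        for dep in depends_on.get(system, set()):
            if dep not in needed:
                needed.add(dep)
                stack.append(dep)
    return needed - set(plan)


order = recovery_order(DEPENDS_ON)
gaps = sorted(missing_from_plan(DEPENDS_ON, {"ticketing-app", "database"}))
print("Recovery order:", order)
print("Not covered by the plan:", gaps)
```

Here an availability plan that lists only the ticketing application and its database would be flagged as missing the Tier 0 services underneath them.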

How available do you need to be?

Recovery point objectives (RPO) and recovery time objectives (RTO) remain staple parts of building an availability strategy, and they are as important in a modern strategy as ever. You need to understand, in the event of a failure, how long your system can be down and how much data you are prepared to lose.
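To make those two objectives concrete, here is a minimal sketch that compares a protection setup against them. The figures are invented for illustration, and it assumes simple periodic backups, where a failure just before the next backup loses a full interval of data.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval):
    """With periodic backups, the worst case is losing one full interval of data."""
    return backup_interval


def meets_objectives(backup_interval, estimated_restore_time, rpo, rto):
    """Compare worst-case data loss with the RPO and restore time with the RTO."""
    return {
        "rpo_met": worst_case_data_loss(backup_interval) <= rpo,
        "rto_met": estimated_restore_time <= rto,
    }


# Hypothetical figures: nightly backups, a three-hour restore,
# and a business that accepts one hour of data loss and four hours of downtime.
result = meets_objectives(
    backup_interval=timedelta(hours=24),
    estimated_restore_time=timedelta(hours=3),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=4),
)
print(result)
```

In this example the restore is fast enough, but a nightly backup cannot satisfy a one-hour recovery point objective, which is exactly the kind of gap to raise with the business.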

Business buy-in is critical when it comes to defining this. Your business needs to understand the importance of availability and the impact of losing systems and services: the damage to the bottom line, customer relations, trust, reputation, and so on. Without that buy-in, the next stages of availability design can become very difficult.

Understand what can go wrong

Understand the risk: what can go wrong, and how will that affect your systems and business? Then understand the appetite for tackling this risk. If the impact of the outage is low, don’t overarchitect your solution. But if it has the potential to be catastrophic, design and budget accordingly.
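A rough, back-of-the-envelope way to frame that appetite is to compare what outages are expected to cost in a year with what it costs to prevent them. The sketch below is a deliberate simplification with made-up numbers; it ignores harder-to-price damage such as lost trust and reputation.

```python
def expected_annual_loss(outages_per_year, hours_per_outage, cost_per_hour):
    """Estimate yearly outage cost as frequency x duration x hourly impact."""
    return outages_per_year * hours_per_outage * cost_per_hour


def worth_mitigating(annual_loss, annual_mitigation_cost):
    """A mitigation pays for itself when it costs less than the loss it avoids."""
    return annual_mitigation_cost < annual_loss


# Hypothetical figures for two systems.
booking_loss = expected_annual_loss(outages_per_year=2, hours_per_outage=8, cost_per_hour=5000)
intranet_loss = expected_annual_loss(outages_per_year=2, hours_per_outage=8, cost_per_hour=50)

print(booking_loss, worth_mitigating(booking_loss, annual_mitigation_cost=30000))
print(intranet_loss, worth_mitigating(intranet_loss, annual_mitigation_cost=30000))
```

With these assumed figures, the same investment is easy to justify for the booking system and hard to justify for the low-impact one.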

Design for availability

Now that you understand the key systems, risks, availability requirements, and business appetite to address the problem, you can start to design your highly available systems.

Understand at what layer you need to build availability. Is it hardware, infrastructure, software, or application? And what are you protecting against, from hardware component to datacenter? With that information, you can design a solution that delivers the availability to meet your business needs.

Test the heck out of it

If you’re going to go to the trouble of building available systems, test, test, and test again. The importance of testing your system availability cannot be overstated. Testing doesn’t just give you (and the business) confidence in those systems. Your team will also gain confidence in knowing it can deliver availability when issues occur.

As our businesses become ever more digital, my friend at Veeam is proved more and more correct. Availability is not just a part of digital transformation; it is a crucial part, because the cost of poor availability can be catastrophic.

With that in mind, we all need to ask ourselves: are we doing what we need to do to keep on running?

This article is published as part of the IDG Contributor Network.