We've all been there. It's midnight, the maintenance window for work on a few remote systems is quickly closing, and some critical infrastructure component is down hard. It might be a router, a switch, a storage array, or a server -- it doesn't really matter. There's trouble in River City.
To the casual observer, it might seem that the hardest part of this scenario is finding and fixing the actual problem causing the outage. But in many cases, that's the least of it. The hard part is trying to do something absurdly simple -- such as properly generating a break sequence during the boot cycle.
A good example of this is a recent situation I fell into when a Sun Oracle storage array was throwing critical errors on boot following an update. The serial console was rife with error messages and prompts for further action. Each option was in the form of ESC-2, ESC-4, and so on. That was all well and good when I was SSH'd into the serial console server and driving the console directly, but when I connected to the service processor directly through SSH using the Sun Oracle Java-based shared shell applet, there was no way to generate the required Escape key. I wound up switching the console between the shared SSH session and my serial connection just to hit the Escape key. It wasn't just for this function -- try using vi without a functional Escape key.
That's only a minor example. I can recall spending the better part of 30 minutes trying to get into the BIOS of a server that had a criminally short wait for a key combination during the POST. Working through an IP KVM to a remote site with a bunch of lag and lacking the ability to generate the key combination natively, I had to use the KVM's macro functions to try to hit it just right, boot after boot after boot. Once I got in, fixing the actual problem on that box took only about 30 seconds.
I know we've all dealt with mystery mouse cursor motions with remote consoles. I recall a cheer going up from onlooking admins when a spastic cursor finally, miraculously landed on the right button in a dialog box on a server that was 2,000 miles away -- after at least 25 minutes of trying. And no, there was no corresponding hotkey for that function. Situations like that really make you want to hunt down the developers responsible and whack them with a sack of doorknobs.
We've all been in a situation where the solution to a major problem seems impossible. That's frustrating enough. But to have the solution in hand, ready to deploy -- yet inaccessible due to some minor, piddling keyboard translation issue -- that's a special circle of IT admin hell.
Just as enraging are the flashy elements in some BIOSes that add to the misery. When working through remote KVMs, even through links with sufficient bandwidth, spiffy ASCII animations in BIOS screens are the work of Satan. That little scrolling bar at the top of some BIOS screens does nothing -- literally. Yet if you're not sitting in front of the server with a monitor and keyboard, it makes navigating through the BIOS terrifically unpleasant, thanks to the excessive lag introduced by the console server trying to draw the animation.
Whenever I find myself in a situation like this, I ponder the world we live in. On the one hand, we have the handy functionality of full-on GUI remote consoles; on the other, those consoles are completely hamstrung. It's almost as if no one thought about remote management when they coded these tools. In fact, I'm sure no one did.
Unfortunately, there's no way to fix these problems -- other than to hope that those who inadvertently caused them will smarten up and release new code with fixes. You can't even really test these scenarios, since they seem to pop out of nowhere and may well arise from a tiny bug in a firmware update of a device or console server. But the tiny nature of such problems belies their massive ability to lengthen problem resolution and damage the mental health of the admin currently dealing with them.
So the next time you suddenly find out that the only serial terminal app you have handy can't generate a break sequence on a router, or you find it impossible to issue a Stop-A on a Sun Oracle box, or your mouse gets pegged to the upper-left corner of the screen and refuses to move other than to swoop down and right-click on the Start menu every few seconds, remember to breathe slowly, count to 10, clear your mind, and start collecting doorknobs.
This story, "Dealing with the system console from hell," was originally published at InfoWorld.com. Read more of Paul Venezia's The Deep End blog at InfoWorld.com.