Perhaps United Airlines Needs to Look at Their IT Architecture

Sorry about the lack of blogging this week. I've been on a whirlwind tour of Rochester, NY and Ottawa, ON. I always thought myself to have good travel karma, but that all ended on the way back from Ottawa, when I experienced firsthand the problems with air travel today. Specifically, a computer outage at United Airlines - Ottawa that left travelers, let's just say, a bit on edge. However, it led me to do some research and ask some questions…perhaps making United a good example of how IT architecture can ruin your business.

First of all, I'm not going to get into another big circle and kick the airlines. Many are already doing that these days, as delayed and cancelled flights spin out of control. Having to fly for business, as I do, I can tell you that it's about as much fun as a root canal, between the long lines in security and rude airline employees, as well as cancelled and delayed flights. But, I'm digressing.

So, upon entering the terminal in Ottawa, I noticed that the line for the small United Express operation there stretched all the way to the entrance doors. Looking closer, I saw that the agents were checking people in using paper and pen. Long story short, they had no idea who had a ticket and who did not. They overbooked the flight, boarded it, and after figuring out they had too many passengers for the plane, kicked three or four off, including yours truly. Not sure why. However, being a million-mile flyer on United, perhaps they figured I had flown enough…but…they can't figure that out, can they?…the computers were down.

In doing some research when I got home, I found that this was not the first instance of outages at United. Indeed, in June 2007, United had a two-hour outage that halted flights and frustrated passengers. I'm not sure how extensive the outage was yesterday, or what happened, or even how their systems are configured. I just know the damn thing did not work, and as a customer I had to pay the price. However, so did the airline. If they think this does not send loyal customers packing, they are greatly mistaken.

So, what does this have to do with SOA? Everything. What's key here is that no matter what caused the problem, the fault lies with the IT architecture. The fact of the matter is that systems should be, and need to be, set up to work around outages. In essence, that means creating layers of services, and mechanisms to leverage those services in different ways as the need arises. In United's case, how about a Web-delivered system for use as a backup, accessing the same services as the core systems, and making those services both virtual and redundant? Thus, there is no single dependency built into the architecture, and therefore outages are dealt with as mere interface changes. In many instances, outages could be completely transparent to the user. I used this approach when building banking systems back in the 1990s. This approach is even easier today, with the service virtualization and management technology we have available.
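To make the idea concrete, here is a minimal sketch in Python of one "virtual" service backed by redundant replicas, with two user interfaces bound to it. Everything in it is hypothetical and for illustration only: the names `VirtualService`, `primary_site`, `backup_site`, `desktop_ui`, and `web_ui` are my own assumptions, and none of this reflects how United's systems are actually built.

```python
from typing import Callable, List, Sequence


class ServiceUnavailable(Exception):
    """Raised when a replica (or every replica) cannot answer."""


class VirtualService:
    """One logical service backed by several redundant replicas (hypothetical)."""

    def __init__(self, replicas: Sequence[Callable[[str], str]]):
        if not replicas:
            raise ValueError("at least one replica is required")
        self._replicas = list(replicas)

    def call(self, request: str) -> str:
        # Try each replica in order; the caller never learns which one answered.
        errors: List[Exception] = []
        for replica in self._replicas:
            try:
                return replica(request)
            except ServiceUnavailable as exc:
                errors.append(exc)
        raise ServiceUnavailable(f"all {len(errors)} replicas failed")


def primary_site(request: str) -> str:
    # Simulates the outage: the primary location cannot be reached.
    raise ServiceUnavailable("primary site is down")


def backup_site(request: str) -> str:
    # A second location, assumed to sit on a different network.
    return f"checked in: {request}"


check_in = VirtualService([primary_site, backup_site])


def desktop_ui(passenger: str) -> str:
    # Primary interface, bound to the virtual service.
    return check_in.call(passenger)


def web_ui(passenger: str) -> str:
    # Secondary (Web-delivered) interface, bound to the same service.
    return check_in.call(passenger)
```

With the primary site down, both `desktop_ui("DOE/J")` and `web_ui("DOE/J")` still return a result from the backup site, so the agent at the counter never reaches for paper and pen. A real deployment would add health checks, timeouts, and data replication, which this sketch leaves out.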

Again, I'm not a United insider, so I have no idea what the heck happened. I'm just using United as an example of how IT architecture can get you into trouble. I would provide a general advisory that they take the following steps:

  1. Break the architecture down to its functional level, and make sure you have a semantic-, service-, and process-level understanding before proceeding.
  2. Divide the data and data services up into virtual redundant domains, making sure they are housed at different locations and leverage different networks. Make sure to leverage a virtual data and service management tool to handle automatic cutovers and services utilization.
  3. Same deal with the transactional services; divide them up into virtual redundant domains that are location independent. Again, they must be managed.
  4. Bind the data and transactional services to a primary user interface, and then to a secondary user interface. Perhaps Win32 and Web delivered, depending on the types of clients utilized.
  5. Test the damn thing. Make sure there are no single points of failure.
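
As a rough illustration of steps 2, 3, and 5, the check below walks a description of where each service's replicas live and flags any service whose replicas all share one location or one network. It is only a sketch under my own assumptions: the `Replica` record, the `single_points_of_failure` function, and the site and network names are all made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Replica:
    """Where one copy of a service runs (hypothetical model)."""
    site: str
    network: str


def single_points_of_failure(deployment: Dict[str, List[Replica]]) -> List[str]:
    """Return one message per service that depends on a single site or network."""
    problems: List[str] = []
    for service, replicas in deployment.items():
        sites = {r.site for r in replicas}
        networks = {r.network for r in replicas}
        if len(sites) < 2:
            problems.append(f"{service}: fewer than two sites")
        if len(networks) < 2:
            problems.append(f"{service}: fewer than two networks")
    return problems


# Made-up deployment: names are placeholders, not real infrastructure.
deployment = {
    "reservations": [Replica("site-a", "carrier-1"), Replica("site-b", "carrier-2")],
    "check_in": [Replica("site-a", "carrier-1"), Replica("site-b", "carrier-1")],
}

problems = single_points_of_failure(deployment)
```

Here `problems` contains a single entry for `check_in`, because both of its replicas ride on the same network even though they sit at different sites. A check like this only covers the static layout; you would still want to pull the plug on a site in a test environment and watch the cutover happen.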

Of course, the argument could be around the affordability of this SOA solution, but I would say it may pay for itself in a short period of time just by avoiding outages. You guys have enough to worry about with the weather, safety, and the normal hassles when operating an airline. Take this one off the list.

Copyright © 2007 IDG Communications, Inc.