The (infrastructure) fix is in

How do you solve a problem like HealthCare.gov? The HHS department's progress report offers maddening clues

Two months ago, on Oct. 1, the U.S. government rolled out HealthCare.gov, where citizens could explore new health care options and apply for new health insurance. While states could set up their own exchanges, this site was the central point in the plan -- and it failed miserably. Delays, disruptions, errors, and complete unreachability dogged the site in the first days -- then stretched into weeks. Hearings were called in Washington. There were demands that Health and Human Services Secretary Kathleen Sebelius resign due to what seemed to be an abject failure of a much-ballyhooed and definitely expensive central cog in the new Affordable Care Act health care law, aka Obamacare.

Then something extremely interesting happened: The site began recovering. Two months after it launched, the site is now performing pretty well. This is a very curious development indeed.


Development of the site cost anywhere between $70 million and $150 million, depending on how you tally the numbers and whom you ask. These are massive figures, even for a site with the scope of this one. While I don't know the internal details of what was necessary to produce the site, I can say that if the amount spent was anywhere between those two figures, the site should have been bulletproof right out of the gate. The fact that it wasn't should cost the contractor in charge any chance at government jobs forevermore. I doubt that will happen.

Revving up the horsepower

But the question remains: How did the site's performance improve so quickly? Sure, there were some major flaws in the way the application was designed, but there's no way you could fix all that code within weeks. You can make improvements and tweaks, but nobody's rewriting the whole thing in that amount of time, no matter how many developers you bring in. As the saying goes, nine pregnant women don't produce a baby in one month.

I'm guessing the code was workable for the most part, but the infrastructure was woefully underspec'd and underbuilt. According to an HHS blog post and a progress report, a number of software and hardware fixes were put into place to bring about the recovery. Most striking in these documents are the declarations that they "installed dedicated hardware" for the registration database that led to a threefold improvement in speed. The agency also deployed 12 database servers and new storage for the core database, along with a new firewall for a 500 percent performance increase.

You mean it was possible that a new firewall increased performance by a factor of five? Was the previous firewall a decade old? And there weren't already 12 database servers?
