Microsoft's handling of BPOS outage an ill omen for Office 365

Once again, Microsoft's BPOS service took an extended hit last week, and Microsoft and its fabulous Dashboards didn't bother to tell anybody

Last month I posted a preview of the Office 365 beta, urging you to give it a try. Now I'm having second -- no, make that third -- thoughts.

Office 365 is a slightly improved, and much more rabidly marketed, version of BPOS, Microsoft's current Business Productivity Online Suite. Some of the companies in North America that pay for BPOS got hit with extensive email outages last week. Microsoft didn't bother 'fessing up about the problems until days later. Here's what Microsoft VP Dave Thompson wrote on Thursday evening:

On Tuesday at 9:30am PDT, the BPOS-S Exchange service experienced an issue with one of the hub components due to malformed email traffic on the service... The delays encountered by customers varied, on the order of 6-9 hours... At 9:10am PDT today, service monitoring again detected malformed email traffic on the service. The problem was resolved at 10:03am, but users experienced up to 45 minute email delays during this time. A second, but related issue was detected via monitoring at 11:35am PDT, resulting in email stuck in some end users' outboxes. The issue was remediated at 12:04pm PDT. During this time, more than 1.5 million messages had queued on the service awaiting delivery. The backlog was 90% clear by 4:12 PM, but because of this large backlog of email, customers may have experienced delays of as long as 3 hours... In an unrelated incident, starting at 1:04am PDT, service monitoring detected a failure in the Domain Name Service (DNS) hosting the domain... The team... restored service at 4:52am PDT.

If I count correctly, BPOS mail service went down on Tuesday, with some customers out for six to nine hours. It went down again at 9 a.m. on Thursday and came back up at 10 a.m., with email delays of up to 45 minutes. It went down again at 11:30 on Thursday, was out for half an hour, but in that time, 1.5 million messages got backed up, and they still hadn't all cleared by 4 p.m. Thursday -- not a good week for BPOS email.

Office 365 beta testers, like you and me, didn't get hit. Nope. The ones that went without email are paying BPOS customers.

Don't get me wrong. An occasional email outage is par for the course with Exchange Server. Those of you with in-house Exchange Servers that stay up 24/7 with 100 percent availability can raise your hands right now and I'll defer to you. But most of the world has grown accustomed to random outages. I'd chortle a little bit at the fact that Microsoft itself can't keep its servers up, with three significant outages in three days.

That doesn't get my knickers in a knot. What bothers me is the cavalier way Microsoft's treating its paying customers. Again.

Nine months ago, on Aug. 23 and again on Sept. 3 and Sept. 7, Microsoft BPOS customers went through a similarly hellacious streak of outages, with no notifications from their, ahem, service provider about the source of the problems or when the problem would be fixed. As a result, on Sept. 27, 2010, Microsoft launched a new service called the Microsoft Online Service Health Dashboard. Three different Dashboards, actually -- one each for North America, Europe/Middle East/Africa, and Asia-Pacific -- which provide BPOS customers with up-to-the-minute information about their servers. "It is designed to provide a greater level of information regarding the status of all services and tools, and it includes information about current service status," Microsoft said.

Where were those fabulous Dashboards when the BPOS offal hit the fan this time? Many Microsoft customers are wondering the same thing.

Unfortunately, Microsoft doesn't let normal people look at its Dashboards. Only card-carrying, password-wielding customers are allowed in, unlike Google, which has nothing to hide. But there are a couple of snapshots of BPOS Dashboards on the MOS TechCenter forum, which includes hundreds of messages from frustrated users. For example, poster sanchitosonria says, "MS has generally been horrible in updating the MS Health Dashboard with info and they consistently display inaccurate levels of interruption...This issue is not a performance degradation by any means. Mail does not work."

The first detailed Dashboard notification I can find on the TechCenter forum is timestamped 9:40 a.m. on Thursday. That's two full days after the original outage. Dave Thompson notified the world about the problems, via his blog, at 6 p.m. But there were two full days of widespread intermittent email outages without any explanation from Microsoft. Yes, there were "service degradation" icons on the Dashboard earlier, but no explanations or ETA for a fix.

Call me naive, but what's the point in having a notification board when it takes two days to post notifications?

This article, "Microsoft's handling of BPOS outage an ill omen for Office 365," was originally published at InfoWorld.com. Get the first word on what the important tech news really means with the InfoWorld Tech Watch blog. For the latest business technology news, follow InfoWorld.com on Twitter.

Copyright © 2011 IDG Communications, Inc.
