We all make mistakes. But when you work in IT, those errors can quickly go public.
If you've never emailed profanities to 2,500 of your customers or sent a love note to colleagues at a dozen companies, consider yourself lucky.
You might have deleted the entire contents of a server by accident or wiped out three months' worth of government agency data without a reliable backup on hand. You may have pulled a youthful prank that cut off Net access for thousands of your employer's customers. You could have deliberately shared your log-ons with everyone else in the company in order to make a point or unplugged network servers willy-nilly, just to see what would happen.
And when caught in a mistake, you might have invented an explanation so ingenious that it earned you a bonus at the end of the year.
Here are seven true tales of IT screwups. Some names have been changed to protect the guilty. Don't laugh. One day, you may find yourself in eerily similar circumstances.
It's the classic oops moment. Compose an email, hit Send, then realize you've made a horrible mistake by sending it to the wrong person.
It happened to Joel Postman, a senior communications executive for a major networking company. Back in '99, he was in market development for Sun Microsystems when he composed a romantic email to his girlfriend -- then promptly sent it to the vice presidents of strategic alliances at a handful of companies.
"My hand slipped on the mouse button, and I chose the wrong email list," he admits. "One of them replied, 'We love you, too, Joel, but we weren't looking for that level of service.' It's one of those mistakes you usually only do once."
Sometimes, though, a screwup like that can help by demonstrating to the world there's a human being on the other side of the screen. That's what happened to Alex Schiff and Chase Lee, co-founders of Fetchnotes, a cloud-based notepad for recording small scraps of unstructured information.
Last January they were getting ready to announce the company's public launch. First, though, Lee decided to send a test email to his partner to make sure the formatting looked right. So he wrote an email -- "This is my test, bitches" -- and hit Send.
But Lee didn't just send that message to Schiff; he sent it to all 2,500 users of the Fetchnotes private beta. When he realized what had happened, Schiff says, he began screaming at his computer.
"I thought we'd just screwed the company," he says. "We got 100 responses within the first five minutes. But when we looked at them, 99 percent said our email had made their day. My favorite ones were from users who said they had signed up for Fetchnotes and then forgot about it, but my email made them take another look."
Schiff immediately sent an apology to every user. This being 2012, he also blogged about the incident, drawing 25,000 hits and even more responses. Within a week, Fetchnotes had 500 new users, and the amount of activity on the site more than tripled.
Only two people canceled their accounts because of the profanity, he adds.
"I would never tell anyone that swearing at your users is a good idea," he says. "But having a conversational tone with our users is the right thing for a company at our stage. People appreciate the realness, if not necessarily the swearing."
Along with the oops email is another classic tech mistake nearly all of us have made at one time or another: the unintentional mass delete.
About five years ago Paul Unterberg, product manager for a financial services technology company, volunteered to help out a friend who ran a Web forum about electronic music. The forum was running out of disk space because his friend had configured his MySQL backup incorrectly. At the time Unterberg was working as a database administrator for Microsoft SQL Server, but figured he was familiar enough with MySQL to give it a go.
"I fixed the MySQL backup job easily enough, but he still had a ton of complete DB backups in a folder with several subfolders," he says. "My Unix was a bit rusty, and I knew I had to delete a whole folder, so I looked up the syntax on the rm command."
It seemed simple enough, so Unterberg confidently typed a command into his friend's server terminal:
rm -rf /
Then he watched in horror as every file on the server got recursively deleted.
"The good news was that his disk space problem was solved," he jokes. "That was definitely the dumbest thing I've ever done with a tech system."
Fortunately, the forum's Web hosting service had a recent backup on hand. The forum was only down for about three hours for "database maintenance," says Unterberg.
Apparently there were no hard feelings, because Unterberg took over as sys admin for the forum and stayed on until it shut down four years later. Lessons learned?
"Never run a command you don't fully understand at the root," he says. "And if you're doing something you don't really understand, read the documentation over a few times until you do."
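One way to act on that advice is to wrap a recursive delete in guard rails. What follows is only a minimal sketch in Python, not anything Unterberg used: the function name remove_tree and its allowed_root and dry_run parameters are hypothetical, invented here for illustration. It refuses to touch the filesystem root or anything outside a directory you name, and by default it only lists what it would delete.

```python
import shutil
from pathlib import Path


def remove_tree(target, allowed_root, dry_run=True):
    """Delete `target` only if it sits strictly inside `allowed_root`.

    Hypothetical helper for illustration. Returns the list of paths
    that were (or, in a dry run, would be) removed.
    """
    target_path = Path(target).resolve()
    root_path = Path(allowed_root).resolve()

    # Refuse the filesystem root outright.
    if target_path == Path(target_path.anchor):
        raise ValueError(f"refusing to delete filesystem root: {target_path}")
    # Refuse the allowed root itself and anything outside it.
    if target_path == root_path or root_path not in target_path.parents:
        raise ValueError(f"{target_path} is not inside {root_path}")
    if not target_path.is_dir():
        raise ValueError(f"{target_path} is not a directory")

    # Collect everything under the target, then the target itself.
    doomed = sorted(target_path.rglob("*"))
    doomed.append(target_path)

    # Only touch the disk when the caller explicitly opts out of the dry run.
    if not dry_run:
        shutil.rmtree(target_path)
    return doomed
```

Calling remove_tree("/", "/var/backups") raises an error instead of emptying the server, and a first pass with the default dry_run=True shows exactly which files are on the chopping block before anything is gone for good.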
Not all mistakes are unintentional. Sometimes to get things cooking you have to stir the pot.
Just ask Steve Silberberg, owner of Fitpacking/Fatpacking, a company that promotes weight loss through outdoor adventure. Back in the mid-'90s, Silberberg was working as a software developer for an asset management firm with just over 40 employees, all of whom were terrorized by a system administrator who was a security-obsessed control freak.
The problem? Almost nothing the sys admin did actually made anything more secure, says Silberberg. It just made things more difficult.
"Anyone could walk around the office and access your account or go into the unlocked computer room and get root access from any number of terminals," he says. "Still, this guy was relentless at blocking people from doing what they needed to do until he gave them permission."
In protest, Silberberg emailed his user name and password to everyone in the office.
The first thing that happened was Silberberg's account got shut down and he was asked to pick a new password. Then he and everyone else had to attend a lot of meetings dedicated to security issues. But he says he was never disciplined and continued to work there for another 10 years.
"People in the office were pretty split about it," he says. "A lot of them came up to me and said that was just what we needed. I think most of them saw how frustrated I was. It's difficult to get things done when you don't have access to the things you need, especially when you're working at a small company that needs the ability to make changes quickly."
Still, Silberberg says he wouldn't do the same thing again today. "If you give up your password in a public forum these days it's a bit different than doing it when you're all running off one Unix box that's not connected to the rest of the world."
Then there are those youthful mistakes that are neither accidental nor intended to be helpful. The guilty party in this story doesn't want his name used, so let's call him Jason Bourne.
In the mid-'90s, Bourne worked for a national Internet service provider. His job was to maintain the banks of modems that the ISP's customers would dial into to obtain Internet access. The modem banks were installed in racks facing a large window in a room attached to operations and tech support. Each time someone connected, the lights on the modem would blink green.
"Each little green light represented a connected customer," he says. "Executives would walk VIPs past the window so that they could see all the blinking lights and be thoroughly impressed."
To stave off boredom, the operations and support teams would dare each other to do things, says Bourne, who is now IT director at a Web startup. One of their favorite pastimes was called "wiping the wall."
"All of the modem banks were interconnected and managed from a single console, so you could issue commands to each bank all at once," he says. "Wiping the wall was sending a disconnect command to all modem banks simultaneously."
The ops and support teams would then watch the green lights go dark, starting at the top-left corner of the modem bank and finishing at the bottom right -- disconnecting nearly 10,000 customers in a large metro area.
Slowly the wall would light up again as customers redialed, and everyone would have a good laugh, says Bourne -- except, presumably, the customers who got cut off.
"Back in the '90s, Internet connections were notoriously unreliable, and random disconnects were common and expected," he adds. "But I suspect few people realized how frequently those disconnects were done on purpose for the amusement of a bunch of dumb teenagers."
When Chris Barbin was senior VP at Borland in the mid-2000s, he was asked to help integrate the IT department of a software testing company Borland had just acquired.
"I told my boss at the time that our data centers were a big mess," he says. "He said, 'If you're going to complain about IT, you're going to have to fix it.' That's how I became CIO."
Barbin started as CIO by commissioning a survey of Borland's IT assets. He discovered that the company had some 1,100 servers in its eight primary data centers -- or more servers than employees. Worse, some 200 of them came back as "unknown," meaning that nobody had any idea what they were supposed to do.
Barbin told his data center employees to unplug them. He figured if something important suddenly stopped working, the company's help desk would hear about it soon enough.
"If you've got 200 servers and no one has any idea what they do, odds are they're not running a mission-critical system," he says. "When we pulled the plug literally no one noticed."
The defunct machines ended up stacked in "the server cemetery," a stairwell between the 7th and 8th floors of Borland's Cupertino headquarters, until Barbin could figure out what to do with them.
The server purge was only the beginning of a long process of rationalizing Borland's IT assets, which Barbin achieved largely by moving into cloud-based services like Salesforce.com. About a year later, he and three colleagues founded Appirio, a solutions provider that helps medium-sized and large enterprises migrate their IT infrastructure to the cloud and manage it.
Barbin says what he found at Borland was not all that unusual. Even today, large enterprises may own thousands of underutilized servers and not even be aware of it. However, he adds he would not do the same thing today -- because he can't.
"We don't own a single server," he says. "We are 100 percent in the cloud and committed to the server-less enterprise. We started with four employees, now have nearly 500, and hopefully one day we'll have 25,000, but we will never own a server."
Also on the list of classic screwups: backup tapes nobody ever looked at until they needed them -- and then it was too late.
Back in the mid-'90s, Mike Meikle was working as a jack-of-all-trades system administrator for an unnamed state agency. He soon discovered that the agency's servers had never been scanned for viruses and its antivirus software was badly in need of an update.
"It was at the dawn of my career," he says. "I had only been there a few weeks and was still trying to get the lay of the land. I worked with a gentleman who'd been there five years and was in charge of server backups. He assured me all the servers had been backed up and we were good to go."
Meikle began scanning for viruses, removing infections, and restarting the machines. But one server refused to reboot -- the one that tracked where grant money was coming from and where it was going to.
"I figured, no problem, we'll just restore it from tape," he says.
Naturally, nearly all the tapes were corrupted. But nobody knew that, because no one had ever bothered to test the backups or attempt a restore. The dead server was finally revived, but months of data were gone for good.
Meikle, a self-titled corporate consigliere who runs his own tech consulting business, says he still remembers the look on the CFO's face when he informed her that her staff would have to redo three months' worth of work because they had no recent viable backups.
"That's the day I learned the importance of having reliable backups and testing them," he says. "Sadly I had to learn the hard way. Even sadder, many IT shops today still don't have good backups that have been tested or even disaster recovery plans."
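Testing a backup does not have to be elaborate. As a rough sketch only, and not a description of what Meikle's agency ran, the Python below restores a tar archive into a scratch directory and compares SHA-256 checksums against the live files. The helper names make_backup, tree_digests, and verify_backup are hypothetical, and the sketch assumes the archive stores paths relative to the source directory.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def make_backup(source_dir, archive_path):
    """Write every file under `source_dir` into a gzip-compressed tar archive."""
    source_dir = Path(source_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        for p in sorted(source_dir.rglob("*")):
            if p.is_file():
                # Store paths relative to the source so a restore lines up with it.
                tar.add(p, arcname=p.relative_to(source_dir).as_posix())


def tree_digests(root):
    """Map each file's path relative to `root` to its SHA-256 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def verify_backup(archive_path, source_dir):
    """Restore the archive to a scratch directory and compare it with the source.

    Returns a sorted list of relative paths that are missing or differ.
    """
    expected = tree_digests(source_dir)
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive_path) as tar:
            # Use the safer "data" extraction filter where this Python supports it.
            if hasattr(tarfile, "data_filter"):
                tar.extractall(scratch, filter="data")
            else:
                tar.extractall(scratch)
        restored = tree_digests(scratch)
    return sorted(
        name
        for name in expected.keys() | restored.keys()
        if expected.get(name) != restored.get(name)
    )
```

An empty list from verify_backup means the restored copy matches the source file for file. Anything else is a warning worth chasing before the day a server refuses to reboot, though on a busy system some differences will simply be files that changed after the backup ran.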
There are three essential rules of programming. Rule one: Code carefully. Rule two: Test thoroughly. When those two aren't enough, be sure to follow rule three: Cover your assets.