The trouble with IT mysteries is that no one outside of IT understands the clues, the stakes, or the feeling of triumph when we finally track down a devilishly intractable problem. I guess we'll have to get by on the knowledge of a job well done.
I'm head of IT for a small business, and we recently installed two new multifunction devices to replace our two workhorse copiers and virtually all our printers. They're almost identical models with the same capabilities, user interface, and configuration. One would support color printing, and the other would handle jobs requiring higher capacities and speeds.
They both went in on the same day: "Color Printer" first, with the crew waiting until we were satisfied before we'd let them start installing "Bulk Printer."
Color Printer had problems but nothing earth-shattering. The DHCP reservation we'd set up for its MAC address didn't seem to take, though Windows was characteristically vague about why, so we had to give it a static IP address.
Also, access to its Web interface was spotty. Sometimes it asked for authorization on the page. Other times it popped up a dialog populated with a meaningless string of letters and numbers -- one that never worked. And sometimes it just timed out. Then it would work again. In addition, some print jobs seemed to vanish into the ether, which we chalked up to a potential error on our part in configuring the queues.
The crew had another job to get to, so we let them install Bulk Printer. After all, Color Printer worked most of the time in our testing and we'd surely be able to iron out the wrinkles.
Bulk Printer worked perfectly, but Color Printer kept having problems, though most print jobs went through. First, its Web interface issues continued. Second, when jobs didn't go through, the evidence conflicted: The printer itself showed nothing in its logs, but the print queue server's logs showed the jobs had been received and handled properly. Attempts to poll Color Printer for its device options failed through 20 minutes of reboots and reinstalls, then mysteriously succeeded, though no settings had changed.
Despite our efforts, nothing was resolved over the next two days, and our panic grew as more jobs vanished without a trace, some of them quite hard to re-create.
We swapped network cables and ports, even doing a direct run to the network hub with a long Ethernet cable. We spent hours poring over the network configuration of both printers, trying to find a difference and tweaking settings to see if they'd matter. We uninstalled and reinstalled print drivers and pushed out settings changes through group policy time and again.
We had people stare at both the queue and the printer while we ran dozens of jobs in quick succession, trying to find a pattern to the failures. We tried print jobs off-hours when the office was empty and the network quiet; if anything, Color Printer lost more jobs then. We tried printing with different versions of the print drivers and bypassing the print server to send jobs directly to the printer.
Nothing changed the outcome or gave any clues we recognized.
On the drive home one night, I had an idle idea. We'd changed out everything in our testing except the IP address. It shouldn't be the problem: We hadn't used that address for at least a decade -- but what if something else was using it? Or had been flooding it with pings, looking for a long-lost server? It would be easy to test.
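For readers who want to try a similar check, here is a minimal sketch of one way to look for a second device answering on an address. It is not the method we used; it assumes you have saved the text of `arp -a` (Windows-style output) from a workstation at several different times, and it reports any hardware address other than the printer's that showed up for the IP in question. The function names, IP addresses, and MAC addresses below are all made up for illustration.

```python
import re

# Matches one row of Windows-style `arp -a` output, for example:
#   192.168.1.50          aa-bb-cc-dd-ee-01     dynamic
ARP_ROW = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"([0-9A-Fa-f]{2}(?:[-:][0-9A-Fa-f]{2}){5})\s+(\w+)\s*$"
)


def normalize_mac(mac):
    """Lower-case a MAC and use ':' separators so formats compare equal."""
    return mac.lower().replace("-", ":")


def macs_claiming(arp_output, ip):
    """Return the set of MAC addresses listed for `ip` in one ARP snapshot."""
    found = set()
    for line in arp_output.splitlines():
        match = ARP_ROW.match(line)
        if match and match.group(1) == ip:
            found.add(normalize_mac(match.group(2)))
    return found


def unexpected_claimants(snapshots, ip, expected_mac):
    """Return every MAC seen for `ip` across snapshots, minus the expected one."""
    seen = set()
    for snapshot in snapshots:
        seen |= macs_claiming(snapshot, ip)
    return seen - {normalize_mac(expected_mac)}


# Hypothetical snapshots taken at two different times (invented addresses).
SNAPSHOT_A = """
Interface: 192.168.1.10 --- 0xb
  Internet Address      Physical Address      Type
  192.168.1.1           00-11-22-33-44-55     dynamic
  192.168.1.50          aa-bb-cc-dd-ee-01     dynamic
"""

SNAPSHOT_B = """
Interface: 192.168.1.10 --- 0xb
  Internet Address      Physical Address      Type
  192.168.1.1           00-11-22-33-44-55     dynamic
  192.168.1.50          aa-bb-cc-dd-ee-02     dynamic
"""

if __name__ == "__main__":
    strangers = unexpected_claimants(
        [SNAPSHOT_A, SNAPSHOT_B], "192.168.1.50", "AA-BB-CC-DD-EE-01"
    )
    print("Unexpected MACs for 192.168.1.50:", sorted(strangers))
```

An empty result proves little, since a conflicting device may simply have been quiet when the snapshots were taken. A non-empty one is a strong hint that something else is answering on that address.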
When we had time the next day, we changed Color Printer's IP address. Since the problems had always been intermittent, there was nothing we could do but wait and hope.