October 23, 2009

Have hard drives become less reliable?

One reader's spate of hard drive failures leads him to wonder if it's just a run of bad luck or a sign of a more troubling trend in the storage industry

"In the past six months," writes Paul, "I have had to return eight Seagate drives for refurbishment. One was a 2.5-inch, 80GB drive, five were 500GB, and two were 750GB. One of these last is going back to be refurbished for the second time in less than three months." All of Paul's drives were still under warranty, which is a good thing, but replacing hard drives after they have been installed and used to store data is no laughing matter. And isn't this an awfully high rate of failure? Is this just one man with very bad luck or some sort of trend?

"Until this year," says Paul, "I have only had to return maybe three hard drives over the course of five years." These were all Seagate drives. "I have used Seagate exclusively for the past 10 to 15 years," he says, "with good results after a rash of Western Digital drive deaths. Maybe the Maxtor purchase has resulted in a high failure rate?"

I contacted Seagate to see if there is an explanation for Paul's bad luck -- perhaps it was a known problem like the one with the high-capacity Barracuda line earlier this year. Though I exchanged several e-mails with a representative there, I could get no official response in time for this post.

Paul's bad luck was not limited to Seagate drives, either. "I was so disappointed with Seagate," he says, "that I decided to try Western Digital again. In the past two months I purchased eight WD 750 Raid drives. Three of them were dead on arrival. My supplier is, of course, very sorry. But even though they have decent buying power, they do not seem to be inclined to hold either company accountable for supplying faulty hard drives. My customers, of course, expect me to repair, replace, and rebuild their systems when hard drives in them fail. But the manufacturer can send relabeled broken drives out as warranty service."

Without some sort of in-depth study (such as the one Google did in its own server farms a couple of years ago, which did find a high rate of mortality in hard drives), it's difficult to determine if this is a trend or simply one man who somehow inadvertently offended the god that watches over data storage.

"My supplier tells me they don't see a pattern that would indicate a problem," says Paul. "I must just be unlucky."

Is this just Paul hitting a patch of bad luck? Are others in the same boat? Have any of you seen a pattern of increased drive failure? Please let us know in the comments. (Comments on hard drives that exceed expectations are also welcome.) Not only will Paul's misery love to know it has company, but we might be able to identify a trend.

Got gripes? Send them to christina_tynan-wood@infoworld.com.

This story, "Have hard drives become less reliable?," was originally published at InfoWorld.com. Follow the latest developments in storage at InfoWorld.com.

Regaug 23-Oct-09 1:32pm
1 reply
Yes, in my experience, drives have definitely become less reliable over the last 5-10 years. But the price of storing a GB of data has dropped so much during that same period, and other facets of storage technology have improved so much, that drive failure should no longer be a cause for lost sleep. The biggest problem is the never-ending push by manufacturers (and we customers are guilty too) to have more and more storage space squeezed into the same (or smaller) form factors. At some point, the physics just aren't going to work anymore. Adjacent Track Interference (ATI) is a big killer of the new disk drives. Also, storage devices are becoming a commodity. That was one of the promises of RAID, after all, and inexpensive RAID controllers are also now widely available. With that approach, and with backups kept current, there's no reason that replacing a failed hard drive should be anything but a simple thing.
willicueva 28-Oct-09 5:45am
I think you miss the point here: the price of the media doesn't matter if I lose the data. How many non-business users do you know who make it a habit of having raid systems or daily backups of their systems? If a hard drive goes bad, the customer loses Data, Time and most of all confidence in the vendor, not the manufacturer.
lcarliner 23-Oct-09 1:59pm
Could all of the outsourcing of manufacturing to China be a contributing factor? The best measure for now appears to be a RAID 1 setup, with the second of the mirror pair placed into service some two months or so after the first to minimize the chance of both units failing close together. The problem with RAID 5 is that the risk of multiple failures, or of a second drive failure before the replacement for the first has been rebuilt, is far too high.
CMadmin 23-Oct-09 2:35pm
This sounds like a case of a consistent environmental problem - either the drives are not sufficiently shock protected at the supplier, or in shipment to the customer, or else are getting abused somehow after they are received and installed. About 5 years ago I ordered 4 full height drives (closeouts) and 3 of them were DOA, but they were loosely packed. The supplier replaced the three drives, and the foam packaging was much more carefully configured in the second shipment. Drives in systems at my home, almost always factory installed in the equipment, have never failed. I've had server drives fail in RAID arrays, but either after gross shocks, or long term use (6-8 years) where I'm sure I'm validating the MTTF rating of the drives.
jeffq 23-Oct-09 7:57pm

One systemic factor that reduces reliability in large drives is that the bit-error rates tend to be decreasing more slowly (if at all) than the capacity in bytes increases.

I have a 1994 Seagate Hawk 1.2 GB drive with a non-recoverable BER of 1 in 10^14 (according to page 17 of its product manual). A service period reading the equivalent of the entire disk 100 times reads about 10^12 bits -- 2 orders of magnitude below this threshold. The odds of even a single problem occurring during this period are rather small.

By contrast, a brand-new ST31000528AS 1TB drive (3 orders of magnitude more bits) has the same NBER, 1 in 10^14, according to this blogged review. Reading the entire disk only 10 times actually reaches this threshold, so the odds of getting at least one non-recoverable error are probably quite near 100% in such a service period. (I don't have the formula handy, but if this sounds odd, note that probability here is a combinatorial problem, not simple division.)
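For readers who want the missing formula, here is a rough sketch of the calculation. It assumes every bit read fails independently with probability equal to the quoted rate (1 in 10^14), and that capacities are decimal (1 GB = 10^9 bytes); real drives tend to fail in correlated bursts, so treat the numbers as illustrative only. The function name `p_at_least_one_error` is made up for this example. Under those assumptions the probability of at least one non-recoverable error is 1 - (1 - BER)^bits.

```python
import math

def p_at_least_one_error(bits_read: float, ber: float) -> float:
    """Chance of one or more non-recoverable errors in `bits_read` bits.

    Assumes independent bit errors (a simplification, not a drive model).
    Computes 1 - (1 - ber)**bits_read via log1p/expm1 to avoid rounding
    problems when ber is tiny.
    """
    return -math.expm1(bits_read * math.log1p(-ber))

BER = 1e-14  # non-recoverable bit-error rate quoted for both drives

# Assumed decimal capacities: 1.2 GB read 100 times, 1 TB read 10 times.
hawk_bits = 1.2e9 * 8 * 100
tb_bits = 1e12 * 8 * 10

p_hawk = p_at_least_one_error(hawk_bits, BER)
p_tb = p_at_least_one_error(tb_bits, BER)

print(f"1994 Hawk, 100 full reads: {p_hawk:.2%}")
print(f"1TB drive, 10 full reads:  {p_tb:.2%}")
```

With these inputs the older drive comes out at a bit under 1 percent and the 1TB drive at roughly 55 percent. That is well short of certainty, but it is still a jump of more than fifty-fold for the same quoted error rate.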

If we were still using these new disks for ancient applications, this might not be a problem. But with video collections, monster databases, and bloated, do-all operating systems, one can expect to use newer drives enough to practically guarantee errors within an ordinary service period. Non-recoverable bit-error rates are no longer comfortably infinitesimal relative to disk capacity.

vonskippy 23-Oct-09 8:32pm
We deployed 80 new workstations in August, all using 3.5" 320G Seagate drives. Not a lemon in the bunch so far (knock on wood).
chdyoung 24-Oct-09 6:59am
Due to seemingly high failure rates for Seagate drives, I have started using Western Digital for some RAID replacements even though some of the Seagate drives have a 3 year warranty. My biggest troubles with drive failures are: 1. OS drives -- it is a pain to reinstall or restore the OS. 2. External drives -- I can only recommend that people not use these, but they do anyway, and then they want their data restored when it fails. Several times I have taken the hard drive out of the enclosure, usually still cannot restore the data, and then there is no warranty because the warranty is only for the complete unit -- not the drive inside. I am learning to tell people that their data is gone without trying any heroic measures that probably will not work anyway.
injury 25-Oct-09 6:12pm
Interesting timing on this article. Just a month ago I quit using Seagate after 3 different orders in a row had drives failing anywhere from initial install to within 2 weeks, on top of other random issues with Seagate drives in the past year. I decided to go to Western Digital exclusively (previously I had gone with the better deal of Seagate or WD at the time of purchase). Since the swap I've built 3 more PCs and one of those Western Digital drives failed within 3 days. Interestingly enough, the WD drive that recently failed did so in a way almost identical to how a couple of the Seagates failed. I suspect it's either shipping/handling issues or, given some of the failure types, more likely bad components that the drives may share.
blankreg 26-Oct-09 3:12am
1 reply
Just a note... most of my hard drive failures in multiple disk installations have been caused by inadequate power. In other words, the power supply module in the computer needed to be upgraded. Some drives require more power than others, and a marginal power supply seems to make the hard drive that requires the most power look as if it has failed. Testing the same hard drive by itself in a test bed computer showed that the hard drive had no problems at all. Putting that same hard drive back in the original system with a higher-wattage power supply solved the failure problem. Just saying..... it may not be a hard drive issue at all.
martyhaas 27-Oct-09 2:32pm
Is there an easy way to know how much power my system really needs? I have an off-the-shelf Compaq to which I added a second hard drive and a video card. I assume that these low-budget computers have inadequate power supplies, but I wouldn't know how to assess what everything in it requires.
talmy 26-Oct-09 5:55am
I represent a small sample (9 computers in service now plus about an equal number of external drives used for archival purposes). I've had exactly one drive failure in the past 10 years which was on a desktop system that was about 2.5 years old at the time. I've had far more optical drive failures. My feeling is that computers have never been more reliable.
EdK 27-Oct-09 10:36am
I just had a Seagate ST3500641AS-RK fail with less than 2 years since installation in a Dell 5100 chassis. Relatively ideal conditions, the chassis never would need to kick the cooling fan to high - it always was cool and the machine rarely ran overnight. I probably had less than 500 hours of operation on that drive. What was interesting was the failure mode. I started having errors and let Spinrite 6 run on it for over 24 hours which got half way through the drive. (I wish Spinrite had a "give up after" option to configure the number of retries.) I pulled the drive, ran it on an external USB cable with an external power supply and fan while I pulled data off of the drive. I had about 320GB of the 500GB capacity used, but much was duplicated on other backups. I did pull off what I needed and watched the drive become less and less accessible until it was unable to be used. (This occurred as I was ready to clean the drive prior to shipment to Seagate for replacement.) Its spot in the Dell chassis was taken by a new Western Digital 500GB drive which is one of the "Green power" drives and is much quieter than the Seagate. The replacement drive has been put into an Ultra external USB/eSATA case for use in backup. (2nd or 3rd level, I won't trust it as a primary backup.)
MAS 27-Oct-09 10:44am
My $0.02 on HDD:

After a total set of disasters with many Maxtor products, especially Maxtor HDDs, they were "banned from the building" at my company. By extension (after it absorbed Maxtor), and after a few bad experiences with Seagate, Seagate was similarly exiled.

Since then, we have migrated to mainly Fujitsu, Hitachi, and Toshiba. We have since experienced such a drop in HDD failures we don't know what to say.

We have had mixed results with some computers that have WD HDDs installed, but not enough to suspect that the problems are with the HDD alone.

The HDD business is tough, and I don't know how long the various companies can keep shooting holes in the bottom of the HDD boat before something gives. HDD technology progresses at a rate comparable to (and sometimes exceeding) that predicted by Moore's law, but HDDs are mechanical in nature.

I would not be surprised if HDDs are eventually replaced with solid-state devices, like SSDs. In fact, many of the new laptops have them, and many people are replacing the factory HDDs with SSDs for speed and G-force resistance.

egilbert3 27-Oct-09 10:48am
The newer generation of larger drives seems to generate more heat, and heat can cause them to fail. I make sure that a case fan is blowing air over my disk drives. Many cases have an easy way to add a fan in the front and hook it up to your power supply, sometimes by plugging into the power before it gets to a disk drive. I use a 3-speed fan set to low so the noise is barely increased. Only a little air movement seems to be needed, but some air movement is definitely necessary. I don't have before-and-after numbers for disk failure, but it's clearly made a big difference. With fans, it's gone from a lot of failures to almost none.
wayne 27-Oct-09 10:54am
I have had 3 WD drives die this year at different customer locations. One drive is so bad that while it can be read, it has so many errors the WD drive test will not even run. I prefer Hitachi or Samsung. And as for China, the drives that failed came from Singapore.
prescod 27-Oct-09 11:02am
There's certainly a public perception that inexpensive external hard drives are unreliable. Everyone I know who has one has had it fail. Everyone. Then my turn finally came to replace a bad drive. I tried to research a suitable replacement on BestBuy.com and Amazon.com. If you go there and search on major hard drive makers (Seagate, Western Digital, Maxtor), you'll wind up reading hundreds and hundreds of sour user comments and complaints. I had to give up after a while. I think the cheap commoditized consumer-oriented drives may have issues (cheap parts, cheap labor). They're so cheap, maybe the manufacturers consider them disposable now (replace every year or two). Anyway, to be safe, I bought 2 cheap Seagates and use one as a backup for the other.
Trencher93 27-Oct-09 12:02pm
I think everything is less reliable in general. Over the past few years, when I build a computer, I almost always have to RMA some DOA part. I've gotten tired of paying shipping twice to get an item and then return it. Especially bigger things like motherboards. A company has enough money to order redundant parts and have spares, and the shipping isn't as costly for them, but as an individual software developer I've started dreading ordering any computer parts. I always wonder what is going to be DOA.
RayBay 27-Oct-09 12:15pm
We have computer repair centers in four states. We have kept records on all hard drives since 1991, and have a database of 19 years by brand and model for those sold or repaired in our shops... 14,941 desktop and laptop hard drives. Nearly all brands have increased reliability as the years go by... but certain brands have been disasters: Conner, Tri-Gem, Maxtor, Hitachi, DiamondMax and several others... most are out of business due to their failures. Others such as Seagate, Toshiba, Western Digital, Fujitsu, and Samsung, continue to become better and more reliable. Most tech shops can tell you which ones are great, and those that are lousy. Sales of replacement hard drives are a verification of reliability. We buy the ones we know are good. Technicians would not put up with junk drives and early failures. That would cost us all money, and make owners hate their computers even more than they already do...

©1994-2010 Infoworld, Inc.