Experts: No cure in sight for unpredictable hard drive loss
arstechnica.com — Google researchers released a fascinating paper called "Failure Trends in a Large Disk Drive Population" that examined hard drive failure rates in Google's infrastructure. Two conclusions stood out: self-monitoring data isn't useful for predicting individual drive failures, and temperature and activity levels don't correlate well with drive failure
- 707 diggs
- bitterg, on 10/12/2007, -1/+19The day is coming when RAID or something very RAID-like will be common in PCs, for this very reason. With data protection (mirroring or parity, it doesn't matter), hard drives will become true throwaway commodities.
- Software2, on 10/12/2007, -1/+43Do you know how long people have been saying that?
Let me put it this way. Ask your average Joe if he wants 100 GB or 200 GB for the same price. Do you think he cares about these "raids" or whatever? No, he just wants the most megabytes on his processor! - rmxz, on 10/12/2007, -5/+16RAID is not the answer, for a number of reasons:
* Raid doesn't protect against raid controllers dying.
* Raid doesn't protect against some types of disk errors (where a write claims to have completed, but writes an incorrect value which goes undetected/uncorrected)
* Raid doesn't protect against user error.
* Raid doesn't protect against incorrect data written due to problems like bad memories; bad cables; bad DMA controllers etc
While RAID is nice for the most common kinds of disk failures; after a couple unrecoverable errors I've put a lot more faith in incremental backups done daily to a computer in a different room/office/facility than RAID in one box. And it's sad to see that the most popular raid (RAID5) is uniquely poor for recovery: http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt
At home, having automated nightly backups to an old computer in my garage (google for the articles on using rsync to do incremental backups) I can recover from most any failure - including user error; and the total loss of my desktop machine. Having weekly rsyncs of that to an encrypted external hard drive in my car, I can even recover from the whole house going down to a flood/fire/etc.
http://www.mikerubel.org/computers/rsync_snapshots/
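The incremental-backup idea described above can be sketched in a few lines of Python. This is not the rsync method from the linked article itself, only a minimal illustration of the same principle: each snapshot is a full directory tree, but files that have not changed since the previous snapshot are hard-linked rather than copied, so they take no extra space. The function name `make_snapshot` and the date-style snapshot names are assumptions made for this example, and it assumes a filesystem that supports hard links.

```python
import filecmp
import os
import shutil
from pathlib import Path


def make_snapshot(source: Path, snapshot_root: Path, name: str) -> Path:
    """Create snapshot_root/name, hard-linking files unchanged since the latest snapshot.

    Assumes `name` is new and that snapshot names sort chronologically
    (for example ISO dates such as 2007-10-12).
    """
    snapshot_root.mkdir(parents=True, exist_ok=True)
    previous = sorted(p for p in snapshot_root.iterdir() if p.is_dir())
    last = previous[-1] if previous else None
    target = snapshot_root / name
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = Path(dirpath).relative_to(source)
        (target / rel).mkdir(parents=True, exist_ok=True)
        for filename in filenames:
            src = Path(dirpath) / filename
            dst = target / rel / filename
            old = last / rel / filename if last else None
            if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
                # Unchanged: share the data blocks with the older snapshot.
                os.link(old, dst)
            else:
                # New or changed: store a fresh copy.
                shutil.copy2(src, dst)
    return target
```

Run nightly against a directory on another machine or drive, this keeps every day's state browsable, which is what lets it recover from user error as well as disk failure.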
At work we do similar; with full redundancy for everything except databases; and replication/hot-standby databases. The only RAID most of these machines is using is RAID0 for performance. Data that is especially important gets mirrored/replicated to more than 2 servers at more than 1 location. - fkr2, on 10/12/2007, -1/+11Average Joe might care and even understand when Dell offers it, as they do right now in domestic pc's, with words like
250gb (security) 2 x 250gb hard drives
500gb (speed) 2 x 250gb hard drives - jackhole, on 10/12/2007, -2/+15RAID is not a backup method. The first letter in RAID stands for Redundant, not Resilient. RAID ensures sustained availability of data resources in event of individual disk failure. It is a poor choice for protecting data because every disk in the array is still susceptible to data corruption at the array controller, and hardware failure due to power spikes and environmental factors such as vibration or moisture. If you want to protect your data you must back it up to unpluggable and firesafe-able external data storage: tape drives, external hard drives, DVD/CD-Rs, etc. Don't use a square peg for a round hole.
EDIT: rmxz kind of beat me to it. - computergod, on 10/12/2007, -0/+5It depends if average Joe has lost a drive before. That has a way of making people care about the integrity of their data. Naturally this means that people will end up buying a RAID 1 system, losing a hard drive, then the other, then complaining that the RAID didn't work.
Link to /. article on this paper:
http://hardware.slashdot.org/article.pl?sid=07/02/18/0420247 - yournamehere, on 10/12/2007, -16/+3Software: " the most megabytes on his processor"
what are you talking about processors for? - fkr2, on 10/12/2007, -4/+1@ jackhole
By it's very definition redundancy is a form of backup.
"If you want to protect your data you must back it up to unpluggable and firesafe-able external data storage: tape drives, external hard drives, DVD/CD-Rs, etc. Don't use a square peg for a round hole."
We were talking about Average Joe - that dick who probably still backs up his important stuff on 3 1/4" disks. Expecting mumsie or pops to buy a tape drive / usb or firewire drive / image their ***** onto dvds or cds is only going to increase your workload - buying, setting it up and then doing it for them every other week/month/whatever. - jackhole, on 10/12/2007, -0/+6@ fkr2
Fine. "By its very definition redundancy is a form of backup." It is also the worst possible form of backup, for the reasons I and rmxz outlined.
A hard drive + external caddy is no more expensive than a hard drive + dedicated raid chip. Additionally, because it is portable, it is also more useful on the consumer level than a redundant internal disk, not to mention it is infinitely easier to set up. The best backup plan for "Average Joe" is to get a usb pen drive and/or external hard disk and copy "My Documents" to it every week. - fkr2, on 10/12/2007, -1/+1I know what you're saying and I even agree with you. I just couldn't see my friends or family doing it. I could see them checking the box that says "security" and offers mirroring, the lamest of backup methods, but one they don't need to comprehend or do anything to have.
- raindogmx, on 10/12/2007, -1/+1The point is that they don't become very "throwaway" like floppies did. Floppy readers and disc quality dropped enormously. Sure they became cheaper than air but the last ones I used worked only for writing and when you wanted to read from them they were useless.
- FryedGuy, on 10/12/2007, -3/+0@yournamehere
It was a joke that went clearly above your head. Oops. - ungamedplayer, on 10/12/2007, -0/+8It's like I've always said.
Drives exist in three states,
About to die.
On fire.
Dead.
Consider this when you think "should I consider backups for this computer... ?" - dubbleenerd, on 10/12/2007, -0/+2as long as we have marketing folk who advertise storage media purely in terms of the number of copies of the Library of Congress that can be stored on it, we will always have people who would pick capacity over reliability.
- colonelpanic414, on 10/12/2007, -2/+0I swear when I saw the title for this in my RSS reader, it got cut off in such a way that I thought it said "Experts: No cure in sight for unpredictable hard-ons"
/I'll take a window seat. - earl507, on 10/12/2007, -0/+3@jackhole
I have a raid 5 setup at home and you know how much extra it cost me? 1 extra disk. That's it. No raid controller, no fancy raid motherboard. Everything runs off the onboard sata. It also checksums every bit of data on the disks and performs a bi-monthly sweep of the data to validate that all bits match the checksums and that the parity disk matches everything. It will fix any errors that it can on the fly with the checksums and parity disks. If major errors start to appear, it will report it to me and I can swap out a disk with a simple reboot or if I hook up a hot spare, the system can start using the spare without a reboot. I also have snapshots to protect against erased files (although if you erase a file before a snapshot picks it up, you are sol). It is also a cached COW raid which makes for very impressive performance.
Point is this, much of what is being bitched about on this thread regarding raid is true for some raid setups, but certainly not all. Raid does not have to be expensive or require expensive controllers.
I will agree with most of the posters here that say that raid is not the perfect solution - it isn't. You need offsite backup for important files; however, a cheap raid setup can protect your data from 90 percent of failures which are usually simple drive failures. Things in the other 10 percent (fire, lightning, flood, burglary, etc) require something offsite. - chijim70, on 10/12/2007, -3/+1RAID as a backup system??? I use raid for read/write speed not backup. As for wanting big drives... I attempt not to have any drive any larger than the space needed to have an OS/Applications/current projects on it. The reason is that drives are becoming less reliable than ever before and DVD's are cheap... drives are cheap too so clients I have who work on video and need massive storage without burn times interrupting work flow find that just backing and storing to firewire drives and even sending them to clients etc to be a best of time/cost/benefit solution. This being a storage solution that means the drive will actually be used to an extremely minimal extent so it doesn't really need to last through heavy usage.
I think from what I read in some comments here is that many people do not realize that RAID is a generalized term covering many ways and reasons for 2 disks to have mirrors of each other. If using a controller card for 2 drives (I'd say this is true raid) then you are doing it for read write times due to heavy traffic to that array which is usually networked clients working off a server. Mirroring data to another drive for backup means you are not actually using the backup copy and technically may be called RAID but is not to my understanding what the original purpose of RAID is AND increases write times without increasing read times if you are not using a RAID card. Having a card increases read and write times and generally means if one drives data is hosed (tech term there) so is the other... - chijim70, on 10/12/2007, -0/+3Arg, didn't hit edit quick enough. For the last part there. RAID via a card does not increase write times and decreases read times.
- ray901, on 10/12/2007, -3/+10FTA
"Two conclusions stood out: self-monitoring data isn't useful for predicting individual drive failures, and temperature and activity levels don't correlate well with drive failure"
In other words the drives do not fail due to normal testable working conditions, they fail because of design and/or materials failure. i.e. we are still bad at making hard drives (except for capacity - that is where all the r&d is). So have two or more instead.
The same approach was used for dual core chips - we were not good enough to come up with a truly outstanding OS that could perform multitasking properly - so the solution is to provide two cores, the multitasking efficiency of the OS is then irrelevant.
/not saying hardware failover is a bad thing but we should know why it is being touted and sold to us - an0nymous, on 10/12/2007, -0/+5all of which were consumer-grade serial and parallel ATA units with.."
Consumer rather than enterprise grade? @ google? In their production environment?
Is this a strategy, "less reliable but cheaper" vs "bulletproof but expensive?" - darkstar949, on 10/12/2007, -0/+7@an0nymous - Google uses consumer grade parts due to the fact that they are cheaper and just uses a massively redundant system to ensure the reliability of the system as a whole. It has been mentioned a few times before, but Google doesn't bother trouble shooting device failures - they just replace the parts with another box that is already sitting idle waiting to take the load.
- mitrovarr, on 10/12/2007, -0/+4Actually, I'd say the reason that processor companies are moving to multiple cores is because it's much easier to make two cores than it is to make a single core processor that's twice as fast. Dual cores actually rely on intelligent software design to make them work correctly (applications must be properly multithreaded.) Poor programming can reduce their benefit.
- ipodsweatshop, on 10/12/2007, -9/+0"bulletproof" what a douche. Why don't you go buy all enterprise SCSI drive you idiot, or maybe you could read that Google found that they are no more reliable than regular SATA drives. You really have no concept of scale or stuff on the scale that Google is using, so shut up and go back to doing something 1337 like burning CDs.
- an0nymous, on 10/12/2007, -3/+9@ipod
Let me begin by saying you are a tremendous dick. A huge phallus.
Now then:
**
""bulletproof" what a douche."
What's the problem with using the term bulletproof? I meant of greater reliability. Did you not understand?
**
**
Why don't you go buy all enterprise SCSI drive you idiot, or maybe you could read that Google found that they are no more reliable than regular SATA drives.
**
On my home computer I actually run a Seagate scsi3 raid5 array (80 to 68 pin adaptors) attached to an Adaptec 19160. Silly to use it for a desktop system, but my OS is on a Raptor 75gb. The reason I run them is that they provide redundancy for my data and that they were free (outmoded kit from my job. At a large data center, that is.)
I hadn't read that Google found no bonus for consumer vs enterprise grade storage. Interesting. Is there a citation?
"You really have no concept of scale or stuff on the scale that Google is using, so shut up and go back to doing something 1337 like burning CDs."
Maybe not. I have been to the EMC Research Triangle Park facility and I found that quite impressive. They run SCSI 320 connected to SCA2 in their test lab, for those that are interested. But I am asking questions to learn. I cede the mantle of 1337ness to you, though "ipod".
In summary:
Go ***** Yourself, you ***** child.
Blocked. - bradleyland, on 10/12/2007, -0/+3"Is this a strategy, 'less reliable but cheaper' vs 'bulletproof but expensive?'"
It appears that this may be their strategy, almost to the tee. Following is a SPECULATIVE article (capitalized as not to be confused for fact) about Google's storage strategy.
http://blog.topix.net/archives/000016.html
The important bit:
"Google has 100,000 servers. [nytimes] If a server/disk dies, they leave it dead in the rack, to be reclaimed/replaced later. Hardware failures need to be instantly routed around by software.
Google has built their own distributed, fault-tolerant, petabyte filesystem, the Google Filesystem. This is ideal for the job. Say GFS replicates user email in three places; if a disk or a server dies, GFS can automatically make a new copy from one of the remaining two. Compress the email for a 3:1 storage win, then store user's email in three locations, and their raw storage need is approximately equivalent to the user's mail size."
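The arithmetic in that last quoted sentence is easy to check. A minimal sketch follows; the function name and the 900 MB mailbox are made up for illustration, and only the 3:1 compression and three replicas come from the quoted article.

```python
def raw_storage_needed(logical_size_mb: float, compression_ratio: float, replicas: int) -> float:
    """Raw disk space needed to hold `replicas` compressed copies of the data."""
    return logical_size_mb / compression_ratio * replicas


# 3:1 compression and three copies cancel out: raw need equals the mailbox size.
mailbox_mb = 900
print(raw_storage_needed(mailbox_mb, compression_ratio=3, replicas=3))
```

In other words, under those assumptions the triple replication costs nothing extra in raw disk compared with storing one uncompressed copy.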
- dougsnell, on 10/12/2007, -0/+8Until customers make something other than "How big?" their first question, that's all that matters.
What's the non-sense about hard drive manufacturers not wanting to make better hard drives because they need to sell hard drives? They're too focused on "How big" to worry about anything else because every few years the home user is out of room and gets a new computer. Once manufacturers can out-pace the home user's rate of consumption of space, they'll move on to focus on another selling point (like reliability ... or, more likely, cost).- ScottyMo, on 10/12/2007, -1/+6The trends are already in place. The next selling point is not reliability or cost, but size (not capacity, but actual dimensions). Unfortunately, they're moving toward getting larger capacity in smaller spaces (look at iPod/iPod nano, cell phones, other portable devices) and while cost may come down as a side-effect of the size R&D money, unless something big happens, quality is still coming up short in the priority scale.
- rslc, on 10/12/2007, -1/+3Good that Google brought out the quality issue.
Just like how PSUs were so inefficient until Google brought out that issue (of course global warming plays a part too)
- baalzebub, on 10/12/2007, -6/+3i keep two small hard drives one 40 gig and one 80 gig, i rather keep two smaller drives than one BIG drive...
- willjc, on 10/12/2007, -1/+6there's no real reason to have a single drive that holds 1 terabyte+ if the standard is still the equivalent of 7200RPM and modest read/write speeds, and will likely fail in a few years.
- austindkelly, on 10/12/2007, -2/+9@baalzebub: yeah but where do you keep your pr0n?
- jackhole, on 10/12/2007, -0/+6@ willjc
You are assuming that just because the rotational speed of the platter is constant performance does not increase. Most of the R&D required for these larger hard drives goes towards making platter sizes larger, via making data "size" smaller. You can see this in the new perpendicular storage methods now coming to market. As data density goes up, so does performance, because (1) there are less platters for the drives to juggle and (2) the head needs to cover a smaller space to read/write data. Space and performance are correlated in hard drives. Quality, as mentioned, is not.
- TortfeasorG, on 10/12/2007, -5/+13Experts: Tautological Headlines Validate Themselves
- TortfeasorG, on 10/12/2007, -7/+2http://en.wikipedia.org/wiki/Tautology_%28rhetoric%29
- superkendall, on 10/12/2007, -2/+4RAID mirrored with a third unit to swap out is simply the best way to go - even if both drives in the mirrored RAID array die at once, the swapped out one is probably going to be good long enough to at least make a copy. And you can keep the third drive offsite.
Five disk RAID arrays may be cool but as this study shows they really are more risky, especially for the home user.- martin308, on 10/30/2007, -1/+3Where does it say that RAID arrays are more risky? I would have thought a mirrored or level 5 array would be safer. Thats what I'm moving to
- rmxz, on 10/12/2007, -0/+2@oni308 : "Where does it say that RAID arrays are more risky? I would have thought a mirrored or level 5 array would be safer. Thats what I'm moving to"
Here's a good article - short summary - for some forms of disk failure RAID5 silently returns garbage.
http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt
"The problem is that despite the improved reliability of modern drives and the improved error correction
codes on most drives... it is more than a little possible that a
drive will become flaky and begin to return garbage. This is known as
partial media failure. ...When a drive returns garbage, since RAID5 does not EVER check parity on
read (RAID3 & RAID4 do BTW and both perform better for databases than
RAID5 to boot) when you write the garbage sector back garbage parity will
be calculated and your RAID5 integrity is lost! " - superkendall, on 10/12/2007, -0/+2Mirrored RAID is fine (just remember it is not a backup, thus the third drive which being out of the array is a true backup).
RAID 5 (or other variants of parity checking) are where the danger lies due to one drive failing, and then another drive failing during a rebuild. The google paper I think talked a bit about that, at least I remember reading that part in there before.
I currently use OS X software RAID, but hardware RAID is really better if you can afford either a card or enclosure that supports it.
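The parity problem quoted from the BAARF article above can be shown with a toy stripe. This is only a sketch of XOR parity on three made-up data blocks, not how any particular RAID controller or software RAID is implemented; the helper names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity):
    """True if the stored parity matches the data (the check skipped on a plain read)."""
    return xor_blocks(data_blocks) == parity


def rebuild_missing(surviving_blocks, parity):
    """Reconstruct the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Normal case: the second drive dies and is rebuilt correctly.
assert rebuild_missing([data[0], data[2]], parity) == data[1]

# Partial media failure: the first drive silently returns garbage.
garbage = b"AAXA"
# Only an explicit parity check notices the problem.
assert not stripe_is_consistent([garbage, data[1], data[2]], parity)
# If the second drive now fails, the rebuild quietly produces wrong data.
assert rebuild_missing([garbage, data[2]], parity) != data[1]
```

The last assertion is the scenario the thread worries about: a second fault during a rebuild turns one bad sector into corrupted data on the replacement drive.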
- trollick, on 10/12/2007, -7/+0OH NO! I never had any major problems with my HDs, but I guess now I'll start having them because google said so.
- spliznork, on 10/12/2007, -1/+5"and temperature and activity levels don't correlate well with drive failure"
Not exactly. The Google report says temperature *does* correlate with drive failure, just that the correlation is inverse with expectation -- drives that were not actively cooled and thus ran hotter failed less than drives that were cooled.- rslc, on 10/12/2007, -6/+1I seriously suspect the thoroughness of Google harddrive report.
I have seen HDDs used in Security DVR applications FAILING A LOT more often than those used in PCs.
Even the performance of such high-activity drives drops.
In such Security DVR applications, the drive usage is heavier than in Google's servers.
Moreover, I have also encountered 7200rpm drives FAILING MORE than 5400rpm drives.
It's a no-brainer. Is it easier to crash a car at 100mph or 60 mph??
Just ask some harddrive engineers or scientists.
Now, u can't even find a 5400rpm drive, which is really more reliable. - Software2, on 10/12/2007, -0/+6@rslc
Wait... did you just say that a company that stands to save/lose millions of dollars on hard drives just threw together some numbers? - mitrovarr, on 10/12/2007, -0/+2Wow, that's really good to know. Maybe I can take the obnoxiously loud aftermarket hard drive cooler off my drive.
- superkendall, on 10/12/2007, -0/+2The difference in RPM's would really have little to do with the likelihood of a crash - it's not the head that's moving at 5400 RPM, it's the platter. The platter is not what causes the crash, movements of the head into the platter cause the crash. At any RPM above 30 that's going to cause some damage and mess up the head.
It's not faster drives that are screwing you over. It's drives being made with cheaper components and worse QA. Even if you could buy a 5400 drive today for your application it would still probably fail as often. - computergod, on 10/12/2007, -1/+1Most security DVR devices use hard drives specifically modified for that and many servers use drives modified for that. Look at the Seagate AV and NS class of ATA drives.
However there has been doubt cast on these drives in the survey itself, so my point may be moot.
- Mootabolife, on 10/12/2007, -2/+2Who needs a cure when the businesses take in so much money?
- jman8888, on 10/12/2007, -1/+3Theres always a cure
- toxonix, on 10/12/2007, -0/+4Smaller, denser, faster, cheaper, lower power NVRAM. If I could get a 80GB NVRAM chip to replace my hard drive with, I'd jump on it.
- Namco, on 10/12/2007, -8/+1"If I could get a 80GB NVRAM chip to replace my hard drive with, I'd jump on it."
If I could get a 80GB NVRAM chip with which I could replace my hard drive, I'd jump on it.
...I am so deeply sorry. Not sure why I had to correct that, but I had to.
- benitojuarez, on 10/12/2007, -1/+4Best way to avoid unpredictable hard drive loss is to not buy that dirt cheap hard drive (most likely a maxtor) that's 70 dollars or so less than all comparable models. It's cheap for a reason.
- jackhole, on 10/12/2007, -1/+1I need to find out where you're buying your hard drives from.
- Namco, on 10/12/2007, -0/+4I haven't had a Maxtor fail in..... well, I've never had one fail. Western Digital and Toshiba on the other hand....
Maxtors ran hot and were noisy back in the day, but they're pretty damn reliable now. - benitojuarez, on 10/12/2007, -2/+2we use maxtors at work, we have at least 1 fail a week. And every time in the last 7 years when someone on irc says "my hard drive crashed and i lost everything" every. single. time. its a maxtor.
- tarball, on 10/12/2007, -2/+2As progress is made increasing capacity/speed of flash memory/solid state storage, spinning magnetic disks will become a thing of the past.
- Joe_rigby, on 10/12/2007, -0/+3Spinning magnetic discs and light bulbs will give way to solid state and led diodes.
- Software2, on 10/12/2007, -0/+3I could pull up countless stories since the earliest times of computing when they said the exact same thing. Yet here we are with 1TB hard drives.
- skyfire1, on 10/12/2007, -0/+5I love google. If google was a girl she would be the hottest chick ever. Now in the odd chance that the ESA comes knocking on my door, I'll have something that will explain the hard drive loss.
- Software2, on 10/12/2007, -1/+2Hmm... a girl that gives you free porn, but only tells you how to make a sandwich, and doesn't make it for you.
- rip747, on 10/12/2007, -2/+3I know this going to sound dumb and like spam but so be it.
this is why I use Carbonite.com. For $5 bucks a month, they back up the important stuff on a daily basis. Personally I don't have the space or the time to be switching tapes and checking backup logs to make sure the backup went through. Granted RAID and tapes are great, but for the money you can't beat adding that extra layer of protection.
There I said, digg me down if you wish.- ThatsNotPudding, on 10/12/2007, -0/+6So what kind of hard drives do they use?
- lpmiller, on 10/12/2007, -1/+6Who is this Average Joe, and why don't we just punch him in the crotch repeatedly?
- Jugalator, on 10/12/2007, -1/+1Not sure, but he sure is less smart than my rather unskilled parents, and it seems we always have to adapt software for him.
- spaztacular, on 10/12/2007, -4/+1Dupe
http://digg.com/hardware/Google_study_on_failure_trends_in_hard_disks - TonyCubed, on 10/12/2007, -0/+5Would be nice to see what manufacturer failed the most.
- totorototoro, on 10/12/2007, -1/+4Wonder how much the Hard Drive manufacturers paid Google not to reveal which drive brand had the highest failure rate.
- mahmoodsdotjpg, on 10/12/2007, -0/+2google doesn't need their cash. there's gotta be some good reason for it though. possibly so X manufacturer can't say "endorsed by google" or somesuch.
- Namco, on 10/12/2007, -0/+3So my poorly ventilated Tivo with the 2 120GB Deathstars (still going strong after 3 years), are actually going to live longer due to heat buildup? Awesome!
- DupeAHolic, on 10/12/2007, -1/+1All drives fail because they have moving parts. It doesn't matter what make or model you have, they all fail. The only solution is to backup everything you can't afford to lose. There are several popular methods available to back up your data:
1.) Media backups (CD, DVD, Tape)
2.) External USB hard drive /w software
3.) RAID (1, 5, etc.)
4.) Offsite storage
Method 1: Media backup
Pros:
- Dirt cheap
- You probably have all the hardware you already need
- No monthly recurring charges
Cons:
- Manual process of using discs makes daily backup tedious
- Cost of media adds up
- Media will fail over time as well
- Backup copy is located on site
Method 2: External USB Hard drive
Pros:
- Cost per GB is good
- No monthly recurring charges
- Fast for large backups
- Usually comes with free backup software for automation
Cons:
- Uses a hard drive that will eventually fail
- Backup copy is located on site
- Higher initial cost than media backup
Method 3: RAID
Pros:
- Completely automatic
- All backups are done in real time
- Allows user to continue to work normally after a drive has failed
- Little or no downtime to replace and rebuild a drive
- No monthly recurring charges
Cons:
- Does not retain copies of old versions of files
- Requires at least 2-3 drives; higher cost
- Slows down read/write performance
- Uses a hard drive(s) that will eventually fail
- Backup copy is located on site
Method 4: Offsite backup
Pros:
- Completely automated
- No hardware costs
- Allows for old versions (snapshots) of files to be used (dependent on software used)
- Offsite storage protects against local disasters
Cons:
- Recurring monthly fees
- Backups can be slow for large data
- Limited space - mwsherman, on 10/12/2007, -1/+6My Firefox live bookmark had the Digg headline as, "Experts: No cure in sight for unpredictable hard..." .
- hunter186, on 10/12/2007, -2/+1This has been one of the most informative digg threads that I've ever read
- mitrovarr, on 10/12/2007, -1/+1In all seriousness, I'd say RAID 1 has completely wrapped up the problem of unpredictable hard drive loss - drives are dirt cheap now and you can get two drives with plenty of capacity for just over a hundred dollars. Sure, RAID 1 is not a complete replacement for backups, but it does completely insulate the user against having a single drive fail. The only thing the user has to deal with is the cost of replacing the drive, and warranties are getting pretty long on the more reputable drive companies.
- sagemane, on 10/12/2007, -0/+3There should be no debate about whether backups are better than RAID: backups and RAID are for completely different purposes. Backups are for data protection, RAID (except 0) is for data availability. In most cases you can rebuild your RAID array and you won't need to use your backup, but that is just a nice side benefit of RAID; you still need something to restore from when your users delete files or something goes wrong that RAID can't fix.
EVERY system that contains non-disposable data needs to be backed up (in two locations); systems that contain non-disposable data AND need to have high availability need to be backed up AND RAIDed. And maybe have redundant RAID controllers/NICs/PSUs/etc., diverse network connections, and a mirror system in a different location. And any availability you add to the live system needs to be in the backup too; what good is having an array that survives a disaster if the only remaining copy of your data is on one set of tapes that will take a month to restore? Which is why this isn't to say that you shouldn't RAID most everything: you should because it will let you recover from most issues quickly and with less service interruption, just don't make it the last line of defense.
ZFS has good solutions to many of these problems because you can quickly make arrays with variable levels of parity, easily mirror them, and take snapshots as often as you want, giving you protection (mirrored snapshots) and availability (parity and mirroring) at the filesystem level, and these things are available in many other solutions as well.
What trends like that mean is that there won't be a big push toward more reliable drives (not that drives won't get more reliable as a function of improved technology, but that we won't focus on reliability at the expense of large amounts of price and capacity). Especially in Google's case their infrastructure philosophy is based around lots of cheap, redundant parts, not fewer, more expensive and slightly more reliable parts. If companies like Google weren't looking at innovative ways to get the most bang for the buck out of these affordable drives and decided to keep using top-of-the-line enterprise drives for everything because they're more reliable, we'd likely still be using 100MB email accounts. - frozen1, on 10/12/2007, -1/+2Flawed study... Modern drives without fans get very hot with extended use. In a data center this is probably not so much of an issue, but for personal use I would definitely peg drive failure to temperatures and 24/7 use as the bonds of chemical structure slowly warp and decay over time. Drives are MECHANICAL; failure is to be EXPECTED.
- bpwagner, on 10/12/2007, -0/+2Not sure if I am doing the correct thing, but I use raid 1 (mirror) for my main computer, I back up off site (Mozy) and use seagate 'ES' drives. I hope that is good enough.
- eecue, on 10/12/2007, -1/+1I have a cure in sight, how about solid state storage? Remove the moving parts and your storage will not fail (as much).
- ltcdata, on 10/12/2007, -0/+1i have a 120gb hd in my pc.
I have another 120gb hd stored in my house.
Once a month, i ghost the disks.
So, if power failure, read element failure, anything happens to the hd inside my pc, the stored one is ready to go.
Also, i backup the most important things on optical media.
For me, it's the best way.
The next hd i will buy is a WD 320GB. I will buy TWO. One for the data, the other for the ghost image.
Pricey? Yes.
Secure? Yes.
If you think your data is not worth the price of another hd, you will change your mind when your hd dies. - abcDataRecovery, on 07/06/2008, -0/+0The best way to ensure data is protected at all times is to have duplicate sets of data; in its simplest form, when using hard drives, you can mirror the drives.
But when it comes to business data, mirroring separate RAID sets and storing them on NAS / SAN storage is one of the few ways of getting on-site redundancy. The next step is online backup and file replication, or the virtualisation of servers could be the way forward if you have the funds.
We see numerous RAID servers with 2, 3, 4, 5 or even 8 drives, and administrators can still experience difficulties when there are power problems or a spindle seizes.
We rebuild crashed drives, recover and image the drive to a new one and rebuild the array, often restoring the server to the point of failure including the operating system. But where the crash has caused surface damage, it is more practical to just recover the data and any essential configuration files and move the recovered files to a fresh installation of the server's operating system.
Hard drives as we know them now will continue (it is predicted) from 1TB to 6TB but by then
these drives will mainly be for people wanting such amounts of storage and the new ssd drives will replace the domestic drives as demand grows.
However all the sensible data protection methods should still be adhered to and the server 2008 file replication is a big improvement on past versions especially with regard to databases and exchange server EDB files which can now be replicated live.
We replicate our recovered data for clients across 2 separate data centers to ensure that the recovered data is preserved until the client is happy they have their recovered data back.
This is purely a comment, not advertising. - abcDataRecovery, on 07/06/2008, -0/+0Be aware SSD drives are no more than high capacity memory sticks and suffer from similar problems. We see SSD drives with corrupt data and direct access issues. Do not assume that these will be any more reliable than mechanical drives; the technology relies on similar MTBF (mean time between failures) figures and is sensitive to static and electrical discharge as well.
However that is not to say that technology has reached its end, there are numerous developments which will see both the speed, storage capacity and energy efficiency of data storage improve.
20 years ago maybe 1 out of 20 drives manufactured was reliable enough to leave the factory.
Nowadays this is more like 99.999%, but only because the onboard chips are programmed at manufacture to remember any faults on the platter surface that must be avoided. This explains why such close matches of firmware are essential starting points when preparing the repair or rebuild of any hard drive or data storage media, including memory sticks.