Troubleshooting: no one who works with computers can go very long without being forced into at least a rudimentary attempt at it, yet I’ve found few books or other resources that teach the mental process. I’ve been dealing with computer problems of every type for almost two decades, ever since my first IBM XT at the ripe age of 15. Most of my computer jobs came with a large helping of troubleshooting, which was great for me; it’s something I’ve always enjoyed. I thought I would take the opportunity of my girlfriend’s hard drive crashing to walk through one real-world example, and hopefully provide some insight into this sometimes arcane process.
So the other day my girlfriend told me her PC had been locking up at odd times. It’s one she built herself a couple of years ago, and it had been running well ever since. I asked her what kind of lock-ups they were. Hardware failures typically cause things like an instant reboot or power-off, or the entire screen freezing and becoming completely unresponsive. That’s a ‘hard lock’. Often people will say their machine locked up when they really mean it was very slow to respond, or they could move the mouse but not much else. Those are typically software issues. Hers was a hard lock, though she could go hours without any problems.
Identifying the Problem
With hard locks, any piece of hardware can be the cause, even an external peripheral. There are two basic approaches to troubleshooting: ‘the usual suspects’ and deduction. The former means trying the fixes that have worked for you most often in the past, while the latter is an ordered process of ruling out possible causes one at a time until the list of suspects is short. I went with the usual suspects first and asked if she had run a scandisk (Check Disk, or chkdsk, on XP) lately. She hadn’t. This was kind of a lucky hit and saved us some time. We set Windows XP Pro to do a thorough scan of both her boot drive and her data drive on the next reboot. Sure enough, it started finding and fixing errors on the data drive, but it could never complete the scan. The PC would always lock up, usually at the same point. This made me pretty sure there was something going on with the drive, and that it probably wasn’t easily remedied.
Drive triage: can it be nursed back to health?
In my experience, hard drives fail several times more often than all other PC components put together. It’s not surprising, considering the drive is one of the few components with moving parts (fans and optical drives being the others). I’ve seen dozens of drives fail over the years, with motherboards coming in a distant second; I’ve seen three or four of those go. So the hard drive is the least reliable piece of hardware, and at the same time the most crucial. When other parts fail, you can just swap ’em out, but drives are a pain because your data goes with them. First, I wanted to see if I could repair the damage on the disk, or at least complete a scan. Drive scanning utilities can mark bad sectors on a disk so they’ll never be accessed again, so only the worst hardware failures will condemn a drive.
I tried several more times to complete a full scan of the drive, but it always locked up. So I pulled out a boot diagnostic disk called ‘Techie’s Toolkit 2’. I can’t remember where I got it; I Googled it and found the Ultimate Boot CD, which looks similar. I ran a number of the disk repair utilities, all of which found and repaired some errors before locking up the PC. This is a very rare occurrence. Typically a drive with a hardware failure is either totally unreadable or can at least be scanned without too much fuss.
After all the lock-ups, it was clear that the drive had hardware issues I wouldn’t be able to fix. Now it was just a matter of getting the data off and sending the drive back under warranty.
The mad data grab
Luckily, this was the data disk, and the disk with Windows on it was fine. The problem was that by this point Windows would lock up before it even finished loading. I later realized that the data drive was hot (I always advocate turning off PCs at night, but this machine usually wasn’t) and that the lock-ups came when Windows tried to access it. I popped out the drive and put it in another PC, and was able to boot Windows on that one just fine.
I started up my favorite data recovery software, EasyRecovery Professional. It’s got an easy-to-use interface, and can recover deleted files as well as data from formatted or corrupted drives. I set it to restore all the useful data to the other drive in the PC, but after a while the machine would lock up. Windows is designed to handle errors, an amazing number of them. Unfortunately, when hardware is acting up, it can cause data corruption and situations Windows doesn’t expect. It doesn’t know what to do when this happens, and tends to lock up rather than continue working and risk corrupting more data. This is partly why hard locks are a good clue for hardware problems.
After trying a number of other data recovery tools and getting more lock-ups, I finally reached down and felt the drive. It was quite warm, like a drive-through coffee held through one of those cardboard sleeves. I figured time to cool down would help.
The next morning
Knowing the drive was failing, I was under pressure to get the data off before it died completely. Since the lock-ups generally happened at a consistent point, I inferred that most likely one or a few big directories on the drive contained all the errors, and that a good part of the drive should be just fine. I booted up the fixin’ PC again and restarted EasyRecovery. I set it to recover the 230 gigs we needed from the drive, checking on it from time to time. After 30 gigs or so, the PC locked up again. OK, so a piece at a time. After rebooting, I realized that the file system on the drive was still intact. It looked like a normal drive in Windows, so I didn’t necessarily need an advanced file recovery utility; I could just copy files off it. If you’re surprised that I could be surprised by something so simple, I’m with you. It drives me nuts, but still happens at times.
From there it was a simple matter of setting a bunch of files to copy and watching to see if they made it. I would get some of the data transferred, then the machine would lock up. After rebooting, I would skip the directory it was on when it locked up and go after the next set of directories. As I suspected, there were only a few folders that would lock up the machine when I tried to access them. Finally I got everything else off the drive and erased it, and now it’s ready for a warranty return.
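I did all of this by hand, but the same skip-and-continue routine could be scripted. Here is a minimal Python sketch of the idea, not what I actually ran. The function names and the journal file are made up for illustration, and it assumes Python 3.8 or later. It writes down which folder it is about to copy before touching it, so that after a hard lock and a reboot it can tell which folder was in progress and skip it. It only handles folders at the top level of the drive; loose files there would need copying separately.

```python
import json
import os
import shutil
from pathlib import Path


def load_journal(journal_path):
    """Return the saved {folder name: status} map, or an empty one."""
    path = Path(journal_path)
    if path.exists():
        return json.loads(path.read_text())
    return {}


def save_journal(journal_path, journal):
    """Write the journal and force it to disk before any risky copy starts."""
    with open(journal_path, "w") as fh:
        json.dump(journal, fh, indent=2)
        fh.flush()
        os.fsync(fh.fileno())


def rescue_copy(src, dst, journal_path):
    """Copy each top-level folder of src into dst, one folder at a time.

    A folder still marked "started" from an earlier run is assumed to be
    the one that hung the machine, so it is marked "suspect" and skipped.
    """
    src, dst = Path(src), Path(dst)
    journal = load_journal(journal_path)
    for folder in sorted(p for p in src.iterdir() if p.is_dir()):
        status = journal.get(folder.name)
        if status == "started":
            # The previous run never finished this folder: treat it as bad.
            journal[folder.name] = "suspect"
            save_journal(journal_path, journal)
            continue
        if status in ("done", "suspect", "failed"):
            continue
        journal[folder.name] = "started"
        save_journal(journal_path, journal)
        try:
            shutil.copytree(folder, dst / folder.name, dirs_exist_ok=True)
            journal[folder.name] = "done"
        except OSError:
            # Read errors that Windows reports instead of hanging land here.
            journal[folder.name] = "failed"
        save_journal(journal_path, journal)
    return journal
```

Keep the journal file on the healthy drive, and run the same command again after each lock-up. Anything left as "suspect" or "failed" at the end is a candidate for a proper recovery tool, or for splitting into smaller pieces and trying again.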
Get a backup solution today
I can’t stress enough how important data backups are. Everyone I know would be upset if they lost their hard drive, but those with backups aren’t too worried. There are so many easy options for backing up your data. You could buy an external hard drive, plug it into your USB port and use the backup software it came with, or even better, a file synchronizer. Office 2007 has a feature called Groove that can do this, but my favorite is a small program called Save N Sync. It can be set to run when you shut down your computer for the night, and can synchronize all your files. You could also take advantage of the glut of online backup services that are out there. My favorite is Carbonite. You can read my complete review of Carbonite here. At only $50/year for unlimited data backed up from one PC, it’s much cheaper than every other option I’ve found, and it’s perfect for home users. I personally keep my important data synchronized on two computers, and I use Carbonite to back it up. It’s cheap, it’s easy, and the effort is nothing compared to the downside of losing my important stuff. What’s your stuff worth?
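If you’re curious what a file synchronizer does under the hood, the core idea is small. The Python sketch below is only an illustration of one-way syncing, not how Save N Sync, Groove or Carbonite actually work; the function name and the folder paths are hypothetical. It copies a file to the backup folder when it is missing there or when the original has a newer modification time.

```python
import shutil
from pathlib import Path


def sync_newer(src, dst):
    """Copy files from src to dst when missing there or older; return what was copied."""
    src, dst = Path(src), Path(dst)
    copied = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        target = dst / path.relative_to(src)
        if target.exists() and target.stat().st_mtime >= path.stat().st_mtime:
            continue  # the backup copy is already up to date
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy2 also carries over the modification time
        copied.append(path.relative_to(src).as_posix())
    return copied
```

Calling something like `sync_newer("D:/Documents", "E:/Backup/Documents")` (example paths only) from a scheduled task would give you a basic nightly copy. Note that this sketch never deletes anything from the backup and never copies in the other direction, which a real synchronizer would let you configure.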