help
14 Feb 2005
Okay, so, I am at my wits’ end. I need some computer hardware help. Details below:
Okay, the sequence of my troubleshooting so far is roughly this:
- I start getting random freezes on my PC, with increasing frequency as the days go by. There's nothing in the logs and no obvious problem, just a hard freeze. I figure this is probably a memory problem, so I run memtest. Sure enough, I get tons of errors.
- I buy a new DIMM from CompUSA and plug it in. I run memtest again and I’m still getting errors. Okay, so it must be a bad motherboard.
- I order a new mobo/CPU combo. Biostar mobo w/ AMD Athlon XP 2400
- I swap the new PC133 512M DIMM I bought for the old motherboard for a new DDR 512M DIMM for the new motherboard.
I put it all together, assuming all my problems are over, so I didn’t think to run memtest again. I boot up and .. oops. kernel panic trying to mount the root filesystem with SCSI errors like below:
> scsi1: ERROR on channel 0, id 0, lun 0, CDB: 0x28 00 00 00 10 9f 00 00 08 00
> Info fld=0x109f, Current sd08:01: sns = f0 3
> ASC=11 ASCQ= 0
> Raw sense data:0xf0 0x00 0x03 0x00 0x00 0x10 0x9f 0x0a 0x10 0x1f 0x00 0x02 0x11 0x00 0x00 0x80 0x00 0xa0
> I/O Error: dev 08:01, sector 4192
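For what it's worth, the bytes in that error can be decoded by hand. Here's a rough Python sketch, assuming the standard SCSI READ(10) CDB layout and fixed-format sense data. The function names and the tiny lookup tables are made up for illustration and only cover the values that show up in this log.

```python
# Decode the CDB and raw sense bytes quoted in the kernel error above.
# Assumes a READ(10) command and fixed-format sense data (response code 0x70/0x71).

CDB = bytes([0x28, 0x00, 0x00, 0x00, 0x10, 0x9f, 0x00, 0x00, 0x08, 0x00])
SENSE = bytes([0xf0, 0x00, 0x03, 0x00, 0x00, 0x10, 0x9f, 0x0a, 0x10,
               0x1f, 0x00, 0x02, 0x11, 0x00, 0x00, 0x80, 0x00, 0xa0])

# Partial lookup tables (illustrative, not complete).
SENSE_KEYS = {
    0x0: "No Sense",
    0x1: "Recovered Error",
    0x2: "Not Ready",
    0x3: "Medium Error",
    0x4: "Hardware Error",
    0x5: "Illegal Request",
    0x6: "Unit Attention",
    0xB: "Aborted Command",
}
ASC_ASCQ = {(0x11, 0x00): "Unrecovered read error"}


def decode_read10(cdb):
    """Return the starting LBA and block count of a READ(10) CDB."""
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # bytes 2-5: logical block address
        "blocks": int.from_bytes(cdb[7:9], "big"),  # bytes 7-8: transfer length
    }


def decode_fixed_sense(sense):
    """Pull the interesting fields out of fixed-format sense data."""
    if len(sense) < 14 or (sense[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return {
        "valid": bool(sense[0] & 0x80),                   # information field is valid
        "sense_key": sense[2] & 0x0F,                     # low nibble of byte 2
        "information": int.from_bytes(sense[3:7], "big"), # bytes 3-6
        "asc": sense[12],                                 # additional sense code
        "ascq": sense[13],                                # additional sense code qualifier
    }


if __name__ == "__main__":
    cmd = decode_read10(CDB)
    s = decode_fixed_sense(SENSE)
    print("READ(10): LBA %d (0x%x), %d blocks" % (cmd["lba"], cmd["lba"], cmd["blocks"]))
    print("Sense key %d: %s" % (s["sense_key"], SENSE_KEYS.get(s["sense_key"], "unknown")))
    print("ASC/ASCQ %02x/%02x: %s" % (
        s["asc"], s["ascq"], ASC_ASCQ.get((s["asc"], s["ascq"]), "unknown")))
    print("Failing block (information field): %d" % s["information"])
```

Run against the log above, this reports a READ(10) of 8 blocks starting at LBA 0x109f that failed with sense key 3 (medium error) and ASC/ASCQ 11/00 (unrecovered read error). The failing LBA, 4255, is 63 more than the sector 4192 the kernel reports for dev 08:01, which would fit a first partition starting at sector 63, though that offset is my guess.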
- I assume at this point that I screwed something up while rebuilding the PC around the new motherboard. I boot up Knoppix on the PC (runs on a ramdisk) and I get the same errors trying to mount either drive. I swap the drives around on the cable. I try them one at a time. Nothing helps. Same errors.
- I buy a new SCSI cable (after scouring the city for a u2w SCSI cable), thinking that the cable is the most obvious faulty part. No change – same errors.
- Starting to get a little frustrated at this point. So, I figure maybe it’s some incompatibility between my SCSI card (Adaptec AHA-2940U2W) and my motherboard. So I rebuild everything around my old motherboard outside the case and boot it up. Same SCSI errors. So, scratch that theory. Something is definitely broken that wasn’t before.
- So I put everything back together around the new motherboard and boot up Knoppix again. Same SCSI error – hey, wait a minute. The Adaptec card isn’t even in the PC. This error was something to do with the CDROM (which in Knoppix is utilized via SCSI emulation). WTF? So, I am starting to think there’s some problem with the power supply – the only common element. So, I go get a new power supply, plug it in .. same problems. Okay, scratch that theory.
- So I’ve basically given up, and I’m running Knoppix to browse the web and check my e-mail. I notice performance isn’t great – firefox and mozilla keep segfaulting. ssh even segfaults once in a while, and performance eventually degrades until it locks up. Keep in mind this is on the brand new motherboard/CPU/RAM, with no hard drive or controller even hooked up.
- So, since I’m seeing this terrible performance, I run memtest again on all this new hardware. Sure as shit, I am still seeing tons of errors.
- I realized there’s one more common element between the two – my video card. Is it possible an AGP card could cause all these problems? Well, only one way to find out. I swapped the card with an older AGP card I had lying around. Rebooted, ran memtest. Same errors.
So, that’s where I stand. I am at a complete loss. I’ve never been so frustrated in my life. The SCSI and RAM problems could be unrelated or not, I don’t know at this point. I am going to try buying a new SCSI card tomorrow.
My only remaining theory is that maybe my old power supply was bad – bad enough to fry good hardware, resulting in the exact same problems with my new hardware that I had with the old hardware. The downside is that requires returning and replacing all the new hardware I’ve gotten and trying it with the new power supply, which is a time-consuming process, if mwave will even take my mobo/CPU back.
UPDATE: Okay, I replaced the SCSI controller. No dice. Tested the RAM again with the new power supply in. Still testing bad.
- Bad SCSI card
- Both SCSI drives are bad AND new motherboard or RAM is bad by chance
- Bad power supply caused both problems with the old motherboard/RAM/CPU and is bad enough to fry the new motherboard/RAM/CPU in .. exactly the same way (?!)
Any theories I am forgetting?