[wplug] FileSystem problems

Vanco, Donald VANCOD at PIOS.com
Tue Jul 15 14:19:47 EDT 2003


Mike Griffin <mailto:mike at dmrnetworks.com> wrote:
> I don't know if I
> want to stick new drives into this system seeing how both drives went
> "bad" less than 24 hours apart.
	I can understand that.  You can try using Check-it Pro or another
bootable utility to test the IDE HW.  But if you want us to look at the OS
as a possible issue we'll need to know what kernel you're running (and
possibly what version of .... ??  fstool?  fileutils?  the name escapes
me...)  It looks like you're running EXT3.
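
The version details asked for above can be gathered in one go. What follows is only a sketch, not a definitive tool: it assumes Python 3.7+ is available on the box, that the filesystem tools in question are the e2fsprogs package (the "e2fsck 1.27" banner in the quoted output comes from it), and the helper name collect_versions is made up for illustration.

```python
import platform
import shutil
import subprocess


def collect_versions():
    """Return the running kernel release and the e2fsck version banner, if found."""
    info = {"kernel": platform.release(), "e2fsck": None}
    e2fsck = shutil.which("e2fsck")  # may be None if /sbin is not on PATH
    if e2fsck is None:
        return info
    try:
        # "e2fsck -V" only prints version information and exits;
        # it does not touch any filesystem.
        result = subprocess.run(
            [e2fsck, "-V"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return info
    banner = (result.stderr or result.stdout).strip()
    if banner:
        info["e2fsck"] = banner.splitlines()[0]
    return info


if __name__ == "__main__":
    for name, value in collect_versions().items():
        print(f"{name}: {value}")
```

Pasting that output into a reply gives the list both pieces of information at once.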

> The power on the system is the same as it's been, short of sticking an
> oscilloscope in the power socket. 
	I was really referring more to the power supply in the system - they
die frequently too.  I've had several drive issues on HPT IDE controllers
and they were resolved, oddly enough, by going to a much larger PS.  A
voltmeter would help here....
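
As a rough illustration of what to do with the voltmeter readings: IDE drives draw from the +5V and +12V rails, and the sketch below flags any rail that strays too far from nominal. The 5% tolerance is an assumption based on the usual ATX figure for the positive rails, the readings in the usage note are invented, and the function name is hypothetical.

```python
# Nominal voltages for the positive ATX rails.
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}

# Assumed tolerance: 5% either side of nominal.
TOLERANCE = 0.05


def out_of_tolerance(readings, tolerance=TOLERANCE):
    """Return {rail: deviation in percent} for rails outside the tolerance."""
    bad = {}
    for rail, measured in readings.items():
        nominal = NOMINAL_RAILS[rail]
        deviation = (measured - nominal) / nominal
        if abs(deviation) > tolerance:
            bad[rail] = round(deviation * 100, 1)
    return bad
```

For example, out_of_tolerance({"+5V": 4.6, "+12V": 11.9}) reports only the +5V rail, which would be about 8% low. Readings are best taken with the drives attached and busy, since a weak supply can look fine when idle.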

> Perhaps the power supply could be going bad. I've had that happen
> before where it started killing HDDs. hmm, thanks for reminding me!
	:)

Don

> On Tuesday, July 15, 2003, at 01:31  PM, Vanco, Donald wrote:
> 
>> Mike Griffin <mailto:mike at dmrnetworks.com> wrote:
>>> I seem to be having some problems with one of my servers and was
>>> wondering where to start troubleshooting the machine. I would
>>> imagine that it's at the hardware level.
>>> 
>>> A fileserver crashed yesterday with a kernel panic. This machine has
>>> been running solidly for nearly a year. I started getting errors on
>>> the hard drive during writes, and access was very
>>> slow. I ran the Maxtor utilities on the drive and they found a few
>>> problems that the software fixed. I reinstalled the server OS
>>> (RH7.3) and performed my data recovery from backups. My backups are
>>> stored on the same system but on a different HDD, which gets
>>> mounted by a script every night and has tarballs written to it. I
>>> also ran fsck -t ext3 on this drive (/dev/hdb1).  I checked for a
>>> new backup this morning, and all was well. I just tried mounting
>>> the drive a few minutes ago and got a ton of bad sector attempt
>>> timeouts saying it cannot find a valid FAT partition. When I tried
>>> to run fsck on this partition, I got this as a result:
>>> 
>>> [root@fileserver root]# fsck /dev/hdb1
>>> fsck 1.27 (8-Mar-2002)
>>> e2fsck 1.27 (8-Mar-2002)
>>> fsck.ext2: Attempt to read block from filesystem resulted in short
>>> read while trying to open /dev/hdb1
>>> Could this be a zero-length partition?
>>> [root@fileserver root]# fsck -t ext3 /dev/hdb1
>>> fsck 1.27 (8-Mar-2002)
>>> e2fsck 1.27 (8-Mar-2002)
>>> fsck.ext3: Attempt to read block from filesystem resulted in short
>>> read while trying to open /dev/hdb1
>>> Could this be a zero-length partition?
>>> 
>>> These are two different ATA drives. One is a 20G and one is a 10G. I
>>> thought it was kind of weird that this would happen to both drives
>>> one day apart. Possibly a controller problem on the motherboard?
>> 
>> 	Are all your cables (power and data) in good shape and well seated?
>> 	Any chance there's been a change in quality of the power?
>> 	Do BIOS and kernel see the drive geometry in like fashion? (there
>> was a time, circa RH6.1 or .2, that fdisk added an "extra" cylinder
>> to the drive - fun!)
>> 	Do you have a "like system" you can swap the drives into to see how
>> they behave? 
>> 
>> 	IMHO - using IDE HDD tools to "fix" a drive that's reporting media
>> errors is like putting a band-aid on a leper.  Underneath, it's still
>> rotten.  If you've got a drive spitting bad sector errors it's time
>> to $h!tcan the drive and get a new one... MaxTools may delay death,
>> but it's still terminal, and data loss is almost assured.
>> 
>> Don
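
The "zero-length partition" hint in the quoted fsck output can be cross-checked against what the kernel itself reports in /proc/partitions, where sizes are listed in 1 KiB blocks. The sketch below is an illustration only: the function names are hypothetical and the sample table is invented, not taken from the machine in this thread.

```python
# Invented sample in the usual /proc/partitions layout.
SAMPLE = """\
major minor  #blocks  name

   3     0   20010816 hda
   3     1   19800000 hda1
   3    64   10005120 hdb
   3    65          0 hdb1
"""


def parse_proc_partitions(text):
    """Map device name -> size in 1 KiB blocks, using the first four columns."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and blank lines; data rows start with the major number.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        _major, _minor, blocks, name = fields[:4]
        sizes[name] = int(blocks)
    return sizes


def looks_zero_length(sizes, name):
    """True if the kernel lists no such device or reports it with zero blocks."""
    return sizes.get(name, 0) == 0


if __name__ == "__main__":
    with open("/proc/partitions") as handle:
        table = parse_proc_partitions(handle.read())
    print("hdb1 zero-length or missing:", looks_zero_length(table, "hdb1"))
```

If the kernel shows the whole disk at a sensible size but the partition as missing or empty, that points at the partition table or at the drive and controller failing to return data, rather than at the ext3 filesystem itself.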


