[wplug] crashh! ext2, fsck, and duplicate blocks
Brandon Kuczenski
brandon at 301south.net
Sun Oct 31 17:33:30 EST 2004
I have a few questions about system administration practices.
I am running a FreeBSD server (okay, okay, should I send it to wplug-bsd?
I think these questions are of general interest, though) and I had a
peculiar problem. A configuration file for one of my scripts was
suddenly and unexpectedly filled with garbage. The garbage looked like
this:
...
t 1099176433 N:N.N.N
t 1099176433 greetings
t 1099176433 HTime-Received:Oct
t 1099176433 H*F:D*com.br
t 1099176433 H*F:D*br
...
When I fixed the file, it soon became garbage again. I disabled the
script and decided to puzzle over it for a while.
But wait! There's more!
My server just crashed. When it restarted, I fsck'ed the disks and found
"duplicate blocks" -- shared by my Bayes tokens database and that
configuration file. Aha! So my config file got overwritten by spam
data. fsck fixed the problem.
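As an aside, the second field of each garbage line looks like a Unix
timestamp, which would fit data written by a token database. The sketch
below is a minimal Python illustration, assuming every line has the
three-field layout "t <epoch> <token>" seen in the sample above. The
function name parse_garbage_line is made up for this example and is not
part of any tool mentioned here.

```python
from datetime import datetime, timezone

def parse_garbage_line(line):
    """Split 't <epoch> <token>' into (utc_datetime, token).

    Returns None when the line does not match the assumed layout.
    """
    parts = line.split(None, 2)
    if len(parts) != 3 or parts[0] != "t" or not parts[1].isdigit():
        return None
    # Assumption: the middle field is seconds since the Unix epoch.
    when = datetime.fromtimestamp(int(parts[1]), tz=timezone.utc)
    return when, parts[2]

# Lines copied from the garbage sample in this message.
sample = [
    "t 1099176433 N:N.N.N",
    "t 1099176433 greetings",
    "t 1099176433 HTime-Received:Oct",
]

for line in sample:
    parsed = parse_garbage_line(line)
    if parsed is not None:
        print(parsed[0].isoformat(), parsed[1])
```

If that reading is right, 1099176433 decodes to 22:47:13 UTC on
30 Oct 2004, the evening before the crash.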
So, given all this, I have three questions: One, how in blazes (I wanted
to say something more R-rated) did this happen? As I said, the OS was
FreeBSD, the filesystem was ext2 (FreeBSD doesn't support ext3, and these
disks were migrated from a Linux system).
Two: should I run fsck on a routine (i.e. cron) basis, to catch glitches
like this? How often do they happen? Or should I just wait for
random reboots to check the disks? What is the "Right thing to do"?
Three: when my server crashes and leaves no helpful information in
/var/log/messages (in fact, even the startup log is missing), am I just
supposed to pretend like nothing happened? How do I find bugs if there
are no logs?
Here's my /var/log/messages surrounding the time of the reboot (the first
several lines are just IP filter data):
Oct 31 14:27:49 ocean ipmon[83]: 14:27:48.985723 rl0 @0:17 b 209.195.143.195,2077 -> 209.195.172.207,3127 PR tcp len 20 48 -S IN
Oct 31 14:28:46 ocean ipmon[83]: 14:28:45.874701 rl0 @0:17 b 209.195.87.230,2728 -> 209.195.172.207,6129 PR tcp len 20 48 -S IN
Oct 31 14:28:49 ocean ipmon[83]: 14:28:48.785276 rl0 @0:17 b 209.195.87.230,2728 -> 209.195.172.207,6129 PR tcp len 20 48 -S IN
Oct 31 14:28:55 ocean ipmon[83]: 14:28:54.896598 rl0 @0:17 b 209.195.87.230,2728 -> 209.195.172.207,6129 PR tcp len 20 48 -S IN
Oct 31 14:34:27 ocean ipmon[83]: 14:34:27.219536 rl0 @0:17 b 63.205.221.242,4325 -> 209.195.172.207,1433 PR tcp len 20 48 -S IN
Oct 31 14:34:31 ocean ipmon[83]: 14:34:30.216077 rl0 @0:17 b 63.205.221.242,4325 -> 209.195.172.207,1433 PR tcp len 20 48 -S IN
Oct 31 17:03:37 ocean /kernel: e
Oct 31 17:03:37 ocean /kernel: dscheck(#ad/0x3000a): b_bcount 1 is not on a sector boundary (ssize 512)
Oct 31 17:03:37 ocean last message repeated 11 times
Oct 31 17:03:37 ocean /kernel: IP Filter: v3.4.31 initialized. Default = pass all, Logging = enabled
Oct 31 17:03:40 ocean ipmon[84]: 17:03:40.513362 rl0 @0:17 b 209.195.138.37,2091 -> 209.195.172.207,3127 PR tcp len 20 48 -S IN
Oct 31 17:03:44 ocean ntpd[116]: ntpd 4.1.0-a Tue May 25 21:15:34 GMT 2004 (1)
Oct 31 17:03:44 ocean ntpd[116]: kernel time discipline status 2040
Oct 31 17:04:02 ocean login: ROOT LOGIN (root) ON ttyv0
Sometime before the line that reads "/kernel: e", the system rebooted. I
mean, WTF?
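One way to at least bracket the reboot is to look for unusually long
silences between consecutive log entries. The following is a minimal
Python sketch, assuming classic syslog lines that begin with a 15-character
"Mon DD HH:MM:SS" stamp, English month names, and a known year (syslog
does not record it). The function name find_gaps and the one-hour
threshold are arbitrary choices made for illustration.

```python
from datetime import datetime

def find_gaps(lines, year, min_gap_seconds=3600):
    """Return (previous_time, next_time) pairs for long silences in a log."""
    gaps = []
    prev = None
    for line in lines:
        try:
            # The first 15 characters hold the stamp, e.g. 'Oct 31 14:34:31'.
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if prev is not None and (stamp - prev).total_seconds() >= min_gap_seconds:
            gaps.append((prev, stamp))
        prev = stamp
    return gaps

# A few lines taken from the log excerpt in this message (payload shortened).
log = [
    "Oct 31 14:28:55 ocean ipmon[83]: (packet log entry)",
    "Oct 31 14:34:27 ocean ipmon[83]: (packet log entry)",
    "Oct 31 14:34:31 ocean ipmon[83]: (packet log entry)",
    "Oct 31 17:03:37 ocean /kernel: e",
    "Oct 31 17:03:44 ocean ntpd[116]: kernel time discipline status 2040",
]

for before, after in find_gaps(log, 2004):
    print("no entries between", before, "and", after)
```

A gap only shows when nothing was logged, not when the machine actually
went down; here ipmon writes only when a packet is blocked, so the real
reboot could be anywhere inside the window.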
Little help?
-Brandon