[wplug] Another OT buggy-hardware post

Bryon Gill bgtrio at yahoo.com
Fri Mar 17 13:21:07 EST 2006


This probably won't solve your problem, but I thought it was worth sharing
anyway.  I'm running FC4 on a reasonably modern desktop.  After a reboot my
computer would run slowly for a couple of days, then mysteriously go back to
regular speed, with no obvious pattern to when things would return to normal.
The desktop was laggy, and 3D games were unplayable.  I thought it could have
been any of a whole bunch of things: the video card or its drivers, memory, a
bad ordering of PCI cards, etc.  It turned out to be temperature related: ACPI
was switching to active cooling mode whenever the CPU reached a certain (too
low) threshold, and never switching back.

After a painful slog through the documentation I came up with this incantation
to make the thresholds more reasonable:

echo -n "60:0:59:45:40" > /proc/acpi/thermal_zone/THRM/trip_points

I don't have time now to explain what it all means, just wanted to throw it out 
there for people to file in the back of their heads under "possible avenues to 
explore when diagnosing unexpected machine slowdowns."
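
For anyone who wants to decode the string, here is a rough sketch.  It assumes
the field order used by the 2.6-era /proc/acpi thermal interface, namely
critical:hot:passive:active0:active1 in degrees Celsius.  That ordering is an
assumption worth verifying against your own kernel's documentation, and
decode_trip_points is a made-up helper name, not part of any ACPI tool.

```shell
#!/bin/sh
# Sketch: label the fields of a trip_points string.
# ASSUMPTION: field order is critical:hot:passive:active0:active1 (degrees C).
# decode_trip_points is a hypothetical helper, not part of any ACPI tool.
decode_trip_points() {
    # Run in a subshell so the IFS change does not leak into the caller.
    (
        IFS=':'
        # Split the argument on ':' into positional parameters $1..$5.
        set -- $1
        echo "critical=$1 hot=$2 passive=$3 active0=$4 active1=$5"
    )
}

decode_trip_points "60:0:59:45:40"
```

If the assumed order holds, the values above would put the critical trip point
at 60, the passive one at 59, and the two active (fan) levels at 45 and 40.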

Bryon


On Fri, 17 Mar 2006, Brandon Kuczenski wrote:

> On Fri, 17 Mar 2006, Drew from Zhrodague wrote:
>
>>> Thanks for everyone's suggestions, and thanks especially to those people 
>>> who can accommodate the fact that I don't seem to be able to read
>>> (Chet) or Google (Jonathan) properly.
>>> 
>>> The power supply is a new premium low-noise one which is more recent than 
>>> the crashing started happening, so I think it's okay.
>>> 
>>> I dd'ed memtest onto a floppy and ran it overnight at my underclocked 
>>> speed (1250 MHz).  It ran for almost 8 hours and reported no errors.  I 
>>> rebooted at the proper clockspeed (1666 MHz) and was about to leave it to 
>>> run all day when my computer spontaneously shut down between 20 and 25 
>>> minutes in.
>>> 
>>> Incidentally, memtest reports the processor as a Duron, not an Athlon XP. 
>>> /proc/cpuinfo reports:
>>> processor       : 0
>>> vendor_id       : AuthenticAMD
>>> cpu family      : 6
>>> model           : 8
>>> model name      : AMD Athlon(TM) XP 2000+
>>> stepping        : 0
>>> cpu MHz         : 1222.637
>>> ...
>>> 
>>> My conclusions: heat and the video card are clearly not the problems.  I 
>>> think the memory itself is ok.  There are no jumper settings on the mobo 
>>> for either CPU or memory voltage -- just about the only user-configurable 
>>> settings are the clock multipliers -- so the problem must be the mobo or 
>>> the cpu. Based on comments in this thread, I'm going to buy myself a new 
>>> processor.
>>
>> 	Are you *sure* that heat is not a problem? CPUs generally get hotter 
>> under use. As an example, my Athlon 1.4GHz workstation here has a clogged 
>> heatsink, and doesn't cool very well. I can sorta lightly use the machine 
>> (NFS host), but if I fire up my distributed.net client, the thing will shit 
>> itself pretty quickly. Ditto if I do anything that is more processor 
>> intensive than just sitting there sharing its disk.
>>
>> 	I have a brand new heatsink/fan combo ready to go, and I'll bet the 
>> problem will be solved by swapping them. I'll post my results.
>> 
>
>
> I thought it was thermal for a long time.  But the more I learn, the more I 
> suspect a defect.
>
> It seems like there are "crash-prone" cpu-intensive tasks and "safe" 
> cpu-intensive tasks.  sensors(1) never reports a CPU temperature higher than 
> about 51.5 C, even when the computer is working fairly hard... but like I 
> said, there are particular running conditions (which I assume must exercise 
> different parts of the processor) which are basically guaranteed to cause a 
> crash.  Running lame(1) will do it.  Certain XScreensaver hacks will do it 
> (XLyap comes to mind), but most won't, even CPU-heavy ones.
>
> I just ran fireworkx(1), one of the XScreensaver hacks that relies on 
> hardware graphics acceleration that my AGP card doesn't support, for 12 
> minutes:
>
> b@plaza:/tmp$ uptime
> 11:38:22 up  1:43,  9 users,  load average: 0.98, 1.05, 0.72
>
> and sensors reports the CPU temperature is 50C and the M/B is 43C.  Those 
> seem pretty comfortable based on what I've read and seen.
>
> Of course, then I allowed the computer to rest (CPU temp came down to 47C and 
> stayed there) and ran 'lame' for about 5 minutes straight, which has reliably 
> caused crashes in the past, and the computer was fine.  So who knows?  The 
> only thing that is absolutely reliable in generating crashes is running at 
> the rated clockspeed... and my training has always suggested that you "fix 
> the obvious problems and see if the more pernicious ones 'just go away,' 
> because often they do."
>
> If this hasn't convinced you that temperature is not an issue, let me know. 
> I'd love to hear your thoughts.  Nobody is selling Athlon XP 2000+s 
> anymore...
>
> Can I run a processor with a 333MHz FSB on a motherboard with a 266 MHz FSB?
>
> -Brandon
>
> _______________________________________________
> wplug mailing list
> wplug at wplug.org
> http://www.wplug.org/mailman/listinfo/wplug
>

