Author Topic: Unexpected System Instant Power Off / Crash without warning <SOLVED>  (Read 2465 times)

Offline HWDude

  • Full Member
  • ***
  • Posts: 193
  • HWDude
    • N8OBJ's Home Page
I have an AMD Athlon 64 X2 Dual Core 5800+ Socket AM2 CPU with 4 GB of dual-channel DDR2-800 (128-bit) RAM, a Gigabyte GA-M61PME-S2P motherboard, a 500 W power supply, and an NVIDIA GeForce 7600 GT video card with 256 MB of DDR2 memory.  I'm running kernel 2.6.26.8.tex3 and I am up to date on all system updates from the repos.

The problem I'm having is that at totally random intervals (anywhere from several minutes to more than 6 hours), the system shuts off instantly (as if you had just yanked the plug from the wall). No warning, no shutdown message, not even a beep or error window.  Just BAM! - and the monitor then reports no video signal.  I press the power button and it boots up again - the file systems repair the file fragments left hanging by the crash and everything is back to "normal".

I removed a PCI-Express 1394 FireWire card that I had in the system and at first that seemed to correct the problem, but the system has since shut off again without the card.  I have also found that turning on the GKx86info plug-in (which displays the CPU clock frequency) in the GKrellM monitor seems to aggravate the problem.

I've replaced the power supply (previously a 450 W unit) with a good 500 W supply.  I still get the random system shutdowns.
The motherboard BIOS is up to date as well.  All the RAM modules in the system are the same make and speed.

This is the only system (of 8 or so that I'm running PCLOS on) that has this problem, so I suspect something hardware related.

Any ideas?  CPU?  Motherboard?

HWDude
« Last Edit: February 07, 2010, 06:54:17 PM by HWDude »
Welcome My Son -- Welcome -- To -- The Machine ; Pink Floyd "Wish You Were Here"


Online Rudge

  • Hero Member
  • *****
  • Posts: 9683
  • I'm Just A Dog.
Re: Unexpected System Instant Power Off / Crash without warning
« Reply #1 on: February 03, 2010, 08:29:23 PM »
The last time I saw this, I had to replace the motherboard. - Same CPU  >:( >:(

BUT, a co-worker took that same motherboard and put a new CPU on it, and has been using it ever since, so maybe it had something to do with the CPU seating.

Yank it off, reapply a bunch of heat sink gooey stuff and try that.  ??? ???
« Last Edit: February 03, 2010, 08:31:13 PM by rudge »


-If you wish to make an apple pie from scratch, you must first invent the universe-  Carl Sagan

Offline HWDude

Re: Unexpected System Instant Power Off / Crash without warning
« Reply #2 on: February 03, 2010, 08:50:17 PM »
I just checked the system health page in the BIOS setup and the CPU is running at 32 deg C with the auto fan control on.  I've shut off the auto control and am now running the CPU fan at full speed all the time to see if that drops the CPU temp at all.

ARGH!


HWDude


Online Rudge

Re: Unexpected System Instant Power Off / Crash without warning
« Reply #3 on: February 03, 2010, 08:58:22 PM »
Let me know. It is just a guess.   Good luck!



Offline Village Idiot

  • Hero Member
  • *****
  • Posts: 2345
  • Have A Nice Day.
Re: Unexpected System Instant Power Off / Crash without warning
« Reply #4 on: February 03, 2010, 08:58:40 PM »


I would take the CPU temp in the BIOS with a grain of salt, because the BIOS setup program puts almost no load on the CPU. I'd suggest, if you can, running a temperature monitor from Linux after it boots and seeing what temperature it reports. Look for wayward processes running also... Good luck.  :)
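
For anyone following along, here is a rough sketch of what "a temp checker from Linux" could look like if you don't have a monitoring tool handy. It is only an illustration under assumptions that are not from this thread: that a hardware-monitoring driver for the board's sensor chip is loaded, and that it exposes readings as millidegrees Celsius in `temp*_input` files under `/sys/class/hwmon` (older kernels may put them one level down under `device/`). The function names are made up for the example.

```python
from pathlib import Path


def millideg_to_celsius(raw):
    """Convert a sysfs reading such as '32000\\n' (millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {relative sensor path: degrees C} for every readable temperature input.

    ASSUMPTION: sensors live at hwmon*/temp*_input or hwmon*/device/temp*_input.
    Unreadable or malformed files are skipped rather than treated as errors.
    """
    base = Path(root)
    temps = {}
    for pattern in ("hwmon*/temp*_input", "hwmon*/device/temp*_input"):
        for sensor in sorted(base.glob(pattern)):
            try:
                temps[str(sensor.relative_to(base))] = millideg_to_celsius(sensor.read_text())
            except (OSError, ValueError):
                continue
    return temps


if __name__ == "__main__":
    readings = read_hwmon_temps()
    if not readings:
        print("No temperature sensors found (is a sensor driver loaded?)")
    for name, celsius in readings.items():
        print(f"{name}: {celsius:.1f} C")
```

Run it while the machine is busy, not idle, and compare the numbers with what the BIOS health page shows.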
$ fortune
No Microsoft products were used in any way for the creation of this message.
If you are using a Microsoft product to view it, BEWARE! - I'm not
responsible for any harm you might encounter as a result.

Offline HWDude

Re: Unexpected System Instant Power Off / Crash without warning
« Reply #5 on: February 03, 2010, 09:09:47 PM »
I just checked the BIOS version again and discovered that I'm NOT running the latest BIOS - time to upgrade...

I also checked the CPU temp again after shutting off the auto fan control: the fan went from 900 to 2600 RPM and the CPU temp dropped to 29C.

I also shut off the AMD Cool'n'Quiet BIOS option to see if this helps.

Will upgrade BIOS and continue to monitor ....

HWDude


Offline Village Idiot

Re: Unexpected System Instant Power Off / Crash without warning
« Reply #6 on: February 03, 2010, 09:13:30 PM »
Dude! Get yer system stable before you do a BIOS update. If it dies while updating, well, you know...


Online Rudge

Re: Unexpected System Instant Power Off / Crash without warning
« Reply #7 on: February 03, 2010, 09:19:08 PM »
+1



Offline ThirdOfSix

  • Hero Member
  • *****
  • Posts: 745
Re: Unexpected System Instant Power Off / Crash without warning
« Reply #8 on: February 03, 2010, 10:50:51 PM »
HWDude,

I too think the most likely cause would be the CPU overheating or, heaven forbid, one that has been severely overheated in the past and now has intermittent problems.

But you have to take everything into consideration.

I do not know whether you are in a private home, at work, or somewhere else.

It is possible that you are having an intermittent power drop on the circuit that your computer is plugged into.

This could be caused by a bad connection somewhere on the circuit, combined with someone starting a large heater, hair dryer, air conditioner, etc. on that same circuit.

On the off chance that the "HW" in HWDude stands for hardware, you might try rigging up a holding relay circuit to the outlet that your computer is on and see if the relay has dropped out after one of these events.

I know this is not too likely to be the cause, but the first rule of troubleshooting any electrical device is to check out the power source.

If nothing else, try to borrow an uninterruptible power supply and run your system from it for a while. But if you do, be sure that it is a true uninterruptible supply. Many of the devices sold as such only kick in their internal inverter after the power has been down for a bit. Normally this is fast enough to keep a system up, but it is not fast enough for this kind of troubleshooting.



Offline HWDude

Re: Unexpected System Instant Power Off / Crash without warning
« Reply #9 on: February 04, 2010, 10:56:47 AM »
This system is for home use and is in my basement, which sits pretty much at 65F most of the time.  And yes, HWDude is for Hardware Dude (I'm an Electrical [analog] design engineer).

Well, I bought the CPU / motherboard / memory / power supply / video card new, and on the CPU I used a silver-based thermal compound (with excellent thermal transfer performance) between it and the large "butterfly" heatsink, so I'm fairly confident the CPU core has never gone above 35C.  The system is already on a 750W UPS, and the UPS doesn't chirp (as it normally does during a power glitch or dip) when the system shuts off.  The power feed (which I put in myself many years back) otherwise only carries another UPS running my GPS-disciplined 10 MHz frequency standard (which pulls 42 watts - not much of a load) and an Acer Aspire One netbook, which runs 24/7 and holds (among other things) a copy of the PCLOS repo for all my systems to access locally.  These other systems ran flawlessly through each of my unexpected shutdowns.  The UPS is plugged into a surge suppressor as well, so I don't think power conditioning is related to this problem.

I did end up upgrading the BIOS (you do it in DOS mode, so it's a pretty low draw on the resources of the motherboard, but yes, it could have crashed and burned....).  I also noticed that the power-off crash never happens in the BIOS setup screen (I left it sitting in the setup screen for 48 hours straight with no problems, just to check).

The other interesting note is that this system is dual boot, and the problem has never shown up under Windows XP Home Edition (which has run for 4 days straight with no shutdowns).

HWDude


Online pags

  • Hero Member
  • *****
  • Posts: 2515
  • Keep it clean.
Re: Unexpected System Instant Power Off / Crash without warning
« Reply #10 on: February 04, 2010, 11:10:08 AM »
Just out of curiosity...have you confirmed that the system is shutting down (and not just losing video)?

Sounds like you have additional PCs available.  Try enabling sshd on the problem system, and when it "dies", see if you can ssh in from another PC on the network.

I only mention this because I had issues that I misdiagnosed.

On one, the display would flicker and die at random intervals.  I thought it was an Xorg/system problem, but I now suspect it is only the LCD monitor, and that the system keeps running without errors when this occurs.  Until I can correct/replace it (someday), that system remains running in a "server" capacity.
On the other (the laptop I'm using), there are definitely Xorg issues (with Intel 945).  In this case, a bunch of errors are occurring in the Xorg log, and resetting the display manager doesn't help.  However, I am able to ssh into the machine (from another PC beside me) and do a proper reboot.  I'll wait for the 2010 ISO (new xorg, kernel, etc.) and see if it corrects the problem.  Curiously, this only manifests itself while using an external 19" LCD (running undocked, or on the 22" wide LCD at my other office, is OK so far, touch wood).

Maybe this can give you some additional avenues to explore that don't entail much additional cost...
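
If you would rather not sit there retrying ssh by hand, the check above can be roughed out as a small script run from a second PC. This is only a sketch under assumptions of mine: that sshd on the problem box listens on the default port 22, and that a successful TCP connection is a good enough sign that the system is still alive. The function name and the command-line handling are hypothetical.

```python
import socket
import sys


def port_is_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_alive.py <address of the problem PC>
    target = sys.argv[1]
    state = "still answering" if port_is_open(target) else "not answering"
    print(f"{target}: sshd port is {state}")
```

If the port still answers after the screen goes dark, the box is alive and you are looking at a video or Xorg problem rather than a power-off.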

Offline HWDude

Re: Unexpected System Instant Power Off / Crash without warning
« Reply #11 on: February 04, 2010, 07:38:46 PM »
Yes, the power goes off completely. I'm sitting next to the case when it happens: the fans (which have LEDs in them) all shut off, the power LED goes off, and the whole box and power supply go silent.  Pressing the power switch brings it back to life (fans / LEDs all turn on again).

I've made it 3 hours so far with the BIOS upgrade - keeping my fingers crossed!  This system will hopefully still be running come Monday morning (I'm planning on leaving it on as long as it will stay on for "testing").

This one feels like a firmware/software bug that when the timing lines up just right causes an instant shutdown.  Hoping the BIOS upgrade kills this bug...

Will keep you posted on the outcome (no news now is good news!)

HWDude


Offline HWDude

Re: Unexpected System Instant Power Off / Crash without warning <SOLVED>
« Reply #12 on: February 07, 2010, 06:53:47 PM »
After the motherboard BIOS update, I shut off three things in the BIOS setup that I was suspicious of:

1 - Hardware Virtualization
2 - AMD Cool n Quiet
3 - Auto Fan Control

I then ran all the apps that used to crash it (including the GKrellM CPU speed meter) for 74 hours (3 days, 2 hours) with no problems.

I just rebooted the system, turned on only Hardware Virtualization, rebooted, and the system ran for 4 minutes and then shut itself off completely. BAM!

So I went back into the BIOS, shut off Hardware Virtualization, and here I am typing.  Since the CPU is running at a constant 29C, I'm not going to play around with Cool'n'Quiet or the fan control.  I'm just going to run for another 48+ hours with virtualization off to see whether this was a fluke or I really figured it out.  I will post again when I have more news, but it definitely appears that Hardware Virtualization in conjunction with the GKrellM CPU speed indicator causes the system to shut off without warning!
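
For anyone who hits something similar and wants to note what the kernel reports about the two things implicated here (the CPU clock speed and the virtualization feature), a minimal sketch follows. Everything in it is an assumption for illustration, not something taken from this system: it expects the usual x86 Linux `/proc/cpuinfo` layout with `processor`, `cpu MHz` and `flags` fields, and it looks for `svm`, the flag name for AMD's hardware virtualization. Whether that flag follows the BIOS toggle may vary by kernel, so treat it as a hint only.

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into one dict per logical CPU.

    ASSUMPTION: blocks are separated by blank lines and use 'key : value' lines.
    """
    cpus = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        cpus.append({
            "processor": int(fields["processor"]),
            # Current clock as reported by the kernel, or None if the field is absent.
            "mhz": float(fields["cpu MHz"]) if "cpu MHz" in fields else None,
            # True if the AMD virtualization flag is listed for this CPU.
            "svm": "svm" in fields.get("flags", "").split(),
        })
    return cpus


if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:
        for cpu in parse_cpuinfo(handle.read()):
            print(f"cpu{cpu['processor']}: {cpu['mhz']} MHz, svm flag: {cpu['svm']}")
```

Running it before and after changing a BIOS option gives a quick record of what actually changed from the operating system's point of view.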

HWDude


Offline ThirdOfSix

Re: Unexpected System Instant Power Off / Crash without warning <SOLVED>
« Reply #13 on: February 07, 2010, 08:19:48 PM »
HWDude,

Thanks for the update.

You have probably just saved someone else a whole lot of grief in the future.

As you may have guessed, I find it very annoying when someone writes "solved" and then doesn't tell us how they solved it.


Offline coolbreeze

  • Hero Member
  • *****
  • Posts: 2666
  • Error #152 - Windows not found: (C)heer (P)arty (D
Re: Unexpected System Instant Power Off / Crash without warning <SOLVED>
« Reply #14 on: February 08, 2010, 08:50:41 AM »
I find it very annoying when someone writes "solved" and then doesn't tell us how they solved it.


+100% agreement
Linux user #440309
PCLOS IS THE KING  Please Donate to the cause. PCLinuxOS
My mind is so full that there's no room to think.
M5A78L-M mobo, AMD Phenom IIx 6 1055T, 4Gig ram,nvidia GeForce GT 240 1 Gig, Netgear DGN2200