Topic: Intermittent boot problem - kernel panic

Offline peter_pclos

  • Full Member
  • ***
  • Posts: 156
Intermittent boot problem - kernel panic
« on: December 13, 2011, 06:27:53 PM »
I've begun to encounter erratic behaviour with my fairly old desktop machine running a fully upgraded PCLinuxOS.  On searching I find hints that this could possibly be caused by early signs of failure of the battery on the motherboard.  Is this likely?  The machine apparently boots reliably from a live pclos CD, and on one occasion I got it back to life by booting into the Windows option, then restarting the PC out of Windows, but this doesn't always work.

The eventual error message when the Linux boot stalls is "Kernel panic - not syncing: Fatal exception in interrupt".

Any suggestions welcome - I just hope the PC is still alive to receive them!

Offline djohnston

  • PCLinuxOS Tester
  • Hero Member
  • *******
  • Posts: 6227
  • I don't do Windows
Re: Intermittent boot problem - kernel panic
« Reply #1 on: December 13, 2011, 08:32:57 PM »
It could be RAM or the boot media. There are other possibilities. What is displayed after "Fatal exception in interrupt"? That might give some clues.
Bare metal                   VBox
AMD Athlon 7750 Dual-Core    Single core
4GiB RAM                     1GiB RAM
nVidia GeForce FX 5200       64MB video
LXDE 32bit                   KDE 64bit

Registered Linux User #416378

Offline peter_pclos

Re: Intermittent boot problem - kernel panic
« Reply #2 on: December 14, 2011, 04:59:23 AM »
In answer to your question - nothing.  That's where it crashes.

For what it's worth, the output immediately before the crash line was:

Quote
EIP: [<c026d323>] ioread16_rep+0x2b/0x3d SS:ESP 0068:c0507dec
CR2: 00000000fffb1000
--- [ end trace ff6525aa4bcccf2c ]---

Intermittent faults are always a pain, but this morning I did a cold boot and left the GRUB menu up for about 30 seconds before selecting PCLinuxOS, and it booted perfectly.  Could this perhaps indicate that the BIOS was sorting itself out - but slowly?

Offline djohnston

Re: Intermittent boot problem - kernel panic
« Reply #3 on: December 15, 2011, 10:53:12 AM »
Quote
Intermittent faults are always a pain, but this morning I did a cold boot and left the GRUB menu up for about 30 seconds before selecting PCLinuxOS, and it booted perfectly.  Could this perhaps indicate that the BIOS was sorting itself out - but slowly?

I doubt it. The usual culprits are a faulty boot medium, a faulty RAM module, or the power supply. Do a net search for "Kernel panic - not syncing: Fatal exception" and you'll see the possibilities. If the motherboard battery were going, you would lose the BIOS settings, such as the system clock's time and date. If you are shutting down the PC at night, I'd start by running overnight memory tests. Instead of shutting down, reboot from a PCLinuxOS live CD, select MemTest86 from the boot menu, and let it run. That would start a process of elimination.

And, you're right. Intermittent hardware problems can be a pain to track down.

Offline pags

  • Hero Member
  • *****
  • Posts: 2519
  • Keep it clean.
Re: Intermittent boot problem - kernel panic
« Reply #4 on: December 15, 2011, 01:40:00 PM »
Runs OK from cold... dies randomly after it has been running for a while?  Is this a reasonable summary?

Could be cards/chips (RAM) not seated or dirty... heat expansion may mitigate the issue (hence, OK when cold)...

... :(

Warning! Danger, Will Robinson! Completely disassemble, clean all contacts (isopropyl - LET IT DRY!, or "red" pencil eraser - MAKE SURE IT'S CLEAN AFTER!) and re-seat all cards, chips, ribbon/cable connectors, etc...

No guarantees, but it's a starting point.

Offline peter_pclos

Re: Intermittent boot problem - kernel panic
« Reply #5 on: December 15, 2011, 06:01:41 PM »
Thanks for the various recent replies.  I've managed to get a bit more diagnostic information which precedes the error messages I reported earlier.  It was copied manually from the monitor from a recent boot crash, but I think it's accurate:

Quote
[<c0104b55>] ? do_IRQ+0x46/0x9a
[<c0103bb0>] ? common_interrupt+0x30/0x38
[<c019136b>] ? page_cache_get_speculative+0xc/0x30
[<c0191cad>] ? find_get_page+0x53/0x7a
[<c0192475>] ? filemap_fault+0x7b/0x343
[<c01a71c3>] ? __do_fault+0x41/0x2f6
[<c01a7765>] ? handle_mm_fault+0x2ed/0x664
[<c03a036d>] ? do_page_fault+0x2b2/0x2c8
[<c03a00bb>] ? do_page_fault+0x0/0x2c8
[<c039e7e3>] ? errorcode+0x73/0x78

I'm not by any stretch of the imagination a systems guru, so if anyone can see anything here which might indicate the source of the intermittent booting problem, I'd be grateful to hear from them.

To clarify what's happening, the symptoms are that it is difficult to achieve the initial boot, but after success (usually involving a delay in selecting the GRUB menu option) the PC is perfectly stable (over several hours) and behaves completely normally.

Since starting this thread, I've become aware of discussion of a similar problem under the thread "Boot process still erratic" on the PCLinuxOS forum at
http://www.pclinuxos.com/forum/index.php?topic=94269.0 and tomorrow will try texstar's suggestions there to see where I get to, but don't let that stop anybody with good ideas from replying to this post!

Offline AS

  • Hero Member
  • *****
  • Posts: 4111
  • Have a nice ... night!
Re: Intermittent boot problem - kernel panic
« Reply #6 on: December 15, 2011, 07:06:53 PM »
Quote
Thanks for the various recent replies.  I've managed to get a bit more diagnostic information which precedes the error messages I reported earlier.  It was copied manually from the monitor from a recent boot crash, but I think it's accurate:

Quote
[<c0104b55>] ? do_IRQ+0x46/0x9a
[<c0103bb0>] ? common_interrupt+0x30/0x38
[<c019136b>] ? page_cache_get_speculative+0xc/0x30
[<c0191cad>] ? find_get_page+0x53/0x7a
[<c0192475>] ? filemap_fault+0x7b/0x343
[<c01a71c3>] ? __do_fault+0x41/0x2f6
[<c01a7765>] ? handle_mm_fault+0x2ed/0x664
[<c03a036d>] ? do_page_fault+0x2b2/0x2c8
[<c03a00bb>] ? do_page_fault+0x0/0x2c8
[<c039e7e3>] ? errorcode+0x73/0x78

I'm not by any stretch of the imagination a systems guru, so if anyone can see anything here which might indicate the source of the intermittent booting problem, I'd be grateful to hear from them.


This is part of a "backtrace": roughly, the sequence of calls some process made up to the time of the crash. It's meant to help the people who developed the code work out the exact process flow when a crash occurs.

Unfortunately it lacks the first part, which usually states the name of the module/process that crashed. Even if it were present, we are not the developers of the code (most probably some kernel module), so don't worry much about it.  :D
The most useful information we can get here is the name of the process/module that crashed, just to have a hint about what went wrong.

It would also be interesting to know whether the crash always happens the same way, e.g. always when plymouth starts, or always when the X server starts.
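
For anyone who wants to pull the function names out of a hand-copied trace like the one quoted above, here is a minimal Python sketch. The regular expression and the parse_trace helper are illustrative assumptions, not part of any kernel tool; the sample lines are taken from this thread. A "?" before a symbol means the kernel found that address on the stack but could not confirm it is part of the real call chain.

```python
import re

# Matches entries such as "[<c0104b55>] ? do_IRQ+0x46/0x9a".
# Format: [<address>] [?] symbol+0xOFFSET/0xSIZE
TRACE_LINE = re.compile(
    r"\[<(?P<addr>[0-9a-fA-F]+)>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)

def parse_trace(text):
    """Return one dict per call-trace entry found in text (hypothetical helper)."""
    frames = []
    for line in text.splitlines():
        match = TRACE_LINE.search(line)
        if match is None:
            continue  # not a trace entry, e.g. the CR2 line
        frames.append({
            "address": int(match.group("addr"), 16),
            "symbol": match.group("symbol"),
            "offset": int(match.group("offset"), 16),
            "size": int(match.group("size"), 16),
            # Entries printed with "?" are unverified stack contents.
            "reliable": match.group("unsure") is None,
        })
    return frames

sample = """\
[<c0104b55>] ? do_IRQ+0x46/0x9a
[<c0103bb0>] ? common_interrupt+0x30/0x38
EIP: [<c026d323>] ioread16_rep+0x2b/0x3d SS:ESP 0068:c0507dec
CR2: 00000000fffb1000
"""

for frame in parse_trace(sample):
    marker = "" if frame["reliable"] else " (unverified)"
    print(f'{frame["symbol"]} at offset {frame["offset"]:#x}{marker}')
```

The EIP line is usually the most interesting entry, since it names the function that was executing when the exception hit.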

Quote
To clarify what's happening, the symptoms are that it is difficult to achieve the initial boot, but after success (usually involving a delay in selecting the GRUB menu option) the PC is perfectly stable (over several hours) and behaves completely normally.
Since starting this thread, I've become aware of discussion of a similar problem under the thread "Boot process still erratic" on the PCLinuxOS forum at
http://www.pclinuxos.com/forum/index.php?topic=94269.0 and tomorrow will try texstar's suggestions there to see where I get to, but don't let that stop anybody with good ideas from replying to this post!


Additionally, there have been reports in the past of crashes (even intermittent ones like yours) with some combinations of hardware and software,
e.g. older video cards, older CPUs, older kernels.

You state that your system is fully updated, but you don't mention the kernel, which doesn't update automatically.
What's your kernel? (Run uname -r from a terminal.)

AS

Offline peter_pclos

Re: Intermittent boot problem - kernel panic
« Reply #7 on: December 16, 2011, 11:16:10 AM »
In answer to AS, the kernel is:

2.6.32.24-pclos1.bfs

and I'm running the latest KDE4 from the pclos repository.

Other facts about the hardware:

Motherboard: Abit KT7A (non-RAID)
Processor: Athlon Thunderbird 1.4 GHz
RAM: 1GB Crucial DIMM (2x 0.5 GB units)
Graphics card:  Not sure of make, but is AGP (4 Meg I think)

All the above have been performing well for some years; do you reckon this spec. is still adequate?

Unfortunately I haven't had time yet to carry out the tips from the other thread.

Offline djohnston

Re: Intermittent boot problem - kernel panic
« Reply #8 on: December 16, 2011, 11:46:23 AM »
Quote
In answer to AS, the kernel is:

2.6.32.24-pclos1.bfs

To begin with, I would update to a newer kernel. That one's fairly old and will definitely cause problems with package updates in the long run. Open Synaptic and mark the 2.6.38.8 bfs kernel for installation. Apply the changes, then reboot immediately after closing Synaptic. The default GRUB option (the one highlighted) will boot the new kernel. The first boot will take a little longer because the kernel dkms modules have to be built. If you don't boot in verbose mode, press the Esc key when prompted on screen to watch the boot process. The dkms module builds only have to be done once, but should not be interrupted while in progress.

Leave the old kernel installed until you are sure you're satisfied with the new one.
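
After the reboot it is worth confirming which kernel is actually running. A minimal Python sketch of that check follows, assuming Python is installed; kernel_version_tuple is a hypothetical helper written for this illustration, and only the two version strings come from this thread.

```python
import platform
import re

def kernel_version_tuple(release):
    """Extract the leading dotted numbers from a kernel release string."""
    match = re.match(r"\d+(?:\.\d+)*", release)
    if match is None:
        raise ValueError(f"no version number in {release!r}")
    return tuple(int(part) for part in match.group(0).split("."))

# Version strings mentioned in this thread.
old = kernel_version_tuple("2.6.32.24-pclos1.bfs")
wanted = (2, 6, 38, 8)
print("old kernel is older than 2.6.38.8:", old < wanted)

# platform.release() reports the running kernel's release string,
# the same value that `uname -r` prints in a terminal.
try:
    running = kernel_version_tuple(platform.release())
    print("running kernel is at least 2.6.38.8:", running >= wanted)
except ValueError:
    print("could not parse release string:", platform.release())
```

Tuples compare element by element, so 2.6.32.24 sorts before 2.6.38.8 without any string tricks.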

Offline DeBaas

  • Hero Member
  • *****
  • Posts: 1522
    • PCLinuxOS.nl
Re: Intermittent boot problem - kernel panic
« Reply #9 on: December 16, 2011, 01:02:31 PM »
Just check your backup battery (life span 2-5 years).
On the first cold boot, enter the BIOS setup screen and check the time.
If it is erratic, change the battery, usually a CR2032.

After a first boot, or after booting Windows, the time is corrected and the next boot is OK.

Happened to me before ;)

Offline peter_pclos

Re: Intermittent boot problem - kernel panic
« Reply #10 on: December 19, 2011, 04:33:51 PM »
Thanks to all who have contributed to this thread.  I am cautiously optimistic that Texstar's advice in the "Boot process still erratic" thread has worked completely, as after following his instructions I have had four days of problem-free boots.  I suspect, from the circumstances behind that thread, and djohnston's comments here about kernels, that the problem may well have been down to running Synaptic updates on an old kernel.

I'll leave it for a couple of days yet before marking this thread as 'solved', in case my optimism turns out to have been misplaced!

Offline djohnston

Re: Intermittent boot problem - kernel panic
« Reply #11 on: December 19, 2011, 08:39:36 PM »
Quote
I am cautiously optimistic that Texstar's advice in the "Boot process still erratic" thread has worked completely, as after following his instructions I have had four days of problem-free boots.  I suspect, from the circumstances behind that thread, and djohnston's comments here about kernels, that the problem may well have been down to running Synaptic updates on an old kernel.

Well, Texstar's advice referred to running a filesystem check and turning off speedboot. I'm guessing that's what you've done. Filesystem errors can certainly cause more errors than an older kernel.

In any case, you did some of your own research and problem solving. Nice going!


Offline peter_pclos

Re: Intermittent boot problem - kernel panic
« Reply #12 on: December 20, 2011, 04:47:13 AM »
Thanks, djohnston.  But, what caused the filesystem error?  My own experience and the existence of the other thread suggest that it's update related.

BTW, is fsck machine-specific?  i.e. if I fsck a bootable USB SSD, which I use as an external PCLinuxOS OpenBox platform driving a Linutop2, on my main desktop machine, will this cause compatibility problems when I re-attach the SSD to the Linutop?  I ask because my Linutop has also started exhibiting serious boot problems, and I'm not sure I can get PCLinuxOS/OpenBox up to fsck the SSD in situ.

The desktop machine still boots!  If it is still behaving properly tomorrow, I'll mark this thread as solved.

Offline AS

Re: Intermittent boot problem - kernel panic
« Reply #13 on: December 20, 2011, 06:11:19 AM »
Quote
Thanks, djohnston.  But, what caused the filesystem error?  My own experience and the existence of the other thread suggest that it's update related.

The filesystem inconsistency was very probably due to the unclean shutdown that followed the kernel panic.
When a kernel panic occurs, some data (and/or metadata) may still be in cache/buffers (RAM) and will never be written to disk.

Quote
BTW, is fsck machine-specific?  i.e. if I fsck a bootable USB SSD, which I use as an external PCLinuxOS OpenBox platform driving a Linutop2, on my main desktop machine, will this cause compatibility problems when I re-attach the SSD to the Linutop?  I ask because my Linutop has also started exhibiting serious boot problems, and I'm not sure I can get PCLinuxOS/OpenBox up to fsck the SSD in situ.

No, fsck automatically detects the filesystem type and works accordingly, i.e. it's able to clean up and fix ext2/ext3/ext4 and others ...
If a filesystem type is not supported, fsck will tell you so and will not perform any operation.

You can safely use the PCLinuxOS fsck to check the Linutop filesystem; just make sure you only run fsck on unmounted partitions.
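
One way to confirm a partition is not mounted before running fsck on it is to look for it in /proc/mounts. Here is a minimal Python sketch of that check; is_mounted is a hypothetical helper and the device names in the sample are made up for illustration. It compares device names as plain strings, so it will miss a partition that the kernel lists under another name (for example /dev/root or a by-UUID path).

```python
def is_mounted(device, mounts_text):
    """Return True if device is a mount source in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # The first field of each line is the mounted device.
        if fields and fields[0] == device:
            return True
    return False

# Sample in /proc/mounts format; the device names are assumptions.
sample_mounts = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /media/usb ext3 rw,nosuid,nodev 0 0
"""

print(is_mounted("/dev/sdb1", sample_mounts))  # True: unmount before fsck
print(is_mounted("/dev/sdc1", sample_mounts))  # False
```

On a live system the text would come from open("/proc/mounts").read() rather than a sample string.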

Quote
The desktop machine still boots!  If it is still behaving properly tomorrow, I'll mark this thread as solved.

Good!  ;)

AS

Offline djohnston

Re: Intermittent boot problem - kernel panic
« Reply #14 on: December 20, 2011, 09:44:31 AM »

Quote
No, fsck automatically detects the filesystem type and works accordingly, i.e. it's able to clean up and fix ext2/ext3/ext4 and others ...
If a filesystem type is not supported, fsck will tell you so and will not perform any operation.

You can safely use the PCLinuxOS fsck to check the Linutop filesystem; just make sure you only run fsck on unmounted partitions.


To add to what AS said, the only filesystem I've encountered so far that the fsck -f command doesn't perform any operations on is xfs. Even then, fsck reported which xfs utility to use. Also, heed AS's advice and never run a filesystem check on a mounted partition. That's asking for trouble.