All...
Hope you can help. We have been doing some maintenance on one of our
production 8400's.
We added some additional KZPBA-CBs (connected to an EMC Symmetrix). All
went well. The system booted from genvmunix - lsm starts, all my devices
are visible and mount etc., etc.
Built a new kernel (and config file). Rebooted from the new kernel and the
system crashes mid-boot.
I did see the following displayed during the genvmunix (single user) boot -
but sort of ignored it:
Starting at 0xfffffc000047e9b0
contig_malloc: failed to allocate memory within addrlimit
contig_malloc: failed to allocate memory within addrlimit
contig_malloc: failed to allocate memory within addrlimit
The system came up to single user ok. Started lsm and mounted /usr -
generated a new kernel and booted from it.
Upon booting from the new kernel, everything seems to be proceeding ok
until:
TLMEM at node 7
TLMEM at node 6
TLMEM at node 5
TLMEM at node 4
Dual TLEP at node 3
Dual TLEP at node 2
Dual TLEP at node 1
Dual TLEP at node 0
lvm0: configured.
lvm1: configured.
trap: invalid memory read access from kernel mode
faulting virtual address: 0x0000027b00000005
pc of faulting instruction: 0xfffffc000026d618
ra contents at time of fault: 0xfffffc000026d5d0
sp contents at time of fault: 0xfffffffe9d8df7e0
panic (cpu 0): kernel memory fault
DUMP: No primary swap, no explicit dumpdev.
Nowhere to put header, giving up.
halted CPU 0
halt code = 5
HALT instruction executed
PC = fffffc00004b8130
P00>>>init
This was a consistent problem. However, booting multi-user from genvmunix
worked fine. The system came up ok - all applications started etc.
It was by this time 01:00am so we were going to leave it running genvmunix
and diagnose further tomorrow.
Only one problem - this system uses HSM software and the application has
near-line data on a TZ89-based tape silo. The application needs the tape silo
to work. The problem is that genvmunix does not seem to have media changer
support.
The system had been up for some 30 minutes or more. We were just wondering
about a workaround for this when the system crashed.
trap: invalid memory read access from kernel mode
faulting virtual address: 0x0000043e00000005
pc of faulting instruction: 0xfffffc000026b5e0
ra contents at time of fault: 0x0000000000000168
sp contents at time of fault: 0xfffffffea0c476d0
panic (cpu 1): kernel memory fault
syncing disks...
LSM attempting to dump to SCSI device unit number rz1
DUMP: 27468083 blocks available for dumping.
DUMP: 666546 wanted for a partial compressed dump.
DUMP: Allowing 4843182 of the 4847278 available on 0x800401
DUMP.prom: dev SCSI 0 3 0 1 100 0 0, block 409600
DUMP: Header to 0x800401 at 4847278 (0x49f6ae)
DUMP.prom: dev SCSI 0 3 0 1 100 0 0, block 409600
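For what it's worth, here is a quick way to line up the two faulting virtual addresses from the panics above. This is only a minimal Python sketch for comparing the numbers; the labels and the helper name split_halves are my own and do not come from any Tru64 tool, and the output by itself does not say which module (if any) is at fault.

```python
# Faulting virtual addresses copied from the two panics above.
# The labels are only descriptive; they are not produced by the system.
faults = {
    "new kernel, cpu 0": 0x0000027B00000005,
    "genvmunix, cpu 1": 0x0000043E00000005,
}


def split_halves(addr):
    """Return (upper 32 bits, lower 32 bits) of a 64-bit address."""
    return addr >> 32, addr & 0xFFFFFFFF


for label, addr in faults.items():
    upper, lower = split_halves(addr)
    print(f"{label}: upper=0x{upper:08x} lower=0x{lower:08x}")
```

Both addresses have the same lower half (0x00000005) and differ only in the upper half (0x27b and 0x43e). Neither is anywhere near the 0xfffffc... range where the pc values sit.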
Looks to me like a hard memory fault of some kind. Any idea how I decide
which memory module may be the faulty one? My config is as follows:
01:03:23 P00>>>show config
01:04:12
01:04:12 Name Type Rev Mnemonic
01:04:12 TLSB
01:04:12 0++ KN7CF-AB 8014 0000 kn7cf-ab0
01:04:12 1++ KN7CF-AB 8014 0000 kn7cf-ab1
01:04:12 2++ KN7CF-AB 8014 0000 kn7cf-ab2
01:04:12 3++ KN7CF-AB 8014 0000 kn7cf-ab3
01:04:12 4+ MS7CC 5000 4000 ms7cc0
01:04:12 5+ MS7CC 5000 4000 ms7cc1
01:04:13 6+ MS7CC 5000 0000 ms7cc2
01:04:13 7+ MS7CC 5000 0000 ms7cc3
01:04:13 8+ KFTHA 2000 0D03 kftha0
01:04:13
01:16:14 P00>>>sho mem
01:17:58 Set Node Size Base Address Intlv Position
01:17:59 --- ---- ---- -------- -------- ----- --------
01:17:59 A 4 4096 Mb 00000000 00000000 8-Way 0
01:17:59 A 5 4096 Mb 00000000 00000000 8-Way 1
01:17:59 B 6 2048 Mb 00000002 00000000 4-Way 0
01:17:59 B 7 2048 Mb 00000002 00000000 4-Way 1
01:17:59 P00>>>
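As a sanity check on the sho mem output above, the sizes and base addresses can be added up with a small Python sketch. The assumptions here are mine: that the two 8-digit groups in the Base Address column form one 64-bit hex value (upper half first), and that "Mb" means 2^20 bytes. The function name parse_rows is made up for illustration.

```python
# Rows copied from the `sho mem` output above, with timestamps dropped.
SHO_MEM = """\
A 4 4096 Mb 00000000 00000000 8-Way 0
A 5 4096 Mb 00000000 00000000 8-Way 1
B 6 2048 Mb 00000002 00000000 4-Way 0
B 7 2048 Mb 00000002 00000000 4-Way 1
"""

MB = 1024 * 1024  # assumption: "Mb" in the console output means 2**20 bytes


def parse_rows(text):
    """Parse each row into set name, node number, size in MB, and base address."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        rows.append({
            "set": fields[0],
            "node": int(fields[1]),
            "size_mb": int(fields[2]),
            # assumption: the two hex groups are the upper and lower 32 bits
            "base": int(fields[4] + fields[5], 16),
        })
    return rows


rows = parse_rows(SHO_MEM)
total_mb = sum(r["size_mb"] for r in rows)
set_a_mb = sum(r["size_mb"] for r in rows if r["set"] == "A")
set_b_base_mb = next(r["base"] for r in rows if r["set"] == "B") // MB

print(f"total memory: {total_mb} MB")
print(f"set A size:   {set_a_mb} MB")
print(f"set B base:   {set_b_base_mb} MB")
```

Under those assumptions the console sees 12288 MB in total, and set B starts at 8192 MB, exactly where set A ends, so the memory map itself looks consistent.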
Any help would be greatly appreciated.
Best regards - Tony
Received on Fri Feb 18 2000 - 01:35:55 NZDT