Greetings. We have an ES40 running Tru64 UNIX V5.1A. Periodically it panics
with a "kernel memory fault", and output like the following is logged in
the messages file:
Jul 18 00:16:51 vmunix: trap: invalid memory read access from kernel mode
Jul 18 00:16:51 vmunix:     faulting virtual address: 0x0ffffc01ee31aa60
Jul 18 00:16:51 vmunix:     pc of faulting instruction: 0xfffffc000070c2cc
Jul 18 00:16:51 vmunix:     ra contents at time of fault: 0xfffffc000070b7d4
Jul 18 00:16:51 vmunix:     sp contents at time of fault: 0xfffffe068a47f1a0
Jul 18 00:16:51 vmunix: panic (cpu 0): kernel memory fault
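For anyone wanting to compare these values across several panics, a small
script along the following lines could pull the pc and ra out of the messages
file. This is only a sketch in Python: the function name fault_sites and the
sample text are illustrative, and the patterns assume the field labels appear
exactly as in the excerpt above, with one pc line and one ra line per panic.

```python
import re

# Field labels as they appear in the panic excerpt. \s+ also matches a
# newline, so an address wrapped onto the following line is still found.
PC_RE = re.compile(r"pc of faulting instruction:\s+(0x[0-9a-fA-F]+)")
RA_RE = re.compile(r"ra contents at time of fault:\s+(0x[0-9a-fA-F]+)")

def fault_sites(log_text):
    """Return the distinct (pc, ra) pairs found in log_text.

    Assumption: every panic logs exactly one pc line and one ra line,
    so the two lists can be paired up by position.
    """
    pcs = PC_RE.findall(log_text)
    ras = RA_RE.findall(log_text)
    return set(zip(pcs, ras))

# Sample text copied from the excerpt above (hypothetical input for the sketch).
sample = """\
Jul 18 00:16:51 vmunix:     pc of faulting instruction:
0xfffffc000070c2cc
Jul 18 00:16:51 vmunix:     ra contents at time of fault:
0xfffffc000070b7d4
"""

sites = fault_sites(sample)
print(sites)
```

A single pair in the result would mean every panic in the input faulted at
the same place.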
The pc and ra are always the same. I used dbx to find the source of the
problem, and got this:
(dbx) 0xfffffc000070c2cc/i
  [tu_receive_int:5049, 0xfffffc000070c2cc]     ldl     t0, 0(s1)
(dbx) 0xfffffc000070b7d4/i
  [tuintr:4506, 0xfffffc000070b7d4]     ldq_u   zero, 0(sp)
which seems to implicate a NIC. A look at interfaces shows that one of
them has lots of input errors, while others have none:
Name  Mtu   Network   Address            Ipkts   Ierrs  Opkts    Oerrs  Coll
tu1   1500  <Link>    00:06:2b:00:2d:79  633559  2131   1794775  11     0
tu1   1500  <ip>      <hostname>         633559  2131   1794775  11     0
and a look at "netstat -s -I tu1" shows:
            2131 receive failures, reasons include:
                      1160 frame check sequence errors
                       971 frame error
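As a rough sanity check on those counters, the arithmetic can be sketched in
Python as below. The numbers are copied from the output above; the variable
and function names are just illustrative, and this says nothing about what
error rate is acceptable for this driver or link.

```python
# Counter values copied from the netstat output above for tu1.
ipkts = 633559        # input packets
ierrs = 2131          # input errors
fcs_errors = 1160     # frame check sequence errors
frame_errors = 971    # frame errors

def error_rate(errors, packets):
    """Return errors as a percentage of packets (0.0 if there are no packets)."""
    return 100.0 * errors / packets if packets else 0.0

# The two failure reasons should account for every input error on tu1.
assert fcs_errors + frame_errors == ierrs

rate = error_rate(ierrs, ipkts)
print(f"tu1 input error rate: {rate:.2f}%")
```

The two failure reasons add up to the Ierrs figure exactly, so all of the
input errors on tu1 are frame-level receive failures.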
What is the next step in debugging this? Should I be talking to the
people who manage the switch the server is attached to? Should we look
at cable lengths? Am I chasing a red herring here? 
TIA for thoughts and guidance!
Judith Reed
Received on Wed Jul 19 2006 - 20:15:40 NZST