I've got a two-member cluster running V5.1A -- two DS10s connected via
memory channel. The other night, one member crashed and rebooted with
this message in the log:
vmunix: panic (cpu 0): ics_unable_to_make_progress: input thread stalled
vmunix: syncing disks.
Now and then the ps command produces some garbage in its output:
# ps
PID TTY S TIME CMD
1316826 console ???+ ?? /usr/sbin/getty console console vt100
1462001 pts/1 ???+ ?? -sh (sh)
1473837 pts/1 ???+ ?? ps
1451713 pts/4 ???+ ?? /usr/local/admin/robodump/robodump -0 -f keck1
1453136 pts/4 ???+ ?? /usr/local/admin/robodump/robodomo -m dumpmast
1460300 pts/4 ???+ ?? -sh (sh)
1475184 pts/4 ???+ ?? /sbin/vdump -0 -u -b 64 -f /dev/nrmt3h -U /kec
1475330 pts/4 ???+ ?? /usr/local/admin/robodump/podump -0 -e 0 -f ke
Why are there question marks in the S and TIME columns? The other
member shows:
# ps
PID TTY S TIME CMD
525908 console I + 0:00.01 /usr/sbin/getty console console vt100
598401 pts/0 S + 0:00.04 -sh (sh)
598866 pts/0 R + 0:00.02 ps
I notice that the member with the messed-up ps output has high memory
usage, and vmstat -P indicates that most of the memory is in malloc pages.
Does anyone have any idea what the kernel panic means or why the ps
output is messed up? The system is basically running mail (sendmail),
web (Apache) and Samba services.
Dirk Kleinhesselink
System and Network Administrator
Keck Center for Integrative Neuroscience
UCSF
Received on Wed Jun 19 2002 - 18:54:16 NZST