The definitive answer came first from John Lanier:
The ksh problem is fixed in 5.1B Patch Kit 2 (pk#2) and later.
Here is a description:
=======================
/usr/bin/ksh can use up to 100% CPU Time
A ksh process does not terminate if a user closes a telnet session abruptly (for example,
by using the "X" in the upper right corner of the window). The process continues to run
and can use up to 100% of the CPU on which it is running.
This problem occurs when trap(1) is defined in either a startup script or a script
executed within the current shell process.
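For anyone unsure whether their setup matches the trigger condition, here is a minimal sketch of the kind of trap the description refers to, as it might appear in a startup script such as .profile. The handler name on_hangup and the exit status are illustrative assumptions, not taken from the patch notes. Having the handler exit explicitly on HUP is ordinary defensive practice; it has not been confirmed as a workaround for this bug, and the patch kit remains the documented fix.

```shell
# Hypothetical startup-script fragment; nothing here comes from the patch notes.

# on_hangup is an illustrative handler name. It leaves the shell explicitly
# when the terminal goes away instead of returning to the shell.
on_hangup() {
    # Site-specific cleanup (temporary files and so on) would go here.
    exit 129    # 128 + 1, the conventional status for termination by SIGHUP
}

# Any trap like this in a startup script satisfies the condition named in
# the problem description above.
trap on_hangup HUP
```

Running trap with no arguments in an interactive shell lists the traps currently installed, which is a quick way to check whether a login shell has any defined.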
Thanks to:
James Sainsbury
Rafael Visser
John Lanier
Bryan Mills
Johan Brusche
Martin Petder
Thomas Rohr Pedersen
Original question:
Dear gurus,
I've got an AlphaServer 4100 running Tru64 5.1B, Patch Kit 1.
Every few days we get a runaway ksh process.  top reports the following:
load averages:  1.94,  2.89,  3.07                                                        19:45:13
124 processes: 8 running, 48 sleeping, 67 idle, 1 zombie
CPU states: 18.3% user,  0.0% nice, 81.6% system,  0.0% idle
Memory: Real: 324M/486M act/tot  Virtual: 1991M use/tot  Free: 12M
  PID USERNAME PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
452737 root      51    0 2576K  368K run    17.9H 61.30% ksh
448455 root       3  -82   13M  614K sleep 213:33 10.20% dmct
  2138 root       4  -80   11M  466K sleep 605:07  6.00% ccdt
  2322 root      44    0   13M  647K run    45:07  5.70% dtrc
  1830 root       5  -78   13M  548K sleep 475:43  4.80% dprd
340897 root       6  -76   21M  688K sleep   3:00  3.10% dasc
  1589 root      29  -30   12M   10M sleep   9:54  2.10% cmds
 28015 root      10  -68   13M  557K sleep   5:13  1.20% dafc
 50460 root      44    0 4680K 1802K run     0:00  0.60% top
  1444 root      44    0   12M 5758K run     9:18  0.40% Xdec
  2349 root      23  -42 2864K  311K sleep  23:15  0.40% dfb2
  2343 root      23  -42 2864K  311K sleep  18:38  0.30% dfb1
316028 root       7  -74   14M 1351K sleep   6:47  0.10% hmists
448406 root      12  -64   13M  598K sleep   4:36  0.10% dcds
  1906 root      44    0   11M 1449K sleep   4:24  0.10% dtterm
When I look for its parent, I get the following:
# ps -ef | grep 452737
root      51980   5429  0.0 20:07:49 pts/6        0:00.01 grep 452737
root     452737      1 73.3   Feb 08 pts/11      18:08:01 -ksh (ksh)
Can anyone suggest how I might find the origin of this runaway ksh process?
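A note on reading the ps output above: the third column is the parent PID, and a value of 1 means the process has been reparented to init, so the original parent has already exited. The remaining clues are the terminal (pts/11) and the start date (Feb 08). The sketch below wraps that check in small helpers. The function names are hypothetical, and it assumes a ps that accepts the POSIX -o format specifiers, which should be verified against the local ps man page.

```shell
# Hypothetical helpers; assumes ps supports the POSIX -o and -p options.

# Print the parent PID of the given process (empty if it no longer exists).
parent_of() {
    ps -o ppid= -p "$1" | tr -d ' '
}

# Succeed if the process has been reparented to init (PID 1), meaning its
# original parent has already exited.
is_orphaned() {
    [ "$(parent_of "$1")" = "1" ]
}

# Show owner, terminal, elapsed time and CPU share for one process.
describe_proc() {
    ps -o pid,ppid,user,tty,etime,pcpu,args -p "$1"
}
```

For the process in question this would be used as: is_orphaned 452737 && describe_proc 452737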
Thanks,
Jim
Received on Tue Feb 10 2004 - 13:13:29 NZDT