Dear managers,
     Our department has an AlphaServer 4100. We upgraded it to DecUnix 4.0F almost a month ago. In recent weeks, however,
many users have complained that the system is too slow. On checking the system, I found that it was losing CPU resources. This has
happened more than ten times. Below is the trace log from one of these occurrences.
     In the "System Info" panel, I found that Activity had reached 100%.
(Embedded image moved to file: pic05095.pcx)
     But in the "CPU Info" panel, I found that most of the CPU resources were being used by System, with only a little used by User.
(Embedded image moved to file: pic09306.pcx)  (Embedded image moved to file: pic18826.pcx)
     "vmstat" shows the same result:
> vmstat 1
Virtual Memory Statistics: (pagesize = 8192)
  procs    memory         pages                          intr        cpu
  r  w  u  act  free wire fault cow zero react pin pout  in  sy  cs  us  sy  id
 14 343 31  337K 118K  58K 3559M 947M 716M 476K 793M  71K 165 12K  2K  47  18  36
 14 343 31  338K 117K  58K  19K 5230 4290    0 4155    0  70 18K  2K  24  76   0
 13 344 31  338K 118K  58K  18K 4911 3939    0 3878    0 224 22K  2K  20  80   0
 13 345 31  338K 117K  58K  39K  10K 8367    0 8516    0  44 22K  2K  20  80   0
 15 342 31  338K 117K  58K  19K 5195 4140    0 4105    0  91 20K  2K  19  81   0
 14 343 31  338K 117K  58K  18K 4973 3983    0 3941    0 104 20K  2K  26  74   0
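     The samples above can also be checked by script. The following is only a minimal sketch, assuming the column layout shown above (the last three fields of each data line are us, sy and id). The function names and the 70% threshold are my own illustrative choices, not part of any system tool.

```python
# Sketch: flag vmstat samples where system time dominates.
# Assumes the last three whitespace-separated fields are us, sy, id,
# as in the output above. Names and threshold are hypothetical.

SAMPLE = """\
  procs    memory         pages                          intr        cpu
  r  w  u  act  free wire fault cow zero react pin pout  in  sy  cs  us  sy  id
 14 343 31  337K 118K  58K 3559M 947M 716M 476K 793M  71K 165 12K  2K  47  18  36
 14 343 31  338K 117K  58K  19K 5230 4290    0 4155    0  70 18K  2K  24  76   0
 13 344 31  338K 118K  58K  18K 4911 3939    0 3878    0 224 22K  2K  20  80   0
 13 345 31  338K 117K  58K  39K  10K 8367    0 8516    0  44 22K  2K  20  80   0
 15 342 31  338K 117K  58K  19K 5195 4140    0 4105    0  91 20K  2K  19  81   0
 14 343 31  338K 117K  58K  18K 4973 3983    0 3941    0 104 20K  2K  26  74   0
"""


def parse_cpu_fields(line):
    """Return (us, sy, id) from one vmstat data line, or None for other lines."""
    fields = line.split()
    if len(fields) < 3:
        return None
    try:
        us, sy, idle = (int(f) for f in fields[-3:])
    except ValueError:
        # Header lines end in non-numeric text such as "us sy id".
        return None
    return us, sy, idle


def system_bound_samples(lines, threshold=70):
    """Return the (us, sy, id) tuples whose system time is >= threshold."""
    result = []
    for line in lines:
        cpu = parse_cpu_fields(line)
        if cpu is not None and cpu[1] >= threshold:
            result.append(cpu)
    return result


if __name__ == "__main__":
    for us, sy, idle in system_bound_samples(SAMPLE.splitlines()):
        print(f"user={us}% system={sy}% idle={idle}%")
```

     The first data line of vmstat output is the average since boot, so it falls below the threshold here and is not reported.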
     I wanted to find out which process was consuming the CPU resources at the system level. But I knew immediately that I could not
find it by checking the %CPU column in the output of the "ps" command, because it only shows user-level usage.
     After checking the processes one by one, I found that one of our applications (AP) was in an abnormal state: it had hung because
of an FTP failure. I killed the parent process of the process below. The parent process had been hung for many hours.
ebopr1    24626  0.0  0.0 1.78M 160K ??       R    15:01:57     0:00.01 sh -c echo `date '+%m/%d %H:%M:%S'` rgt_81459908230_160a.ftp:Broken pipe, disk full or transmission failed >> /pd1/tmp/rgtftp.log
     My questions are:
     1. Why was the abnormal process not killed, even many hours after the FTP failure?
     2. Is this related to the upgrade from 4.0E to 4.0F?
     3. Is there an easy way to show which process is consuming CPU resources at the system level, so that I can quickly kill it?
Best regards,
JFYeh(jfyeh_at_tsmc.com.tw)
Received on Fri Sep 10 1999 - 03:10:51 NZST