Dear managers,
Our department has an AlphaServer 4100. We upgraded it to Digital UNIX 4.0F almost a month ago. But in recent weeks,
many users have complained that the system is too slow. Checking the system, I found that it was losing CPU resources. This has
happened more than ten times. Below is the trace log from one occurrence.
In the "System Info" panel, I found that Activity had reached 100%.
(Embedded image moved to file: pic05095.pcx)
But in the "CPU Info" panel, I found that most of the CPU was being used by System, and only a little by User.
(Embedded image moved to file: pic09306.pcx) (Embedded image moved to file: pic18826.pcx)
The "vmstat" command shows the same result:
> vmstat 1
Virtual Memory Statistics: (pagesize = 8192)
  procs        memory                     pages                     intr           cpu
 r   w  u   act  free  wire fault   cow  zero react   pin  pout   in   sy   cs  us  sy  id
14 343 31  337K  118K   58K 3559M  947M  716M  476K  793M   71K  165  12K   2K  47  18  36
14 343 31  338K  117K   58K   19K  5230  4290     0  4155     0   70  18K   2K  24  76   0
13 344 31  338K  118K   58K   18K  4911  3939     0  3878     0  224  22K   2K  20  80   0
13 345 31  338K  117K   58K   39K   10K  8367     0  8516     0   44  22K   2K  20  80   0
15 342 31  338K  117K   58K   19K  5195  4140     0  4105     0   91  20K   2K  19  81   0
14 343 31  338K  117K   58K   18K  4973  3983     0  3941     0  104  20K   2K  26  74   0
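As a rough sketch only (not part of the original trace), the samples above could be scanned for intervals dominated by system time. The function name high_system_samples and the 70% threshold are assumptions chosen for illustration; the sample rows are copied from the vmstat output above.

```python
# Rows copied from the "vmstat 1" output above (18 columns each).
SAMPLE = """\
r w u act free wire fault cow zero react pin pout in sy cs us sy id
14 343 31 337K 118K 58K 3559M 947M 716M 476K 793M 71K 165 12K 2K 47 18 36
14 343 31 338K 117K 58K 19K 5230 4290 0 4155 0 70 18K 2K 24 76 0
13 344 31 338K 118K 58K 18K 4911 3939 0 3878 0 224 22K 2K 20 80 0
13 345 31 338K 117K 58K 39K 10K 8367 0 8516 0 44 22K 2K 20 80 0
15 342 31 338K 117K 58K 19K 5195 4140 0 4105 0 91 20K 2K 19 81 0
14 343 31 338K 117K 58K 18K 4973 3983 0 3941 0 104 20K 2K 26 74 0
"""


def high_system_samples(text, threshold=70):
    """Return (us, sy, id) tuples for rows whose system CPU % is >= threshold.

    The threshold is a hypothetical cutoff, not a vendor recommendation.
    """
    flagged = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers: data rows here have 18 fields and start with a number.
        if len(fields) != 18 or not fields[0].isdigit():
            continue
        # The last three columns are the cpu group: user, system, idle.
        us, sy, idle = (int(value) for value in fields[-3:])
        if sy >= threshold:
            flagged.append((us, sy, idle))
    return flagged


if __name__ == "__main__":
    for us, sy, idle in high_system_samples(SAMPLE):
        print(f"user={us}% system={sy}% idle={idle}%")
```

The first vmstat row is an average since boot, which is why it does not show the problem; the five one-second samples that follow all show 74% or more system time and no idle time.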
I wanted to find out which process was consuming the CPU at the system level, but I knew right away that I could not find it by
checking the %CPU column in the output of the "ps" command, because it only shows user-level usage.
After checking the processes one by one, I found that one of our applications (AP) was in an abnormal state. It had hung because
of an FTP failure. I killed the parent process of the process below. The parent process had been hung for many hours.
ebopr1 24626 0.0 0.0 1.78M 160K ?? R 15:01:57 0:00.01 sh -c echo `date '+%m/%d %H:%M:%S'` rgt_81459908230_160a.ftp:Broken pipe, disk full or transmission failed >> /pd1/tmp/rgtftp.log
My questions are:
1. Why was the abnormal process not killed for many hours after the FTP failure?
2. Is this related to the upgrade from 4.0E to 4.0F?
3. Is there an easy way to show which process is consuming CPU at the system level, so that I can kill it quickly?
Best regards,
JFYeh(jfyeh_at_tsmc.com.tw)
Received on Fri Sep 10 1999 - 03:10:51 NZST