SUMMARY: Runaway processes?

From: Dan Kirkpatrick <dkirk_at_suhep.phy.syr.edu>
Date: Thu, 09 Apr 1998 09:44:13 -0400 (EDT)

Thanks to: Kurt Carlson <snkac_at_java.sois.alaska.edu>
for the following:

What I have is in
        ftp://raven.alaska.edu/pub/sois/README.uakpacct
        ftp://raven.alaska.edu/pub/sois/man.uaklogin
  kit: ftp://raven.alaska.edu/pub/sois/uakpacct-v2.1.tar.Z

There is a sample script there, examples/ua_killer.ksh using the
uaklogin program. uaklogin combines ps & w for process display
with terminal attributes (like idle time).

For example:
        uaklogin -command *dxterm -idle 1440
would show all dxterm users who've been idle for 1440 minutes (24 hours).
What it doesn't have at the moment is a -cpu filter.

If you just want a script, here's a start:

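#!/bin/ksh
# this script is digital unix specific
#
# List processes with 2 or more hours of CPU time. The grep keeps only
# netscape processes; change the pattern (or use -v) to suit.
integer HRS
ps -o user,pid,cputime,command -a |
grep -i netscape |
while read user pid cpu command
do
        min=${cpu%:*} # strip off seconds
        hrs=${min%:*} # strip off minutes
        # no second colon means no hours field, so well under 2 hours
        if [ "$hrs" = "$min" ]; then continue; fi
        HRS=$hrs
        # skip anything with less than 2 hours of CPU
        if [ 2 -gt $HRS ]; then continue; fi
        echo "$user $pid $cpu $command"
done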

You'd need to add a command to "kill $pid" and logging of whatever
nature you want (don't just kill it without keeping track somehow...
append it to a flat log, send an email message to somebody, whatever).
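
As a rough sketch of that kill-and-log step (not part of the original
script), something like the following could be called from inside the
loop. The function name log_and_kill, the LOGFILE path and the GRACE
delay are all made up for illustration; pick whatever suits your site.

```shell
# Hypothetical defaults: log location and seconds to wait after TERM.
LOGFILE=${LOGFILE:-/tmp/runaway_kill.log}
GRACE=${GRACE:-5}

# log_and_kill USER PID CPU COMMAND...
# Append a record to the flat log, then kill the process.
log_and_kill() {
        user=$1 pid=$2 cpu=$3
        shift 3
        # Log first, so there is a record even if the kill fails.
        echo "$(date) killing pid=$pid user=$user cpu=$cpu cmd=$*" >> "$LOGFILE"
        # Polite TERM first; escalate to KILL only if it is still there.
        kill "$pid" 2>/dev/null
        sleep "$GRACE"
        if kill -0 "$pid" 2>/dev/null; then
                kill -9 "$pid" 2>/dev/null
                echo "$(date) pid=$pid still present after TERM, sent KILL" >> "$LOGFILE"
        fi
}
```

In the loop, the echo line would become something like
log_and_kill "$user" "$pid" "$cpu" $command
(run it with the echo only for a while first, to see what it would hit).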

I'll probably add a -cpu filter to uaklogin (and therefore ua_killer.ksh),
as I've been wanting one, but I probably won't get to it immediately...
primarily because the ps CPU format varies by Unix implementation and I'd
want to identify all the Unixes I use (DU, Irix, Unicos) before releasing
that code. I'd have to check how DU represents days of CPU before knowing
whether the above would handle them.
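
For what it's worth, here is a sketch of how the days case might be
handled, ASSUMING the ps in question prints CPU time in the POSIX
"[dd-]hh:mm:ss" layout. That is an assumption, not something checked on
DU, Irix or Unicos, and the function name cpu_hours is made up.

```shell
# cpu_hours TIME
# Print whole hours of CPU for a ps time string.
# Assumes "[dd-]hh:mm:ss"; anything without two colons (for example a
# bare "mm:ss" or the header word) counts as 0 hours.
cpu_hours() {
        t=$1
        days=0
        case $t in
        *-*)    days=${t%%-*}; t=${t#*-} ;;     # split off the "dd-" part
        esac
        case $t in
        *:*:*)  hrs=${t%%:*} ;;                 # first field is hours
        *)      hrs=0 ;;
        esac
        # Drop one leading zero so "08" is not read as octal.
        days=${days#0}; hrs=${hrs#0}
        echo $(( ${days:-0} * 24 + ${hrs:-0} ))
}
```

The threshold test in the loop could then be written as
if [ "$(cpu_hours "$cpu")" -lt 2 ]; then continue; fi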

>On Mon, 6 Apr 1998, Kurt Carlson wrote:
>
>> >Does anyone have a simple script I can put in crontab to kill runaway
>> >processes?
>> >
>> >How does one specifically identify runaway processes other than CPU time
>> >being unreasonable and using 99% CPU cycles?
>>
>> Therein lies the problem... how does one reliably determine run-a-way
>> processes.
>>
>> >I want to be able to automatically kill netscape processes that have more
>> >than 2 hrs of CPU time but am not an avid shell programmer.
>>
>> I have a script & program which can take filters (such as command names
>> or userids, more typically userid excludes) to kill processes...
>> it's presently being used for killing certain idle users.
>> It does not include a CPU threshold as an option right now, but that's
>> something I could probably add (C code does the actual filtering).
>> The threshold would have to be some amount consumed (like 2 hours as
>> you suggest) not a rate and could be and'd with a command name.
>>
>> If you don't get another option let me know and I might add it.
>> Please summarize regardless. kurt
>>
>>
>> btw, more important is to fix the actual problem... *why* is it a
>> run-a-way. you can't fix netscape most likely, but i've seen several
>> run-a-ways caused by applications which issue gets to terminals
>> and don't check for errors (e.g., terminal has disconnected) and
>> then keep issuing gets repetitively... makes for a nasty system-cpu
>> intensive loop.
>>
>> _____________________________________________________________________
>> Kurt Carlson University of Alaska, ARSC snkac_at_java.sois.alaska.edu
>> (907)474-5763 910 Yukon Drive #108.84 Fairbanks, AK 99775-6200
>>
>>
>>


- Dan Kirkpatrick - dkirk_at_physics.syr.edu -
| Systems Administrator |
| Physics Department |
| Syracuse University, Syracuse, NY |
-------------------------------------------
Received on Thu Apr 09 1998 - 15:46:06 NZST
