SUMMARY: Defunct processes from James Anderson on 1996-01-12 (tru64-unix-managers)

From: James Anderson <anderson_at_ocaxp1.cc.oberlin.edu>
Date: Thu, 11 Jan 1996 13:46:25 -0500 (EST)

  My thanks to:

  swatson_at_ultrix6.cs.csubak.edu
  trulsonj_at_mscd.edu
  matthewm_at_sgate.com
  yanowitz_at_eternity.cs.umass.edu
  svardhan_at_siwest.cts.com

  For the rundown on what defunct processes are and what can be done if they
linger. Here's a synopsis of what I found out; my thanks to Sean Watson for
this writeup. (They were all great writeups; I flipped a coin as to which one
to use...)

Jim

> Can someone give some insight into a defunct process:
>
> 1. What is it?
>
This is what happens when a process dies:
  (1) All open files, sockets, shared memory, etc. are released (hopefully)
  (2) All non-shared memory (data/text/bss segments) is freed
  (3) All child processes are reparented to pid 1 (init)
  (4) The exit value is stored in the process table.
  (5) A SIGCHLD is sent to the parent (pid 1 if its true parent has exited)

      (At this point, the only resource the process is taking up is its slot
       in the process table. The process is "defunct" (aka a "zombie")
       and may stay that way for an indefinite amount of time until...)

  (6) The parent (or pid 1) calls wait() (or waitpid(), etc.) and collects
      the exit value; the child's resource usage (cpu time, memory usage,
      etc.) is accounted against the parent, and the process slot is freed.

It's a process table entry that has not yet been reaped by its parent. The
actual process is dead and is not taking up any system resources -- other
than the process table entry.
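
To make steps (4) through (6) concrete, here is a minimal sketch in Python,
whose os module wraps the same fork()/waitpid() calls. The function name
zombie_demo and the 0.2 second pause are assumptions made for illustration;
they are not from the original write-up. It forks a child that exits at once,
checks that the pid still exists before the parent waits (signal 0 only tests
for existence), then reaps the child and checks that the pid is gone.

```python
import os
import time

def zombie_demo(exit_code=7):
    """Fork a child that exits at once; show its entry lingers until reaped."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)           # child: die immediately
    time.sleep(0.2)                   # give the child time to exit (now a zombie)
    os.kill(pid, 0)                   # signal 0 tests existence; raises if pid is gone
    existed_before_wait = True
    _, status = os.waitpid(pid, 0)    # step (6): reap and collect the exit value
    try:
        os.kill(pid, 0)
        exists_after_wait = True
    except ProcessLookupError:        # slot has been freed
        exists_after_wait = False
    return existed_before_wait, os.WEXITSTATUS(status), exists_after_wait
```

During the pause, ps on most systems should list the child as defunct (or
with state Z).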

> 2. What causes it?
>
A parent may be coded sloppily and not call wait when it should --or-- the
parent may be busy and not want to wait on the child --or-- the system
may be so busy the parent doesn't get to run for a while.

> 3. How to prevent it?
>
Write better code; run fewer processes; don't sweat it when it happens.
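
One common reading of "write better code" is to have the parent reap its
children from a SIGCHLD handler, so it never has to stop and wait. The sketch
below shows that approach under stated assumptions: the names reaped,
reap_children and spawn_with_reaper are hypothetical, and the polling loop
stands in for whatever real work the parent would be doing. The handler loops
because several children may exit before it gets to run.

```python
import os
import signal
import time

reaped = {}                               # pid -> exit value, filled in by the handler

def reap_children(signum, frame):
    """SIGCHLD handler: reap every child that has exited, without blocking."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:         # no children left at all
            return
        if pid == 0:                      # children remain, but none has exited
            return
        reaped[pid] = os.WEXITSTATUS(status)

def spawn_with_reaper(exit_code=3, timeout=5.0):
    """Fork a child and let the handler, not the main flow, do the wait."""
    previous = signal.signal(signal.SIGCHLD, reap_children)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(exit_code)           # child: exit at once
        deadline = time.monotonic() + timeout
        while pid not in reaped and time.monotonic() < deadline:
            time.sleep(0.01)              # stand-in for the parent's real work
        return reaped.get(pid)            # exit value, or None on timeout
    finally:
        signal.signal(signal.SIGCHLD, previous)   # restore the old disposition
```

Note that waitpid(-1, ...) reaps any child of the process, so this pattern
suits programs that do not wait on specific children elsewhere.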

> 4. How to kill them?
>
You can't kill a zombie directly; it is already dead. You can kill their
parents, which will force init (pid 1) to inherit them and wait on them
(probably promptly). waitpid() can only be called on a process's direct
children (not even grandchildren can be waited on), so you must either
force a parent to wait on its children or kill the parent.
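
The direct-children restriction is easy to check. In this sketch (the
function name wait_child_vs_grandchild is hypothetical, and a pipe is just one
convenient way to pass the grandchild's pid upward), the top-level process can
wait on its child, but waiting on the grandchild fails with ECHILD, which
Python raises as ChildProcessError.

```python
import os

def wait_child_vs_grandchild():
    """Return (child_waitable, grandchild_waitable) as seen from this process."""
    r, w = os.pipe()
    child = os.fork()
    if child == 0:                                  # in the child
        try:
            os.close(r)
            grandchild = os.fork()
            if grandchild == 0:                     # in the grandchild
                os._exit(0)
            os.write(w, str(grandchild).encode())   # report the pid upward
            os.waitpid(grandchild, 0)               # only the child may reap it
        finally:
            os._exit(0)
    os.close(w)
    grandchild = int(os.read(r, 32).decode())
    os.close(r)
    try:
        os.waitpid(grandchild, os.WNOHANG)
        grandchild_waitable = True
    except ChildProcessError:                       # ECHILD: not our child
        grandchild_waitable = False
    try:
        os.waitpid(child, 0)
        child_waitable = True
    except ChildProcessError:
        child_waitable = False
    return child_waitable, grandchild_waitable
```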

Mostly, it is safe to ignore zombies as long as you don't have thousands
of them. I seem to recall a server that listened at a socket; when it
got a connection, it forked a child to handle the connection, waited on
any children that had exited, and went back to listening at the socket. The
result was that there was always a single zombie, but the server
spent as much time as possible listening at its socket and only dealt with
waiting on children after the current connection was taken care of.

I think this was a little extreme, but then again, I don't think there was
anything particularly wrong with it.
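
For what it's worth, a server along those lines might look roughly like the
sketch below. This is a guess at the shape of the loop, not the remembered
server's actual code: the names reap_exited and serve, the "hello" reply, and
the max_connections bound (a real server would loop forever) are all
assumptions for illustration.

```python
import os
import socket

def reap_exited(children):
    """Reap, without blocking, every pid in `children` that has exited."""
    for pid in list(children):
        done, _ = os.waitpid(pid, os.WNOHANG)
        if done == pid:                   # 0 would mean "still running"
            children.discard(pid)

def serve(listener, max_connections):
    """Accept, fork a handler, reap finished handlers, go back to listening."""
    children = set()
    for _ in range(max_connections):
        conn, _ = listener.accept()
        pid = os.fork()
        if pid == 0:                      # child: handle this one connection
            try:
                listener.close()
                conn.sendall(b"hello\n")
                conn.close()
            finally:
                os._exit(0)
        children.add(pid)
        conn.close()                      # parent: the child owns the connection
        reap_exited(children)             # the newest child is often still running
    return children                       # pids not yet reaped
```

Because the reap happens right after each fork, the most recent handler is
typically left unreaped until the next connection arrives, which would
account for the lingering zombie.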

----
_______________________________________________________________________________
James C. Anderson                      PHONE: (216) 775-6929
Houck Computing Center                   FAX: (216) 775-8573
Oberlin College                        Email: anderson_at_ocaxp1.cc.oberlin.edu
Oberlin, OH 44074                  Home Page: http://www.oberlin.edu/~anderson/
_______________________________________________________________________________

Received on Thu Jan 11 1996 - 20:24:51 NZDT
