SUMMARY: uninterruptible process weirdness

From: Richard A. Muirden <richard_at_RMIT.EDU.AU>
Date: Thu, 2 Feb 1995 12:22:28 +1100 (EDT)

Hi all,

A few days ago, I wrote:

------
Hi managers,

OSF/1 V3.0B (358.78), though I saw the same thing under V3.0 (347):

On our AlphaServer 2100 4/200, after some days of uptime, processes decide
to go into U (uninterruptible) state and can't be killed. A nasty side
effect is that this seems to happen to wall, so:

a) can't wall to people, it just hangs after accepting the message
b) stuffs up shutdown, which waits for wall to terminate, which doesn't,
   so you can't easily do a clean shutdown of the machine, let alone
   warn the users

However after a reboot the system is fine, and wall is fine and there
isn't any weirdness. It only seems to happen after a few days of uptime.
Permissions seem normal enough - can't figure it out.

Also we see a large number of defunct processes... is this normal
or another quirk?

Should I call this in to DEC as a software bug, or is it 'known'? It
seems kind of hard to prove...

-richard
-----

Since then I have received a number of replies (which are below) and, after
calling it in to DEC, a patch. I am going to try out the patch over the next
few days. As one of the replies to my message stated, this is a 'known'
problem with OSF/1: DEC had a ready-made patch for me to install. Basically
all the problems are related to the large number of defunct processes.
wall hangs because it is trying to write to these processes, which it
can't, and shutdown is going nanas because of wall. The patch hopefully
fixes the problem with the defunct processes.
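
For anyone wanting to check whether a machine is drifting into the same
state, the idea can be sketched roughly as below. This is only an
illustration in Python, not anything supplied by DEC: the function name
summarize_ps, the three-column "PID STATE COMMAND" layout and the sample
listing are all assumptions made up for the example, so adjust the parsing
to whatever your own ps actually prints.

```python
from collections import Counter

def summarize_ps(ps_output):
    """Count uninterruptible and defunct entries in ps-style text.

    Assumes (hypothetically) a header line followed by rows of the form
    PID STATE COMMAND; real ps output may use a different layout.
    """
    counts = Counter()
    stuck = []  # (pid, command) pairs for processes in a U state
    for line in ps_output.strip().splitlines()[1:]:  # skip the header
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # ignore rows that do not match the assumed layout
        pid, state, command = fields
        if state.startswith("U"):
            counts["uninterruptible"] += 1
            stuck.append((int(pid), command))
        if "<defunct>" in command:
            counts["defunct"] += 1
    return counts, stuck

# Made-up sample listing, for illustration only.
SAMPLE = """\
  PID S COMMAND
  101 S /usr/sbin/cron
  202 U wall
  303 Z <defunct>
  404 I -csh (csh)
  505 Z <defunct>
"""

if __name__ == "__main__":
    counts, stuck = summarize_ps(SAMPLE)
    print("uninterruptible:", counts["uninterruptible"])
    print("defunct:", counts["defunct"])
    print("stuck processes:", stuck)
```

Feeding it saved ps output once a day would at least show whether the
defunct count climbs steadily with uptime, which is the pattern described
above.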

--- summary email:

Date: Tue, 31 Jan 1995 10:51:57 -0500 (EST)
From: Saul Tannenbaum <stannenb_at_emerald.tufts.edu>

On Tue, 31 Jan 1995, Richard A. Muirden wrote:

> However after a reboot the system is fine, and wall is fine and there
> isn't any weirdness. It only seems to happen after a few days of uptime.
> Permissions seem normal enough - can't figure it out.
>
> Also we see a large number of defunct processes... is this normal
> or another quirk?
>
> Should I call this in to DEC as a software bug, or is it 'known'? It
> seems kind of hard to prove...

Wall: Please, please, report it. I did and they told me they never heard
of it and demanded that I provide them with either access to the machine
in that state or a crash dump before they would consider my report.

Defunct processes: There's a patch. I'm sorry I don't have a reference
to which one.
-- 
Saul Tannenbaum, Manager, Academic Systems | "It's still rocket  
                stannenb_at_emerald.tufts.edu |    science" - Vint Cerf
Tufts University Computing and             |
                Communications Services    |
----
Hi Richard,
this is a "me too" message. The problem with hanging wall etc.
is appearing on our sables too. We have the additional effect,
that our console is hanging too... We are using an old-fashioned
vt220 as console, normally working well...
We are *very* interested in a summary, if you learn something
about this.
Bernt
-------------------------------------------------------------------------
- Bernt Christandl / Max Planck Institut fuer Extraterrestrische Physik -
- D-85740 Garching  / Phone: +49/89/3299-3346 / Fax: +49/89/3299-3569   - 
- Internet: beb_at_mpe-garching.mpg.de                                     - 
-------------------------------------------------------------------------
-- 
Richard A. Muirden, Sys. Admin |Fan of Shostakovich, "Star Trek" and the Boeing
Mailto: richard_at_rmit.EDU.AU    |777 (launch: May 15, 1995 - United Airlines).  
Phone: (+61 3) 660 3814        |I created alt.fan.shostakovich! Fly: UA,QF,WN
http://www.rmit.edu.au/richard |Can *YOU* beat my 105 Shost CD's? :-)
  * 1995: Remembering 20 years since the death of Shostakovich (1906-75) *
                    Member of the Al Roker Fan Club!
Received on Wed Feb 01 1995 - 20:22:30 NZDT
