DECnsr (NetWorker) 4.3 problems

From: Knut Hellebø <Knut.Hellebo_at_nho.hydro.com>
Date: Fri, 30 Jan 1998 16:04:30 +0100

NetWorkers,

Please HELP me out on this!!!

We have two DEC 3000 servers configured in an ASE cluster (running 4.0B and
TruCluster 1.4A).

One of the ASE hosts runs NetWorker (DEC version 4.3-05/13) and is
responsible for backing up all the NFS services and other clients. The
problem is that the backup halts "from time to time". Yes, I know that such
a phrase is not very specific, but this is the very nature of the errors too.
I have checked with DIA (v2.6), no errors. I have checked with 'tapex' on
the DLT 877 autochanger in question (firmware DEC v6.0), no errors. I have
tried edX (one of DEC's tools, called Error Detection eXpert), no errors. I
have also tried the NetWorker tool tapexercise, of course no errors. In
addition to this I have also tried rebooting both cluster servers. The
systems apparently run fine when no backup is running.
Pretty soon my head will fall off if I don't get any answers.

Here's a snippet from /nsr/logs/daemon.log showing what happens (these are
the "standard" "has been inactive for xx minutes since .." messages):

* bgnfs6:/var/ase/mnt/bgnfs6/home/bgnfs6/dr3 has been inactive for 63
minutes since Thu Jan 29 15:07:22 1998.
* bgnfs6:/var/ase/mnt/bgnfs6/home/bgnfs6/dr3 is being abandoned by asavegrp.
 1/29/98 16:10:17 savegrp: bgnfs6:/var/ase/mnt/bgnfs6/home/bgnfs6/dr3 will
retry 0 more times
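
Since the halts are intermittent, it may help to pull every inactivity
message out of daemon.log and see which save sets stall and when. The
following is only a minimal sketch in Python: the function name
find_inactive is made up for illustration, and the line layout it expects is
an assumption taken solely from the excerpt above, so the pattern may need
adjusting for other NetWorker versions.

```python
import re
from datetime import datetime

# Pattern for the savegrp inactivity lines as they appear in the excerpt
# above. The exact daemon.log layout is an assumption based on those lines.
INACTIVE_RE = re.compile(
    r"\* (?P<client>[^:\s]+):(?P<saveset>\S+) has been inactive for "
    r"(?P<minutes>\d+) minutes since "
    r"(?P<since>\w{3} \w{3} \d{1,2} \d{2}:\d{2}:\d{2} \d{4})"
)


def find_inactive(log_text):
    """Return (client, saveset, minutes, since) for each inactivity message."""
    # Collapse all whitespace so messages wrapped across lines still match.
    flat = " ".join(log_text.split())
    hits = []
    for m in INACTIVE_RE.finditer(flat):
        since = datetime.strptime(m.group("since"), "%a %b %d %H:%M:%S %Y")
        hits.append((m.group("client"), m.group("saveset"),
                     int(m.group("minutes")), since))
    return hits


# The message quoted above, including its line wrap.
sample = (
    "* bgnfs6:/var/ase/mnt/bgnfs6/home/bgnfs6/dr3 has been inactive for 63\n"
    "minutes since Thu Jan 29 15:07:22 1998.\n"
)
for client, saveset, minutes, since in find_inactive(sample):
    print(f"{client}:{saveset} idle for {minutes} min since {since}")
```

To run it against the real log, pass the contents of /nsr/logs/daemon.log to
find_inactive() instead of the sample string.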

The above snippet is from a failing save of one of the NFS services, located
on an HSZ40 RAID array with AdvFS. I have also tried checking all the
HSZ40/50's but there are no apparent errors there either.
The funny thing is that sometimes (yes, sometimes, not always) when I do an
nsr_shutdown, 'nsrmmd' just hangs and is impossible to kill. 'lsof' reports
that the nsrmmds are in CLOSE_WAIT, but the port numbers reported for the
client are nonexistent when checking on the client. In this situation a plain
'reboot' hangs the system. I have to hard-reset to get to console mode and
boot. I have tried forcing a crash dump, but nothing in the crash logs has
made me any wiser. Can you?
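
The CLOSE_WAIT check described above could be scripted so that the stuck
sockets and their client-side ports are listed in one go. This is a rough
sketch only: the helper name nsrmmd_close_wait is hypothetical, and it
assumes the usual 'lsof -i' layout in which the command name is the first
column, the PID the second, and each TCP line ends in
"local->remote (STATE)". The lsof build on a given system may print
something different.

```python
def nsrmmd_close_wait(lsof_text, command_prefix="nsrmmd"):
    """Return (pid, local, remote) for matching sockets in CLOSE_WAIT."""
    stuck = []
    for line in lsof_text.splitlines():
        fields = line.split()
        # Need at least COMMAND, PID, the NAME column and a trailing state.
        if len(fields) < 4 or not fields[0].startswith(command_prefix):
            continue
        if fields[-1] != "(CLOSE_WAIT)" or not fields[1].isdigit():
            continue
        # NAME looks like "localhost:port->remotehost:port" (assumed layout).
        local, arrow, remote = fields[-2].partition("->")
        if not arrow:
            continue  # no peer address, so not a connected socket
        stuck.append((int(fields[1]), local, remote))
    return stuck
```

Feeding it the saved output of 'lsof -i' gives the remote host:port pairs
that can then be compared with what the client itself reports.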
-- 
      ******************************************************************
      *         Knut Hellebø                     | DAMN GOOD COFFEE !! *
      *         Norsk Hydro a.s                  | (and hot too)       *
      * Phone: +47 55 996870, Fax: +47 55 996495 |                     *
      * Cellular Phone: +47 93092402             |                     *
      * E-mail: Knut.Hellebo_at_nho.hydro.com       | Dale Cooper, FBI    *
      ******************************************************************
Received on Fri Jan 30 1998 - 16:04:50 NZDT
