NetWorkers,
Please HELP me out on this!
We have two DEC 3000 servers configured in an ASE cluster (running 4.0B and
TruCluster 1.4A).
One of the ASE hosts runs NetWorker (DEC version 4.3-05/13) and is
responsible for backing up all the NFS services and other clients. The
problem is that the backup halts "from time to time". Yes, I know that such a
phrase is not very specific, but this is the very nature of the errors too.
I have checked with DIA (v2.6): no errors. I have checked with 'tapex' on
the DLT 877 autochanger in question (firmware DEC v6.0): no errors. I have
tried edX (one of DEC's tools, called Error Detection eXpert): no errors. I
have also tried the NetWorker tool tapexercise: of course, no errors. In
addition to this I have also tried rebooting both cluster servers. The
systems apparently run fine when no backup is running.
Pretty soon my head will fall off if I don't get any answers.
Here's a snippet from /nsr/logs/daemon.log showing what happens (these are the
"standard" "has been inactive for xx minutes since .." messages):
* bgnfs6:/var/ase/mnt/bgnfs6/home/bgnfs6/dr3 has been inactive for 63
minutes since Thu Jan 29 15:07:22 1998.
* bgnfs6:/var/ase/mnt/bgnfs6/home/bgnfs6/dr3 is being abandoned by asavegrp.
1/29/98 16:10:17 savegrp: bgnfs6:/var/ase/mnt/bgnfs6/home/bgnfs6/dr3 will
retry 0 more times
The above snippet is from a failing save of one of the NFS services, located
on an HSZ40 RAID array with AdvFS. I have also checked all the HSZ40/50s, but
there are no apparent errors there either.
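Since the halts only happen "from time to time", it may help to tally which
save sets hit the "has been inactive" message and how often. The following is
only a minimal sketch in Python: the pattern is based solely on the two
messages quoted above, and the function names (find_inactive,
count_by_saveset) are made up for illustration. Other NetWorker versions may
format the message differently.

```python
import re
from collections import Counter

# Pattern derived only from the messages quoted in this post; the exact
# layout in other NetWorker versions is an assumption.
INACTIVE_RE = re.compile(
    r"\*\s+(?P<saveset>\S+)\s+has been inactive for\s+(?P<minutes>\d+)\s+"
    r"minutes since\s+(?P<since>.+?\d{4})\."
)


def find_inactive(log_text):
    """Return a (saveset, minutes, since) tuple for each 'inactive' message."""
    # Collapse all whitespace so a message wrapped over two lines still matches.
    flat = " ".join(log_text.split())
    return [
        (m.group("saveset"), int(m.group("minutes")), m.group("since"))
        for m in INACTIVE_RE.finditer(flat)
    ]


def count_by_saveset(log_text):
    """Count how many 'inactive' messages each save set produced."""
    return Counter(saveset for saveset, _, _ in find_inactive(log_text))
```

Feeding the contents of /nsr/logs/daemon.log to count_by_saveset() would show
whether the timeouts cluster on one NFS service or are spread across all of
them.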
The funny thing is that sometimes (yes, sometimes, not always) when I do an
nsr_shutdown, 'nsrmmd' just hangs and is impossible to kill. 'lsof' reports
that the nsrmmds are in CLOSE_WAIT, but the port numbers reported for the
client are nonexistent when checking on the client. In this situation a plain
'reboot' hangs the system. I have to hard-reset to get to console mode and
boot. I have tried forcing a crash dump, but nothing in the crash logs has
made me any wiser. Can you?
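To compare the stuck sockets against what the client actually has open, the
CLOSE_WAIT entries can be pulled out of saved 'lsof -i' output. Again only a
sketch in Python: the helper name close_wait_sockets is made up, and it
assumes each lsof line starts with the command name and PID and ends with a
"local->remote" name followed by "(CLOSE_WAIT)". The column layout of the
lsof build in use may differ.

```python
def close_wait_sockets(lsof_text, command="nsrmmd"):
    """Return (pid, local, remote_host, remote_port) for CLOSE_WAIT sockets.

    Assumes whitespace-separated lsof lines whose first two fields are the
    command and PID, and whose last two fields are 'local->remote' and the
    TCP state in parentheses.
    """
    results = []
    for line in lsof_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != command:
            continue
        if fields[-1] != "(CLOSE_WAIT)" or "->" not in fields[-2]:
            continue
        local, remote = fields[-2].split("->", 1)
        # Split the remote endpoint at the last colon into host and port.
        remote_host, _, remote_port = remote.rpartition(":")
        results.append((int(fields[1]), local, remote_host, remote_port))
    return results
```

The remote ports returned here are the ones to look for on the client side.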
--
******************************************************************
* Knut Hellebø | DAMN GOOD COFFEE !! *
* Norsk Hydro a.s | (and hot too) *
* Phone: +47 55 996870, Fax: +47 55 996495 | *
* Cellular Phone: +47 93092402 | *
* E-mail: Knut.Hellebo_at_nho.hydro.com | Dale Cooper, FBI *
******************************************************************
Received on Fri Jan 30 1998 - 16:04:50 NZDT