DECnsr client exits on signal 11

From: Jeffrey S. Jewett <spider_at_umd5.umd.edu>
Date: Thu, 19 Nov 1998 09:54:55 -0500

We run a NetWorker 4.2.7 server on an UltraSPARC running Solaris 2.5.

One of our Digital (now Compaq) Alpha clients, running DU 4.0d,
recently increased the size of one of their AdvFS volumes from 1.5 GB
to 14 GB. We discovered this when the full backup of that volume
died. The symptoms as revealed on the server were simply the message:

        * client:/export/data/eos3 ! no output

indicating that the client save program had died. This occurred after
about 10 GB had been saved.

When we ran "savegrp -n -l full -c client unix", we got the same
message. There was no diagnostic information on the client indicating
a problem.

We determined that the client was running DECnsr version 4.2a, so we
requested that they upgrade to version 5.2 (BUILD 47), which they did.
They have also applied patch kit 2, which is rumored to address certain
AdvFS problems.

The full backup still dies, but with a slightly different error message.
Using "savegrp -v -n -l full -c client unix" gives (many extraneous
lines deleted):

        * client:/export/data/eos3 /export/data/eos3/.tags/
        * client:/export/data/eos3 /export/data/eos3/quota.user
        * client:/export/data/eos3 /export/data/eos3/quota.group
        * client:/export/data/eos3 /export/data/eos3/pickerin/precip/prat.ctl
        * client:/export/data/eos3 save: exited on signal 11

Signal 11 (SIGSEGV on DU systems) means that a segmentation fault has
occurred in the save process.
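
To double-check what a signal number means on a given host, a minimal
sketch like the following can help. It assumes Python is available on
the machine being checked; the helper name describe_signal is
hypothetical, and signal numbering is platform dependent, so the answer
only applies to the host where it is run.

```python
import signal


def describe_signal(signum: int) -> str:
    """Return the symbolic name of a signal number on the local platform."""
    try:
        # signal.Signals is an enum of the signals defined on this host.
        return signal.Signals(signum).name
    except ValueError:
        # The number is not a defined signal on this platform.
        return f"unknown signal {signum}"


if __name__ == "__main__":
    # 11 is the number reported by "save: exited on signal 11".
    print(11, describe_signal(11))
```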


What is also interesting is that we *can* perform a level 5 backup, which
saves about 13 GB (what was added/modified since the last successful
full backup, which occurred when the filesystem was 1.5 GB).

The number of files in the level 5 backup is about 5000, and the sysadm
of the client assures me that there are no files on eos3 larger than
2 GB, or anything else strange that he can perceive.
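
The "no files > 2 GB, nothing strange" claim can be checked
mechanically. The following is only a sketch, assuming Python is
available on the client; the function name find_suspect_files and the
choice of what counts as "strange" (oversized files, entries that
cannot be examined, and anything that is not a regular file or symlink)
are assumptions, not something NetWorker or DECnsr defines.

```python
import os
import stat

# Largest size that fits in a signed 32-bit file offset.
TWO_GB = 2**31 - 1


def find_suspect_files(root, size_limit=TWO_GB):
    """Return (path, reason) pairs for entries that look unusual under root."""
    suspects = []

    def on_error(err):
        # os.walk calls this for directories it cannot list.
        suspects.append((err.filename, "unreadable directory"))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # do not follow symlinks
            except OSError as err:
                suspects.append((path, f"lstat failed: {err.strerror}"))
                continue
            if stat.S_ISREG(info.st_mode):
                if info.st_size > size_limit:
                    suspects.append((path, f"size {info.st_size} exceeds limit"))
            elif not stat.S_ISLNK(info.st_mode):
                # Devices, FIFOs, sockets and similar special files.
                suspects.append((path, "not a regular file or symlink"))
    return suspects


if __name__ == "__main__":
    for path, reason in find_suspect_files("/export/data/eos3"):
        print(f"{path}: {reason}")
```

An empty result would support the sysadm's statement; it would not rule
out problems in file metadata that lstat does not expose.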

I suspect that there is something toxic on the filesystem that full
backups stumble across but level 5 backups do not. That would mean
NetWorker considers it unchanged since the last full backup, which
doesn't make a lot of sense given how much has been added since then.
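
One way to narrow the search is to list only the files that a full
backup would touch and a level 5 would skip. The sketch below assumes
that the level decision depends on whether a file's modification or
inode-change time is newer than the last successful full backup; that
is an assumption about how the client selects files, and the function
name unchanged_since and the command-line arguments are hypothetical.

```python
import os
import sys
import time


def unchanged_since(root, cutoff):
    """Return paths under root whose mtime and ctime both predate cutoff.

    cutoff is a time in seconds since the epoch.
    """
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # vanished or unreadable; not handled here
            # A file counts as unchanged only if neither timestamp is newer.
            if max(info.st_mtime, info.st_ctime) < cutoff:
                old.append(path)
    return sorted(old)


if __name__ == "__main__":
    # Usage: script.py ROOT YYYY-MM-DD  (date of the last good full backup)
    root, date_text = sys.argv[1], sys.argv[2]
    cutoff = time.mktime(time.strptime(date_text, "%Y-%m-%d"))
    for path in unchanged_since(root, cutoff):
        print(path)
```

Saving subsets of the resulting list one at a time might then isolate
whatever triggers the segmentation fault.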

Has anyone seen anything like this before, or have any ideas of
where to look next?

Thanks.

- jeff

---------------------------------------------------------------------------
Jeffrey S. Jewett                          Phone: 301/405-3054
Core Computing Services, AITS/OIT          Fax  : 301/314-9220
Rm 1311, Computer and Space Sciences Bldg  Email: spider_at_umd5.umd.edu
University of Maryland
College Park, MD 20742-2411
Received on Thu Nov 19 1998 - 14:55:48 NZDT
