SUMMARY: AdvFS Error: bs_osf_complete

From: Jason Carter (PE - CSD)
Date: Mon, 03 Apr 2000 08:50:02 +0800

Thanks to everyone who responded to my plea for help. The original message is
below, and the responses follow.
---------------------------------------------
"We are running DU 4.0E. The machine hung this morning, with the following
error in /usr/var/adm/messages. I saw similar errors when I searched the
archives but there appears to be no summary of a solution (perhaps there
isn't one). If anyone has resolved this problem in the past, I would be
grateful for your input.

Mar 28 07:35:26 ncalpha vmunix: AdvFS I/O error:
Mar 28 07:35:26 ncalpha vmunix: Volume: /dev/rz17a
Mar 28 07:35:26 ncalpha vmunix: Tag: 0xfffffffa.0000
Mar 28 07:35:26 ncalpha vmunix: Page: 3
Mar 28 07:35:26 ncalpha vmunix: Block: 8400
Mar 28 07:35:26 ncalpha vmunix: Block count: 16
Mar 28 07:35:26 ncalpha vmunix: Type of operation: Write
Mar 28 07:35:26 ncalpha vmunix: Error: 5
Mar 28 07:35:26 ncalpha vmunix: Tag: 0xfffffffa.0000
Mar 28 07:35:26 ncalpha vmunix: Page: 3
Mar 28 07:35:26 ncalpha vmunix: Block: 8400
Mar 28 07:35:26 ncalpha vmunix: Block count: 16
Mar 28 07:35:26 ncalpha vmunix: Type of operation: Write
Mar 28 07:35:26 ncalpha vmunix: Error: 5
Mar 28 07:35:26 ncalpha vmunix: ADVFS EXCEPTION
Mar 28 07:35:26 ncalpha vmunix: Module = msfs_config.c, Line = 505
Mar 28 07:35:26 ncalpha vmunix: bs_osf_complete: metadata write failed
Mar 28 07:35:26 ncalpha vmunix: device string for dump = SCSI 0 2009 0 1 100 0 0."
---------------------------------------------
From Gareth Dempsey:
What patch kit are you on?
We experienced this type of error for some time on two machines, an 8400 and
an ES40, both running 4.0E PK3.
We installed PK4 a week ago (which is supposed to include a fix for
these errors) and so far have had no problems.

I forgot to mention that it may be beneficial to run a verify on the domain
associated with /dev/rz17a,
e.g. unmount the file system and run /sbin/advfs/verify -f 'fdmn name'

From Bryan Lavelle:
I/O error 5's are a hardware problem. Something is wrong with your
underlying storage, rz17a.
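
As a quick cross-check of the "Error: 5" value in the log, the short Python
sketch below looks the number up in the errno table of whatever host runs it.
It assumes the AdvFS message reports a standard errno value; on common
Unix-like systems 5 maps to EIO, the generic I/O error. It is only an
illustration and was not part of any reply.

```python
import errno
import os

# The "Error: 5" value from the AdvFS I/O error lines in the log.
# Assumption: AdvFS reports a standard errno number here.
code = 5

# Look up the symbolic name and the host's description of the code.
name = errno.errorcode.get(code, "unknown")
print(code, name, os.strerror(code))
```

EIO only says that the I/O request failed; it does not say which component
failed, so the binary error log is still needed to find the cause.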

From Alex Gorbachev:
Run DECEvent (dia) to find out the exact reason for the error from the
binary errorlog. This could well be a h/w problem.

From alan_at_nabeth.cxo.dec.com:

        AdvFS tried to write to the indicated block of the indicated
        disk and it failed. Apparently it was metadata space, as
        shown by the bs_osf_complete part of the message. This
        means that the file system is now inconsistent, so AdvFS
        panicked the system. I/O failures are nearly always the
        result of a problem on the I/O subsystem: a disk block
        gone bad, a disk gone bad, etc. If possible, reboot the
        system and arrange for that file system not to be mounted at
        boot. Check the error log for errors on the device to see
        what the problem is and take appropriate corrective action
        (replace the block, replace the disk, etc.).

        Then see if you can mount the file system without the system
        crashing again.

        Now, if it happens that rz17a is the partition used for the
        root domain, the write failure could have been the result
        of a whole device or subsystem failure. Since the system
        dump space would be on the same disk (generally), the hang
        could be a sign that the whole device/subsystem had a
        problem.
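
The fields referred to above (volume, block, operation, error) can be pulled
out of /usr/var/adm/messages with a small script. The following is a minimal
Python sketch, assuming the syslog line layout shown in the original message;
the function name parse_advfs_io_errors and the FIELDS set are hypothetical
names chosen for illustration, not part of any AdvFS tool.

```python
import re

# Assumed syslog layout, taken from the lines quoted in the original message:
# "Mar 28 07:35:26 ncalpha vmunix: Volume: /dev/rz17a"
LINE_RE = re.compile(r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+vmunix:\s+(.*)$")

# Field names that appear under an "AdvFS I/O error:" header in that log.
FIELDS = {"Volume", "Tag", "Page", "Block", "Block count",
          "Type of operation", "Error"}


def parse_advfs_io_errors(lines):
    """Return one dict per 'AdvFS I/O error:' header found in lines."""
    records = []
    current = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        text = match.group(1)
        if text.startswith("AdvFS I/O error"):
            current = {}
            records.append(current)
            continue
        key, sep, value = text.partition(": ")
        if current is not None and sep and key in FIELDS:
            # Keep the first value if the kernel repeats a field.
            current.setdefault(key, value.strip())
    return records


SAMPLE = """\
Mar 28 07:35:26 ncalpha vmunix: AdvFS I/O error:
Mar 28 07:35:26 ncalpha vmunix: Volume: /dev/rz17a
Mar 28 07:35:26 ncalpha vmunix: Tag: 0xfffffffa.0000
Mar 28 07:35:26 ncalpha vmunix: Page: 3
Mar 28 07:35:26 ncalpha vmunix: Block: 8400
Mar 28 07:35:26 ncalpha vmunix: Block count: 16
Mar 28 07:35:26 ncalpha vmunix: Type of operation: Write
Mar 28 07:35:26 ncalpha vmunix: Error: 5
Mar 28 07:35:26 ncalpha vmunix: ADVFS EXCEPTION
Mar 28 07:35:26 ncalpha vmunix: bs_osf_complete: metadata write failed
""".splitlines()

for record in parse_advfs_io_errors(SAMPLE):
    print(record)
```

Collecting the volume and block numbers this way makes it easier to match
them against entries for the same device in the binary error log.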
---------------------------------------------
Everything seems to be pointing to hardware. I have no resolution as yet but
have some great info to start with. Thanks guys!

Jason Carter
Network Administrator
Newcrest Mining Ltd
Ph: +61 8 9270 7093
Mb: 041 993 8578
Fx: +61 8 9270 7150
Received on Mon Apr 03 2000 - 00:48:51 NZST
