AdvFS inconsistencies: how do I fix them?

From: Markus Baur <mbaur_at_ira.uka.de>
Date: Mon, 29 Jan 96 16:30:50 +0100

We have an HSZ40 RAID array connected to two sables equipped with KZPSA
SCSI controllers. Both servers are running DU 3.2C with ASE 125 and the
AdvFS patch 'OSF350-070'.

Now I have an AdvFS inconsistency on one of the domains.

Does anybody know how such a filesystem inconsistency can arise? There was
apparently no crash, and I thought AdvFS was safer than ufs. If my bad
experiences with AdvFS (lots of kernel panics, inconsistencies...) continue,
I will have to consider switching our ~100 GB back to ufs. I hate to do that,
but for us stability is more important than flexibility. I had hoped AdvFS
would give us both, but that seems to have been an empty promise from the DEC
marketing people.

The inconsistency produces messages like the following when accessing certain
directories:

Jan 29 15:52:17 speech2 vmunix: advfs I/O error: setId 0x30cdebda.0000b384.fffffffe.0000 tag 0xfffffffa.0000u page 85
Jan 29 15:52:17 speech2 vmunix: vd 1 blk 7408 blkCnt 16
Jan 29 15:52:17 speech2 vmunix: read error = 5
Jan 29 15:52:17 speech2 vmunix: advfs I/O error: setId 0x30cdebda.0000b384.fffffffe.0000 tag 0xfffffffa.0000u page 131
Jan 29 15:52:17 speech2 vmunix: vd 1 blk 1101248 blkCnt 16
Jan 29 15:52:17 speech2 vmunix: read error = 5
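
To keep track of which extents are failing, messages like these can be
collected into one record per error. What follows is only a minimal Python
sketch, assuming the three-line layout seen in the excerpt (header line,
"vd/blk/blkCnt" line, "error =" line). The function name parse_advfs_errors
and the dictionary keys are made up for illustration and are not part of
AdvFS or any DEC tool.

```python
import re

# Sample records copied from the syslog excerpt, with the setId that the
# mailer wrapped across two lines rejoined onto one line.
LOG = """\
Jan 29 15:52:17 speech2 vmunix: advfs I/O error: setId 0x30cdebda.0000b384.fffffffe.0000 tag 0xfffffffa.0000u page 85
Jan 29 15:52:17 speech2 vmunix: vd 1 blk 7408 blkCnt 16
Jan 29 15:52:17 speech2 vmunix: read error = 5
Jan 29 15:52:17 speech2 vmunix: advfs I/O error: setId 0x30cdebda.0000b384.fffffffe.0000 tag 0xfffffffa.0000u page 131
Jan 29 15:52:17 speech2 vmunix: vd 1 blk 1101248 blkCnt 16
Jan 29 15:52:17 speech2 vmunix: read error = 5
"""

# Patterns are inferred from the excerpt only; other message variants
# may exist and would need their own patterns.
HEADER = re.compile(r"advfs I/O error: setId (\S+) tag (\S+) page (\d+)")
EXTENT = re.compile(r"vd (\d+) blk (\d+) blkCnt (\d+)")
# Only "read error" appears in the excerpt; \w+ is a guess that other
# operations would be reported in the same shape.
ERRNO = re.compile(r"(\w+) error = (\d+)")


def parse_advfs_errors(text):
    """Group consecutive syslog lines into one dict per AdvFS I/O error."""
    records = []
    current = None
    for line in text.splitlines():
        m = HEADER.search(line)
        if m:
            # A header line starts a new record.
            current = {"set_id": m.group(1), "tag": m.group(2),
                       "page": int(m.group(3))}
            records.append(current)
            continue
        if current is None:
            continue  # ignore lines before the first header
        m = EXTENT.search(line)
        if m:
            current.update(vd=int(m.group(1)), blk=int(m.group(2)),
                           blk_cnt=int(m.group(3)))
            continue
        m = ERRNO.search(line)
        if m:
            current.update(op=m.group(1), errno=int(m.group(2)))
    return records


errors = parse_advfs_errors(LOG)
for e in errors:
    print("vd %(vd)d blk %(blk)d (+%(blk_cnt)d) page %(page)d: "
          "%(op)s errno %(errno)d" % e)
```

Both sample records resolve to vd 1 and errno 5 (I/O error), which matches
the volume where msfsck fails to read the bmt.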


My question is: How can I fix such inconsistencies while the machine is
running? Rebooting is not a good solution for these machines because they
run jobs which take DAYS to produce results and their load rarely falls
below 4.

I already tried 'msfsck' which reports an error but doesn't fix it:

Checking disks...
check_disk: checking storage allocated on disk /dev/rza26g
check_disk: bad bmt read
    errno: 5, I/O error

check_disk: checking storage allocated on disk /dev/rza26d

check_disk: checking storage allocated on disk /dev/rza26e

check_disk: checking storage allocated on disk /dev/rza26f


Checking mcell list...

Checking mcell position field...

Checking tag directories...
check_tagdir_page: in-use on-disk tag does not have matching in-mem tag hdr
    set tag: 1.32769 (0x00000001.0x00008001)
    tag: 114367.32771 (0x0001bebf.0x00008003)
    tag directory page: 111
    tag map entry: 925
    seqNo: 0x8003 (32771)
    vdIndex: 1
    bfMCId (page.cell): 191.27

[...]

Unmounting and remounting reports no problems, but it doesn't fix my
problem either.

Any hint is appreciated.

- Markus
Received on Mon Jan 29 1996 - 16:58:59 NZDT
