SUMMARY: AdvFS I/O Error - Hardware or software?

From: Gavin Kreuiter <gavink_at_ust.co.za>
Date: Tue, 06 Apr 1999 13:00:09 +0200

Hi Managers

Thanks to replies from John Speno [speno_at_isc.upenn.edu] and
alan_at_nabeth.cxo.dec.com. Consensus was a hard error, as confirmed by
uerf with -o full [Thanks Alan]:

----- CAM STRING -----
ERROR TYPE Hard Error Detected
MEDIUM ERROR - Nonrecoverable medium_error

Will be replacing the disk tomorrow.

Original question:

> I have seen a few questions like mine in the archives, but cannot find a
> satisfactory solution. While running defrag, the following messages are
> reported:
>
> Apr 1 01:08:27 Switch1 vmunix: AdvFS I/O error:
> Apr 1 01:08:27 Switch1 vmunix: Domain#Fileset: spare_dmn#spare
> Apr 1 01:08:27 Switch1 vmunix: Mounted on: /usr/home
> Apr 1 01:08:27 Switch1 vmunix: Volume: /dev/rz18c
> Apr 1 01:08:27 Switch1 vmunix: Tag: 0x00000001.8001
> Apr 1 01:08:27 Switch1 vmunix: Page: 6496
> Apr 1 01:08:27 Switch1 vmunix: Block: 4608544
> Apr 1 01:08:27 Switch1 vmunix: Block count: 128
> Apr 1 01:08:27 Switch1 vmunix: Type of operation: Read
> Apr 1 01:08:27 Switch1 vmunix: Error: 5
>
> The same block has been reported consistently over the last few days,
> always while defragment is running.
>
> Is this hardware or software corruption? Is it something that should be
> fixed?
>
> verify reports NO errors

Alan's reply:
Almost certainly hardware. You can check the error log to be sure. For
V4 systems, you should use DECevent to format the error log, but if you
use uerf(8), use the option "-o full". The good news is that the bad
block is probably in a file. There is a utility to convert the tag
number to a file name. I don't recall the name, but it will have the
word "tag" in it. Verify probably doesn't find it because it probably
doesn't read file data.

Once you know the file name, you can figure out how to restore it;
from backup, regenerate it from source, etc. Simply overwriting it
with a good copy may be enough to fix the error. Once the driver
has a good copy of data to write to it, it may perform the replacement
automatically. If not, you can force a replacement with scu(8)
for SCSI disks and radisk(8) for MSCP disks. For controllers of
"re" disks check their documentation to see how to replace bad blocks.
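
To check that the same block really does keep recurring, the kernel
messages quoted in the original question could be parsed with a short
Python script. This is only a sketch: the function names are made up for
illustration, the field layout is assumed from the single sample in this
post, and it does not replace uerf or DECevent for reading the error log.

```python
import re
from collections import Counter

# The one AdvFS error record quoted in this post (quote markers removed).
SAMPLE_LOG = """\
Apr 1 01:08:27 Switch1 vmunix: AdvFS I/O error:
Apr 1 01:08:27 Switch1 vmunix: Domain#Fileset: spare_dmn#spare
Apr 1 01:08:27 Switch1 vmunix: Mounted on: /usr/home
Apr 1 01:08:27 Switch1 vmunix: Volume: /dev/rz18c
Apr 1 01:08:27 Switch1 vmunix: Tag: 0x00000001.8001
Apr 1 01:08:27 Switch1 vmunix: Page: 6496
Apr 1 01:08:27 Switch1 vmunix: Block: 4608544
Apr 1 01:08:27 Switch1 vmunix: Block count: 128
Apr 1 01:08:27 Switch1 vmunix: Type of operation: Read
Apr 1 01:08:27 Switch1 vmunix: Error: 5
"""

# Assumption: every line of a record looks like "... vmunix: <key>: <value>".
FIELD_RE = re.compile(r"vmunix:\s+(?P<key>[^:]+):\s*(?P<value>.*)$")


def parse_advfs_errors(text):
    """Return one dict per 'AdvFS I/O error' record found in text."""
    records = []
    current = None
    for line in text.splitlines():
        match = FIELD_RE.search(line)
        if not match:
            continue
        key = match.group("key").strip()
        value = match.group("value").strip()
        if key == "AdvFS I/O error":
            # Header line: start a new record.
            current = {}
            records.append(current)
        elif current is not None:
            current[key] = value
    return records


def block_range(record):
    """First and last block covered by the failed transfer (inclusive)."""
    first = int(record["Block"])
    count = int(record["Block count"])
    return first, first + count - 1


def repeat_counts(records):
    """Count how often each (volume, starting block) pair is reported."""
    return Counter((r["Volume"], int(r["Block"])) for r in records)


if __name__ == "__main__":
    records = parse_advfs_errors(SAMPLE_LOG)
    for (volume, block), hits in repeat_counts(records).items():
        print(volume, block, hits)
    print(block_range(records[0]))
```

Fed several days of messages, a (volume, block) pair that shows up on
every defragment run points at one fixed spot on the disk, which fits
the hard error that uerf reported.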
------------------------------------------------------------------------
Gavin Kreuiter gavink_at_ust.co.za
Unihold Technologies www.ust.co.za
(W) +27 (11) 709-7004
(F) +27 (11) 709-1010
Received on Tue Apr 06 1999 - 11:16:53 NZST
