SUM: Disk error message during dump from Colin Brooks on 1997-11-25 (tru64-unix-managers)

From: Colin Brooks <cbrooks_at_nature.berkeley.edu>
Date: Mon, 24 Nov 1997 17:44:41 -0800

Hi folks. I got 2 replies to my message last week about disk error messages
during dump. Basically, I was warned that I have bad blocks on my hard
disk and that I should be concerned. My hard disk now seems fine, but I
appreciate the hints on making sure this is true.

Susan Rodriguez said:
You may have a corrupted disk. Unmount your filesystem and run fsck with
the -o flag to force it. It should try to repair any damage.
You can also run the scu utility:
#scu -f /dev/rrz6e
scu> verify media
If you fail these tests, you have bad sectors on the disk.

Alan Rollow said:
You have a small number of bad blocks on the disk. You should
be concerned. Once upon a time flurries of bad blocks were an
indication that disk might be going bad. Today, this seems
less true, but having a bad block means that the data in the block
was unreadable.

The appropriate thing to do (for ordinary SCSI disks) is use
scu(8) to reassign the block, then use the block number to
backtrack to the file. For UFS you can do this with icheck
and ncheck. Once you know what file is corrupt, you can
restore it from a previous backup.

I don't know what tools are available through the "re" disk
driver to fix bad blocks.

The HSZ SCSI controllers don't implement the "Reassign LBA"
command, so you have to track down the file, delete it and
restore it from backup. Writing good data to the block will
cause the controller to fix it.

I believe the SCSI disk driver will try fairly hard to read an
apparent bad block and if it gets a good copy of the data
will do the reassign itself and write the data back. The
driver reassign support may do the same thing, or it may
just pass the command through.

Many drives will do automatic revectoring of bad blocks, but
they have no way to tell the host whether the data in them
is good or bad. Hosts generally prefer that such support
be disabled, but I'm told they don't go out of their way to
do so.

Looking at the system error log with uerf(8) or dia(8) will
offer more information about what the driver did. If using
uerf(8) use the "-o full" to get all the device error data.

*********************
My original question:
When checking the mail that the "at" command sends me after completing my
scheduled system dumps, I found the following for my second disk/filesystem:

dump: Mapping (Pass I) [regular files]
dump: Bad read from input disk file: /dev/rrz6e, block number: 173584, bytes
wanted: 8192, bytes got: -1
bread(): read(): I/O error
dump: Bad read from input disk file: /dev/rrz6e, block number: 208320, bytes
wanted: 8192, bytes got: -1
bread(): read(): I/O error
dump: Mapping (Pass II) [directories]

(The complete message from this part of the dump is below). Should I be
concerned? Any idea what is going on here? The error message did not
appear during the next dump of the same filesystem. Anything I can or
should do to fix this?
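
Since the replies suggest starting from the reported block numbers, here is a
minimal sketch of pulling them out of the dump mail. It is only an
illustration: the names DUMP_MAIL, BAD_READ and bad_reads are made up for this
example, and the message layout is assumed to match the lines quoted above,
including the wrap after "bytes". It does not interpret the units of the
block numbers or run any disk utility.

```python
import re
from collections import defaultdict

# Sample text copied from the dump mail quoted above, line wrapping preserved.
DUMP_MAIL = """\
dump: Mapping (Pass I) [regular files]
dump: Bad read from input disk file: /dev/rrz6e, block number: 173584, bytes
wanted: 8192, bytes got: -1
bread(): read(): I/O error
dump: Bad read from input disk file: /dev/rrz6e, block number: 208320, bytes
wanted: 8192, bytes got: -1
bread(): read(): I/O error
dump: Mapping (Pass II) [directories]
"""

# Pattern for one "Bad read" message after line wraps have been flattened.
# Assumption: the wording is exactly as in the mail above.
BAD_READ = re.compile(
    r"Bad read from input disk file:\s*(?P<device>[^\s,]+),\s*"
    r"block number:\s*(?P<block>\d+),\s*"
    r"bytes\s+wanted:\s*(?P<wanted>\d+),\s*"
    r"bytes got:\s*(?P<got>-?\d+)"
)


def bad_reads(mail_text):
    """Return {device: sorted unique block numbers} for every bad read."""
    # Collapse all whitespace so messages wrapped by the mailer still match.
    flat = " ".join(mail_text.split())
    by_device = defaultdict(list)
    for match in BAD_READ.finditer(flat):
        by_device[match.group("device")].append(int(match.group("block")))
    return {dev: sorted(set(blocks)) for dev, blocks in by_device.items()}


for device, blocks in bad_reads(DUMP_MAIL).items():
    print(device, blocks)
```

Keeping the list per device makes it easy to compare one night's dump mail
with the next and see whether the same blocks keep failing.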

*********************
Thanks for everyone's help,
Colin

****************************************************
Colin Brooks
GIS Analyst
Integrated Hardwood Range Management Program
Hopland Research & Extension Center
4070 University Rd.
Hopland, CA 95449
Primary #'s - TEL:(707)744-1270 FAX:(707)744-1040
Work E-mail: cbrooks_at_nature.berkeley.edu
http://danr.ucop.edu/ihrmp
http://www.pacific.net/~cbrooks/gis1.shtml
I'm also found at:
ESPM, University of California - Berkeley
160 Mulford Hall
Berkeley, CA 94720-3114
Secondary #'s - TEL: 510-643-1136 FAX: 510-643-5438
****************************************************
Received on Tue Nov 25 1997 - 03:07:51 NZDT
