Many thanks to all the helpful folks on the list, including:
Alan Rollow
Dr Thomas Blinn
Hines, Bruce D
Brusche, Johan
Chris Bryant
No doubt others of you replied as well, but I haven't seen those messages:
the affected host is also our mail server, and I've had it down today
while moving the filesets to other disks.
The question was:
I'm having problems on a fileserver running Digital Unix 4.0E on an
AlphaServer 2100. They started Sunday in the early morning hours and
happened again twice Monday night/Tuesday morning. There are many errors
similar to the following:
vmunix: AdvFS I/O error:
vmunix: Volume: /dev/rz16c
vmunix: Tag: 0xfffffffa.0000
vmunix: Page: 8404
vmunix: Block: 147088
vmunix: Block count: 16
vmunix: Type of operation: Read
vmunix: Error: 5
The volume, page and block vary but repeat; all report error 5 and nearly
all of the operations are reads. The three volumes are members of the same
domain (but all domains effectively become inaccessible).
Is this likely a disk hardware problem? Perhaps the domain needs to be
rebuilt? Any advice welcome.
Use of the commands
/sbin/advfs/verify
uerf -R -r 199 -o full -f /var/adm/binary.errlog > errs.txt
was most helpful. verify showed errors on 2 of about 20 filesets in
the domain, but it could not repair them. The uerf command
showed a large number of soft errors on one volume in the domain, and
a handful of hard errors. The disk is definitely failing. Things
were still stable enough to copy the filesets to new domains on newer
drives with vdump/vrestore. About 3 SCSI errors were reported during
the copy of ~53 GB.
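
For anyone wanting a quick look at which volume is producing most of
the AdvFS messages before digging into uerf output, here is a rough
sketch in Python. It is not something I ran on the server; it assumes
each record starts with an "AdvFS I/O error" line and ends with an
"Error:" line, as in the excerpt quoted above. The function names
(parse_advfs_errors, tally) are made up for illustration, and the
sample text is just the one record from the question.

```python
import re
from collections import Counter

# One record copied from the question; a real run would read the
# syslog file instead (its path is site-specific, so it is not guessed here).
SAMPLE = """\
vmunix: AdvFS I/O error:
vmunix: Volume: /dev/rz16c
vmunix: Tag: 0xfffffffa.0000
vmunix: Page: 8404
vmunix: Block: 147088
vmunix: Block count: 16
vmunix: Type of operation: Read
vmunix: Error: 5
"""

# Matches "vmunix: <key>: <value>", with or without a syslog prefix in front.
FIELD = re.compile(r"vmunix:\s+(?P<key>[^:]+):\s*(?P<value>.*)$")


def parse_advfs_errors(text):
    """Return one dict per AdvFS I/O error record found in text."""
    records = []
    current = None
    for line in text.splitlines():
        if "AdvFS I/O error" in line:
            # Start of a new record.
            current = {}
            records.append(current)
            continue
        match = FIELD.search(line)
        if match and current is not None:
            key = match.group("key").strip()
            current[key] = match.group("value").strip()
            if key == "Error":
                # Assumed to be the last line of a record.
                current = None
    return records


def tally(records):
    """Count records per volume and per operation type."""
    by_volume = Counter(r.get("Volume", "?") for r in records)
    by_operation = Counter(r.get("Type of operation", "?") for r in records)
    return by_volume, by_operation


if __name__ == "__main__":
    volumes, operations = tally(parse_advfs_errors(SAMPLE))
    for name, count in volumes.most_common():
        print(name, count)
    for name, count in operations.most_common():
        print(name, count)
```

A volume that dominates the count is only a hint about where to look;
the uerf report is what actually showed the soft and hard errors here.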
==BD
Received on Tue Apr 08 2003 - 23:05:22 NZST