I got one response, from Dr Tom Blinn. His reply follows my original
note below.
The EMC engineer insisted that there were no hardware errors on their box,
so at the very least I decided to install PK5, which I know has some patches
for handling metadata mcell problems and the like.
I don't know if EMC disks are worth the hassle, as there are no effective
utilities for mapping what the EMC box sees to what the UNIX side sees.
Also, as Tom points out, it is UNSUPPORTED. As I inherited the system, I
have to live with it, though.
I wrote
=======
> Hi all,
>
> I received the following error on an Alpha Server 4100 running Tru64 4.0D
> patch kit 3: -
>
> Jan 21 01:45:15 pmrsvr01 vmunix: AdvFS I/O error:
> Jan 21 01:45:15 pmrsvr01 vmunix: Volume: /dev/rz27c
> Jan 21 01:45:16 pmrsvr01 vmunix: Tag: 0xfffffff7.0000
> Jan 21 01:45:16 pmrsvr01 vmunix: Page: 169
> Jan 21 01:45:16 pmrsvr01 vmunix: Block: 4976
> Jan 21 01:45:16 pmrsvr01 vmunix: Block count: 16
> Jan 21 01:45:16 pmrsvr01 vmunix: Type of operation: Write
> Jan 21 01:45:16 pmrsvr01 vmunix: Error: 5
>
> This failed the domain that the above volume was in and I could not get it
> back using either verify or salvage. I had to destroy and recreate it.
>
> I have since installed patch kit 5 but never got a really satisfactory
> answer as to whether there was a specific patch for the above in patch kit
> 5. The volume in question is an EMC storage disk and I have been advised
> that the disk itself has no I/O errors so it appears to be a software/AdvFS
> related problem.
>
> So... can anyone out there tell me if there are any specific patches in PK5
> for this error, or is it in one of the 'various fixes' catch-all patches?
>
> Regards,
>
> Dave Stapleton
> PS INFORMATION RESOURCE IRL LTD
Dr Tom Blinn wrote
===========================================
Excuse me? If you examine /usr/include/errno.h you will see this line:
#define EIO 5 /* I/O error */
That's what "vmunix: Error: 5" means. The AdvFS file system code was
trying to do a write to Block 4976 for a Block count of 16 and it got
an I/O error reported by the SCSI subsystem layer. That's a HARDWARE
problem. Hardware failures on writes will ALWAYS fail the domain, and
depending on what data could not be written (probably some kind of file
system metadata), you may not be able to recover the domain.
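To illustrate the lookup described above, here is a minimal Python sketch
that pulls the fields out of the syslog excerpt and maps the numeric error
to its symbolic name. The function names and the regular expression are
assumptions made for illustration only. The errno table consulted is that
of the machine running the script, which need not match Tru64's errno.h,
although EIO is 5 on common UNIX-like systems.

```python
import errno
import os
import re

# Log lines copied from the syslog excerpt in the original note.
SAMPLE_LOG = """\
Jan 21 01:45:15 pmrsvr01 vmunix: AdvFS I/O error:
Jan 21 01:45:15 pmrsvr01 vmunix: Volume: /dev/rz27c
Jan 21 01:45:16 pmrsvr01 vmunix: Tag: 0xfffffff7.0000
Jan 21 01:45:16 pmrsvr01 vmunix: Page: 169
Jan 21 01:45:16 pmrsvr01 vmunix: Block: 4976
Jan 21 01:45:16 pmrsvr01 vmunix: Block count: 16
Jan 21 01:45:16 pmrsvr01 vmunix: Type of operation: Write
Jan 21 01:45:16 pmrsvr01 vmunix: Error: 5
"""

# Assumed line layout: "<timestamp> <host> vmunix: <Field>: <value>".
# The header line ("AdvFS I/O error:") has no value and is skipped.
FIELD_RE = re.compile(r"vmunix:\s+(?P<key>[^:]+):\s*(?P<value>\S.*)$")


def parse_advfs_error(text):
    """Return the 'Field: value' pairs of an AdvFS error report as a dict."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.search(line)
        if match:
            fields[match.group("key").strip()] = match.group("value").strip()
    return fields


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this host."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    report = parse_advfs_error(SAMPLE_LOG)
    name, message = describe_errno(int(report["Error"]))
    print("%s %s on %s, block %s (count %s): %s (%s)" % (
        report["Type of operation"], "failed", report["Volume"],
        report["Block"], report["Block count"], name, message))
```

The symbolic name only tells you that the kernel was handed an I/O error;
it says nothing about which layer of the storage path produced it.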
No software patch is going to fix this, unless it just happens to make
your UNSUPPORTED EMC storage hardware work more reliably. I don't know
who advised you that "the disk itself has no I/O errors" or what might
have been the basis for that conclusion, but an I/O error on /dev/rz27c
was what blew away your AdvFS domain. The only way to tell much about
this is to examine the binary error log for the period of time when you
were hit by the error; the text log you reported for the console shows
the AdvFS symptoms but they were caused by a hardware glitch.
Tom
Dr. Thomas P. Blinn + UNIX Software Group + Compaq Computer Corporation
110 Spit Brook Road, MS ZKO3-2/W17 Nashua, New Hampshire 03062-2698
Technology Partnership Engineering Phone: (603) 884-0646
Internet: tpb_at_zk3.dec.com Digital's Easynet: alpha::tpb
ACM Member: tpblinn_at_acm.org PC_at_Home: tom_at_felines.mv.net
Worry kills more people than work because more people worry than work.
Keep your stick on the ice. -- Steve Smith ("Red Green")
My favorite palindrome is: Satan, oscillate my metallic sonatas.
-- Phil Agre, pagre_at_alpha.oac.ucla.edu
Yesterday it worked / Today it is not working / UNIX is like that
-- apologies to Margaret Segall
Opinions expressed herein are my own, and do not necessarily represent
those of my employer or anyone else, living or dead, real or imagined.
Regards,
Dave Stapleton
PS INFORMATION RESOURCE IRL LTD
E-MAIL - dave.stapleton_at_psir.ie
Tel. : +(353) 1 217 7000
Received on Mon Jan 24 2000 - 16:45:14 NZDT