Further to my earlier message, I have added some additional information re
DECevent analysis. Please discard the earlier message and use this instead.
Many thanks
Tony
------------------- updated message follows -------------------
Hi all... I have a 4100 running Digital UNIX 4.0D + jumbo 2.
I have seen the following in the messages file. Having checked the console
log, there are a lot of these over the past couple of days. This is an
extract only:
Apr 21 02:32:45 montst1 vmunix: AdvFS I/O error:
Apr 21 02:32:45 montst1 vmunix: Domain#Fileset: vol18#db
Apr 21 02:32:45 montst1 vmunix: Mounted on: /u18
Apr 21 02:32:45 montst1 vmunix: Volume: /dev/vol/oracle/vol18
Apr 21 02:32:45 montst1 vmunix: Tag: 0x00000015.8001
Apr 21 02:32:46 montst1 vmunix: Page: 0
Apr 21 02:32:46 montst1 vmunix: Block: 3289040
Apr 21 02:32:46 montst1 vmunix: Block count: 16
Apr 21 02:32:46 montst1 vmunix: Type of operation: Write
Apr 21 02:32:46 montst1 vmunix: Error: 5
Apr 21 02:32:46 montst1 vmunix: To obtain the name of the file on which
Apr 21 02:32:46 montst1 vmunix: the error occurred, type the command:
Apr 21 02:32:46 montst1 vmunix: /sbin/advfs/tag2name /u18/.tags/21
Apr 21 02:32:47 montst1 vmunix: io/vol.c(volerror): Uncorrectable write error on volume vol18, plex vol18-01, block 7184
Apr 21 02:32:47 montst1 vmunix: AdvFS I/O error:
Apr 21 02:32:47 montst1 vmunix: Volume: /dev/vol/oracle/vol18
Apr 21 02:32:47 montst1 vmunix: Tag: 0xfffffff7.0000
Apr 21 02:32:47 montst1 vmunix: Page: 306
Apr 21 02:32:47 montst1 vmunix: Block: 7184
Apr 21 02:32:48 montst1 vmunix: Block count: 16
Apr 21 02:32:48 montst1 vmunix: Type of operation: Write
Apr 21 02:32:48 montst1 vmunix: Error: 5
Apr 21 02:32:48 montst1 vmunix:
Apr 21 02:32:48 montst1 vmunix: bs_osf_complete: metadata write failed
Apr 21 02:32:48 montst1 vmunix: AdvFS Domain Panic; Domain vol18 Id 0x3625faef.0008b366
Apr 21 02:32:48 montst1 vmunix: An AdvFS domain panic has occurred due to either a metadata write error or an internal inconsistency. This domain is being rendered inaccessible.
Apr 21 02:32:48 montst1 vmunix: Please refer to guidelines in AdvFS Guide to File System Administration regarding what steps to take to recover this domain.
Apr 21 02:32:48 montst1 vmunix: AdvFS I/O error:
Apr 21 02:32:48 montst1 vmunix: Volume: /dev/vol/oracle/vol18
Apr 21 02:32:48 montst1 vmunix: Tag: 0xfffffffa.0000
Apr 21 02:32:48 montst1 vmunix: Page: 1
Apr 21 02:32:48 montst1 vmunix: Block: 48
Apr 21 02:32:48 montst1 vmunix: Block count: 16
Apr 21 02:32:48 montst1 vmunix: Type of operation: Read
Apr 21 02:32:48 montst1 vmunix: Error: 5
Apr 21 04:00:01 montst1 vmunix: AdvFS I/O error:
Apr 21 04:00:02 montst1 vmunix: Volume: /dev/vol/oracle/vol20
Apr 21 04:00:02 montst1 vmunix: Tag: 0xfffffffa.0000
Apr 21 04:00:02 montst1 vmunix: Page: 1
Apr 21 04:00:02 montst1 vmunix: Block: 48
Apr 21 04:00:02 montst1 vmunix: Block count: 16
Apr 21 04:00:02 montst1 vmunix: Type of operation: Read
Apr 21 04:00:02 montst1 vmunix: Error: 5
Apr 21 04:00:02 montst1 vmunix: AdvFS I/O error:
Apr 21 04:00:02 montst1 vmunix: Volume: /dev/vol/oracle/vol20
Apr 21 04:00:02 montst1 vmunix: Tag: 0xfffffffa.0000
Apr 21 04:00:02 montst1 vmunix: Page: 1
Apr 21 04:00:02 montst1 vmunix: Block: 48
Apr 21 04:00:02 montst1 vmunix: Block count: 16
Apr 21 04:00:02 montst1 vmunix: Type of operation: Read
Apr 21 04:00:02 montst1 vmunix: Error: 5
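With this many entries it can help to tally the AdvFS I/O error records per volume and operation. The following is only a rough sketch, assuming the messages file keeps the layout shown in the extract above; the function name `summarize_advfs_errors` and the trimmed `SAMPLE` text are illustrative, not part of any Tru64 tool.

```python
import re
from collections import Counter

# Fields of interest in the syslog payload that follows "vmunix: ".
FIELD = re.compile(r"vmunix:\s*(Volume|Type of operation|Error):\s*(\S+)")

def summarize_advfs_errors(lines):
    """Count AdvFS I/O error records per (volume, operation, error code)."""
    counts = Counter()
    record = None  # fields of the record currently being read, if any
    for line in lines:
        if "AdvFS I/O error:" in line:
            record = {}  # a new record starts here
            continue
        if record is None:
            continue
        match = FIELD.search(line)
        if not match:
            continue
        record[match.group(1)] = match.group(2)
        if match.group(1) == "Error":  # "Error:" is the last field of a record
            key = (record.get("Volume"),
                   record.get("Type of operation"),
                   record["Error"])
            counts[key] += 1
            record = None
    return counts

# Two records trimmed from the extract above (some fields omitted for brevity).
SAMPLE = """\
Apr 21 02:32:45 montst1 vmunix: AdvFS I/O error:
Apr 21 02:32:45 montst1 vmunix: Volume: /dev/vol/oracle/vol18
Apr 21 02:32:46 montst1 vmunix: Type of operation: Write
Apr 21 02:32:46 montst1 vmunix: Error: 5
Apr 21 04:00:01 montst1 vmunix: AdvFS I/O error:
Apr 21 04:00:02 montst1 vmunix: Volume: /dev/vol/oracle/vol20
Apr 21 04:00:02 montst1 vmunix: Type of operation: Read
Apr 21 04:00:02 montst1 vmunix: Error: 5
"""

if __name__ == "__main__":
    for key, n in summarize_advfs_errors(SAMPLE.splitlines()).items():
        print(key, n)
```

To run it against the real file, pass an open file object (for example the messages file) instead of `SAMPLE.splitlines()`.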
This machine crashed the other day. Listed below is a console log extract
of the point in time of the crash:
07:19:17 # AdvFS I/O error:
07:19:30 Volume: /dev/vol/oracle/vol18
07:19:30 Tag: 0xfffffff7.0000
07:19:30 Page: 277
07:19:31 Block: 6720
07:19:31 Block count: 16
07:19:31 Type of operation: Read
07:19:31 Error: 5
07:19:31 ADVFS EXCEPTION
07:19:31 Module = ms_logger.c, Line = 3628
07:19:31 log_flush_sync: pinpg error
07:19:31 N1 = 5
07:19:31 panic (cpu 0): log_flush_sync: pinpg error
07:19:31 N1 = 5
07:19:31 syncing disks... 4 4
07:19:33
07:19:33 LSM attempting to dump to SCSI device unit number rz8
07:19:33 DUMP: 15758112 blocks available for dumping.
07:19:33 DUMP: 81954 wanted for a partial compressed dump.
07:19:33 DUMP: Allowing 7874958 of the 7879054 available on 0x804001
07:19:33 device string for dump = SCSI 1 3 0 0 0 0 0.
07:19:33 DUMP.prom: dev SCSI 1 3 0 0 0 0 0, block 500000
07:19:33 DUMP: Header to 0x804001 at 7879054 (0x78398e)
07:19:53 device string for dump = SCSI 1 3 0 0 0 0 0.
07:19:53 DUMP.prom: dev SCSI 1 3 0 0 0 0 0, block 500000
07:19:53 DUMP: Dump to 0x804001: .........: End 0x804001
07:20:25 device string for dump = SCSI 1 3 0 0 0 0 0.
07:20:25 DUMP.prom: dev SCSI 1 3 0 0 0 0 0, block 500000
07:20:25 DUMP: Header to 0x804001 at 7879054 (0x78398e)
07:20:45 succeeded
07:20:45 halted CPU 1
07:20:46 CP - SAVE_TERM routine to be called
07:20:47 CP - SAVE_TERM exited with hlt_req = 1, r0 = 00000000.00000000
07:20:47
07:20:47 halted CPU 0
07:20:47
07:20:47 halt code = 5
07:20:47 HALT instruction executed
07:20:47 PC = fffffc00004b29d0
07:20:47 P00>>>
07:20:55 P00>>>
Here's what I know:
1. Currently there are no mounted filesets on /dev/vol/oracle/vol18 or
/dev/vol/oracle/vol20 (there usually are).
2. A "volprint -g oracle -ht" indicates no LSM problems at all.
3. Nothing has come out of the volwatch daemon.
4. If I try the '/sbin/advfs/tag2name /u18/.tags/21', I get a:
tag2name: statfs(/u18/.tags/21) error; [13] Permission denied
even though I am root. I guess this is partly because the fileset is NOT
mounted?
5. Attempts to mount filesets on these volumes fail:
vol20#db22 on /u22: Device busy
vol20#db21 on /u21: Device busy
vol18#db on /u18: Device busy
6. I have tried to verify both domains, but don't get very far:
#/sbin/advfs/verify vol18
verify: can't get set info for domain 'vol18'
verify: error = I/O error
+++ Domain verification +++
main: unable to get info for domain 'vol18'
error: 5, I/O error
montst1.vodafone_ip_ROOT> /sbin/advfs/verify vol20
verify: can't get set info for domain 'vol20'
verify: error = I/O error
+++ Domain verification +++
main: unable to get info for domain 'vol20'
error: 5, I/O error
7. fuser on the mount points (/u18, /u20 & /u21) shows nothing, probably
because they are not mounted.
8. Attempts to salvage domains vol18 & vol20 don't generate much encouraging
info:
#cd /tmp
#mkdir salvage
#cd salvage
#pwd
/tmp/salvage
#/sbin/advfs/salvage vol18
salvage: Domain to be recovered 'vol18'
salvage: Volume(s) to be used '/dev/vol/oracle/vol18'
salvage: Files will be restored to '.'
salvage: Logfile will be placed in './salvage.log'
salvage: open() failed for Device '/dev/vol/oracle/vol18'
salvage: Device busy
salvage: No valid volumes found
#/sbin/advfs/salvage vol20
salvage: Domain to be recovered 'vol20'
salvage: Volume(s) to be used '/dev/vol/oracle/vol20'
salvage: Files will be restored to '.'
salvage: Logfile will be placed in './salvage.log'
salvage: open() failed for Device '/dev/vol/oracle/vol20'
salvage: Device busy
salvage: No valid volumes found
9. The disks themselves look OK: no orange lights etc. The HSZ70
(dual-redundant pair) shows no problems with the disks.
10. I have analysed the binary error log with:
#dia -o brief -t s:02:30:00 > /somefilename
This did not say much at all (4 errors logged after 02:30am). However, a
'full' analysis shows a lot of output (that I can't interpret) but does also
show the following:
ASCII Message
io/vol.c(volerror): Uncorrectable write error on volume vol18, plex vol18-01, block 3289040
ASCII Message
AdvFS Domain Panic; Domain vol18 Id 0x3625faef.0008b366
An AdvFS domain panic has occurred due to either a metadata write error or
an internal inconsistency. This domain is being rendered inaccessible.
Please refer to guidelines in AdvFS Guide to File System Administration
regarding what steps to take to recover this domain.
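As an aside on the tag2name attempt: in the first record of the log extract the kernel prints Tag 0x00000015.8001 and then suggests the path /u18/.tags/21, which matches the part before the dot read as hexadecimal (0x15 = 21). A minimal sketch of that conversion follows, assuming this relationship holds for other ordinary file tags; the helper name `tags_path` is hypothetical, and it makes no claim about the 0xfffffff7-style tags seen in the other records, for which the kernel printed no tag2name hint.

```python
def tags_path(mount_point, tag):
    """Build the .tags path for a tag printed as '0xHHHHHHHH.SSSS'.

    Assumption (inferred only from the log extract): the .tags entry is the
    decimal value of the hexadecimal number before the dot.
    """
    tag_number_hex = tag.split(".")[0]   # e.g. '0x00000015'
    tag_number = int(tag_number_hex, 16)  # base 16 accepts the '0x' prefix
    return f"{mount_point}/.tags/{tag_number}"

if __name__ == "__main__":
    # Values taken from the first AdvFS I/O error record in the log extract.
    print(tags_path("/u18", "0x00000015.8001"))
```

The resulting path is what would be handed to /sbin/advfs/tag2name once the fileset can be mounted again.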
I have not yet rebooted as it's a production (well, was) environment. I guess
a reboot won't help anyhow.
Any ideas how to proceed? I am hoping NOT to have to delete the domains,
recreate them and restore the data, except as a last resort.
Your help and advice would be appreciated.
Best regards - Tony
+-----------------------------------------------------------------+
| TONY MILLER - Systems Projects - VODAFONE LTD, Derby House, |
| Newbury Business Park, Newbury, Berkshire. |
+-------------+---------------------------------------------------+
| Phone | 01635-507687(local) |
| Work email | ANTHONY.MILLER_at_VF.VODAFONE.CO.UK |
| X.400 | G=ANTHONY; S=MILLER; C=GB; A=GOLD 400; P=VODAFONE |
| FAX | 01635-583856 |
+-------------+---------------------------------------------------+
Disclaimer: Opinions expressed in this mail are my own and do not
reflect the company view unless explicitly stated. The information
is provided on an 'as is' basis and no responsibility is accepted for
any system damage howsoever caused.
Received on Wed Apr 21 1999 - 10:49:15 NZST