disk I/O error, bad sector?

From: Ronald D. Bowman <rdbowma_at_tsi.clemson.edu>
Date: Mon, 08 Jun 1998 18:56:29 -0400

Hello Managers -

Recently we started having problems with NSR. We were
receiving the following error during backups, and I believe
it was causing our system to mark the backup tapes as full
when in reality they were not.

* tsi:/space /space/mil3/4.0.A/sys/doc/online/modeler/1.0_06_1_Cct.pdf: I/O error

The errors started after we applied
a new software package to the system. It appears that 6
files are causing an I/O error from the disk.

From the kern.log file in syslog.dated we see the following:
Jun 8 09:17:55 tsi vmunix: Hard Error Detected
Jun 8 09:17:55 tsi vmunix: EXABYTE EXB-8505 ^XEXB-8505
Jun 8 09:17:55 tsi vmunix: cam_logger: CAM_ERROR packet
Jun 8 09:17:55 tsi vmunix: cam_logger: bus 0 target 0 lun 0
Jun 8 09:17:55 tsi vmunix: cdisk_check_sense
Jun 8 09:17:55 tsi vmunix: Medium Error bad block number: 8066678
Jun 8 09:17:55 tsi vmunix: Hard Error Detected
Jun 8 09:17:55 tsi vmunix: Quantum XP34300W ^XXP34300W
Jun 8 09:17:55 tsi vmunix: Active CCB at time of error
Jun 8 09:17:55 tsi vmunix: CCB request completed with an error
Jun 8 09:17:55 tsi vmunix: Error, exception, or abnormal condition
Jun 8 09:17:55 tsi vmunix: MEDIUM ERROR - Nonrecoverable medium error
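
When the same errors recur over several backup runs, it can help to pull the reported block numbers out of kern.log and see whether it is always the same few blocks. The following is only a minimal sketch, assuming the "bad block number:" wording shown in the excerpt above; the function name and the sample text are illustrative, not part of any Digital UNIX tool.

```python
import re
from collections import Counter

# Sample lines copied from the kern.log excerpt above.
SAMPLE_LOG = """\
Jun 8 09:17:55 tsi vmunix: cam_logger: bus 0 target 0 lun 0
Jun 8 09:17:55 tsi vmunix: cdisk_check_sense
Jun 8 09:17:55 tsi vmunix: Medium Error bad block number: 8066678
Jun 8 09:17:55 tsi vmunix: MEDIUM ERROR - Nonrecoverable medium error
"""

# Assumption: the kernel always reports the block in this exact wording.
BAD_BLOCK = re.compile(r"bad block number:\s*(\d+)")


def count_bad_blocks(log_text):
    """Count how many times each bad block number appears in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BAD_BLOCK.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    for block, hits in sorted(count_bad_blocks(SAMPLE_LOG).items()):
        print(f"block {block}: reported {hits} time(s)")
```

A short list of blocks that repeats from run to run would be consistent with a few damaged sectors rather than a failing controller or cable, though the log alone cannot confirm that.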

Upon running dump on our system, we saw that
the problems occurred in our /space partition.

dump: Bad read from input disk file: /dev/rrz0h, block number: 3670304, bytes wanted: 8192, bytes got: -1
bread(): read(): I/O error

Running diskx on that
partition resulted in 3 read errors out of 512 MBytes (the partition is using
about 715 MBytes).

A search of the list archives provided some insight as to what we should do,
but I want to make sure that I do not make any mistakes in trying to correct
this problem.

Our system is running UFS with the following in /etc/fstab:
/dev/rz0a / ufs rw 1 1
/proc /proc procfs rw 0 0
/dev/rz0g /usr ufs rw 1 2
/dev/rz0h /space ufs rw 1 2
/dev/rz0b swap1 ufs sw 0 2
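
Since the raw device name has to be typed correctly for every check on this partition, here is a small sketch of deriving it from the fstab entry rather than by hand. It assumes that the character (raw) device is simply the block device name with an "r" prefix, as the rz0h / rrz0h pair in this message suggests; both function names are hypothetical.

```python
# /etc/fstab contents copied from the message above.
FSTAB = """\
/dev/rz0a / ufs rw 1 1
/proc /proc procfs rw 0 0
/dev/rz0g /usr ufs rw 1 2
/dev/rz0h /space ufs rw 1 2
/dev/rz0b swap1 ufs sw 0 2
"""


def block_device_for(mount_point, fstab_text):
    """Return the block device listed for mount_point, or None if absent."""
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and comment lines.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        if fields[1] == mount_point:
            return fields[0]
    return None


def raw_device_name(block_device):
    """Prefix the device basename with 'r' (assumed raw-device convention)."""
    directory, _, name = block_device.rpartition("/")
    return f"{directory}/r{name}"


if __name__ == "__main__":
    device = block_device_for("/space", FSTAB)
    print(device, "->", raw_device_name(device))
```

For /space this gives /dev/rz0h and /dev/rrz0h, which matches the device named in the dump error output.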

1) We want to umount the /space partition.
Q: Can this only be done in single-user mode? I plan on using the
command /usr/sbin/umount /space
Is this correct?

2) Once unmounted, we want to run fsck -o -v /dev/rrz0h.
There was also mention of using icheck and ncheck in the
previous summary, but icheck has been replaced by fsck, and
I assume ncheck may still be useful.

3) There is also the possibility of running scu -f /dev/rrz0h,
scu> verify media
to check the disk, which I believe can be run under normal conditions
as root.

4) I cannot provide any uerf or dia information on the errors since
the binary.errorlog is not receiving any of the information. I have
purged the file in the past, most recently about 2 weeks ago, only to
have it remain empty. I tried the information provided a few days
ago in the summary, and it still does not log any errors. I am
hoping that the next boot-up (coming very soon) will solve the
problem.

5) Any help in clarifying what needs to be done will be greatly
appreciated. I want to make sure that I make no mistakes in
trying to solve this problem.

Thanks in Advance,

Ron Bowman
Techno-Sciences, Inc.
rdbowma_at_tsi.clemson.edu
864-646-4028

Alpha EB 21164, 333MHz, 1 CPU
DU 4.0B (564) Patch #6 installed
Searchable Archive URLs:
http://www-archive.ornl.gov:8000/ (simple search)
http://www-archive.ornl.gov:8000/archive/power.htm (more detailed)
The following is a summary only site graciously maintained by Matt Moore.
http://www.bucks.edu/alpha-osf-managers/
Received on Tue Jun 09 1998 - 00:57:29 NZST
