SUMMARY: Dead hard drive?

From: Rasana Atreya <atreya_at_library.ucsf.edu>
Date: Thu, 19 Dec 1996 10:30:07 -0800

Hi!

I asked (I admit a LONG time ago):
> One of the users' machines (an Alpha with Digital Unix 3.0, I think) froze
> with the messages:
>
> cdisk-check-sense
> Medium error bad block number: 3544722
> Hard error detected
> DEC R228M (c) DEC RZ 28M
> active CCB at time of error
> CCB request completed with an error
> Medium error - Non-recoverable medium error
>
> Reboot hung.
>
> The disk contains the /home partition, so he tried to boot in single user
> mode to comment this out in /etc/fstab. It said this file was read-only, and
> it would not let him change it. Same thing when he tried booting from CDROM.


Thanks to everybody for their help. This list is awesome!

Rasana

**************
From: Cliff Krieger <ckrieger_at_latrade.com>

I am not sure exactly what the problem is with the disk. Regarding
single-user mode: when you boot, the root file system is mounted
read-only. Only after it passes an integrity check is it remounted
read-write. Normally you would execute /sbin/bcheckrc, but since
that mounts all the file systems in /etc/fstab, you have to do it
manually. Use mount -u / to remount the root file system read-write.

**************
From: Gyula Szokoly <szgyula_at_skysrv.Pha.Jhu.EDU>

  Did he mount '/' read-write? Single user comes up with read-only,
you have to do a 'mount -u /'. After this, you can edit fstab.
  As for the disk, he can do a:

# scu -f /dev/rrz25c verify media

to see if he has a few bad sectors, which can be cured (not without
data loss, though) by the 'reassign' feature in 'scu'.
In the example above I used drive 25 (fourth bus, SCSI id=1) and the whole
disk (partition c). If this fails, you can also address the drive by its
bus/drive/lun nexus inside scu.
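
The unit numbering in that example can be sketched as follows. This is an
illustration only: it assumes the rz unit number is bus * 8 + SCSI id, which
matches drive 25 being SCSI id 1 on the fourth bus (bus 3, counting from 0).
The function name rz_decode is made up for this sketch.

```shell
# Decode an rz unit number into SCSI bus and target id.
# Assumption: unit = bus * 8 + target (rz25 -> bus 3, id 1).
rz_decode() {
    unit=$1
    bus=$((unit / 8))       # buses are counted from 0
    target=$((unit % 8))    # SCSI id on that bus
    echo "bus=$bus target=$target"
}

rz_decode 25
```

If scu cannot open the device file, these are the bus and id values to try
when addressing the drive by nexus.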
  As I recall, the verify media command only reads, but I am not
100% sure, so check it (and it ties up the whole I/O system, so
warn users). I couldn't afford that, so I tested in 50 000-block chunks
(each of these runs took 5-10 secs):

# scu -f /dev/rrz25c verify media starting 0 length 50000
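
The chunked approach can be scripted. The sketch below only prints the scu
commands, one per chunk, so they can be reviewed before anything touches the
disk. The function name verify_chunks and the 120000-block disk size are
made up for illustration; take the real size from the disk label.

```shell
# Print one "verify media" command per chunk; nothing is executed.
# Arguments: raw device, total number of blocks, chunk size in blocks.
verify_chunks() {
    dev=$1
    total=$2
    chunk=$3
    [ "$chunk" -gt 0 ] || return 1      # a zero chunk would never advance
    start=0
    while [ "$start" -lt "$total" ]; do
        len=$chunk
        remaining=$((total - start))
        # The last chunk may be shorter than the others.
        if [ "$remaining" -lt "$len" ]; then
            len=$remaining
        fi
        echo "scu -f $dev verify media starting $start length $len"
        start=$((start + len))
    done
}

# Hypothetical disk size, for illustration only.
verify_chunks /dev/rrz25c 120000 50000
```

Running the printed commands one at a time, with a pause in between, keeps
each burst of I/O short.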

**************
From: alan_at_nabeth.cxo.dec.com (Alan Rollow - Dr. File System's Home for Wayward Inodes.)

The disk with the complaint probably has nothing more than a bad
block, though it may be in a location that makes it worse than
usual. When booted, the root file system is read-only. You
can mount it read-write using the command "mount -u /". With
this done you can change /etc/fstab or go to work trying to
repair the disk.

For any file system, you'll want to verify the media of the
disk and replace any marginal or bad blocks. You can use
scu(8) for this. It will try really hard to read the
correct data, but you may be forced to reassign a block
without having good data to put in the new block. For these
the repair is file system dependent.

For UFS, use icheck with the -b option to see where the damage
is and from that you can figure out how to fix the problems.
For file system metadata (superblocks, cylinder group blocks,
etc.) fsck can fix most things. For file data, you'll want to
restore the file from a backup. Icheck will translate block
numbers to inode numbers for file data. With those you can
use ncheck to translate back to file names.

If the particular file system doesn't begin at sector 0 of the
disk, keep in mind that the LBNs reported by uerf are relative
to the beginning of the disk. You may need to adjust for the
offset when feeding block numbers to icheck, which wants offsets
from the beginning of the partition.
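
The adjustment is a plain subtraction, sketched below. The block number is
the one from the original error message; the partition offset of 1048576 is
made up for illustration and should be read from the disk label instead. The
sketch assumes both numbers are in the same units; if icheck expects a
different block size, a further conversion would be needed. The function
name partition_block is hypothetical.

```shell
# Convert a disk-relative LBN to a partition-relative block number.
# Arguments: LBN from the error log, partition start offset (same units).
partition_block() {
    lbn=$1
    offset=$2
    if [ "$lbn" -lt "$offset" ]; then
        echo "LBN $lbn lies before the partition start" >&2
        return 1
    fi
    echo $((lbn - offset))
}

# 3544722 comes from the error message; 1048576 is a made-up offset.
partition_block 3544722 1048576
```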

For AdvFS, the repair tools are rather lacking. Before V4.0 you're
probably best off recreating the domain and restoring from a backup.
In V4.0 some of the tools that do exist may be documented.

**************
From: Dr Marco Luchini <m.luchini_at_ic.ac.uk>

Bad blocks from the disk. The disk is dying.

He did the right thing. There's a special option to mount that lets
you remount it read-write.

# mount -u /

If you ever have to do it on SunOS, there it's

# mount -o remount /

**************
From: Nick Hill - RAL CISD Systems Group <NMH1_at_axprl1.rl.ac.uk>

The root filesystem is mounted read-only in single user mode, as you
discovered! Issue a mount -u / to remount it read-write.

**************
From: rioux_at_ip6480nl.ce.utexas.edu (Tom Rioux)

you have to remount the root partition read/write:
mount -u /

**************
From: Ed Meirose <em_at_unx.dec.com>

Rasana,

I'm not sure about the disk

but you can do a

mount -u /

to mount root read/write to make changes to the /etc/fstab

**************
From: Matt Moore <moore_at_shemp.bucks.edu>

When the system boots in single-user it mounts / as read-only. You have
to remount as read-write with the mount -u command. Then you can edit
the /etc/fstab. BTW - what does fsck do?

**************
From: "Robert L. Urban" <urban_at_rto.dec.com>

If you are sure that no system partition (/, /usr, /var, swap) was
on the broken hard drive, you can do the following:

boot in single-user mode:
>>> boot -fl s
at the '#' prompt, fsck and mount all system partitions:
        cat /etc/fstab (see what the partitions are...)
        fsck /
        fsck /usr
        fsck /var (if /var on separate partition)
        mount -u / (re-mount root partition in read-write mode)
        mount /usr
        mount /var (if separate)
now edit /etc/fstab:
        vi /etc/fstab
and comment out the line with the bad disk

NOTE: you only need to fsck and mount /usr and /var if you want
to use vi to edit /etc/fstab. If you know ed, you only need
to fsck and mount the root.

when you are done with the above, type Control-D (^D) to continue
booting into multi-user mode.
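
The edit itself can also be done without an interactive editor. The sketch
below prints a copy of an fstab-style file with the entry for one mount
point commented out. It assumes whitespace-separated fields with the mount
point in the second field, and an awk that accepts -v; since awk normally
lives under /usr, /usr would have to be mounted as in the steps above. The
function name comment_out_mount is made up.

```shell
# Print FILE with any active entry for MOUNTPOINT commented out.
# Arguments: fstab-style file, mount point (e.g. /home).
comment_out_mount() {
    awk -v mp="$2" '
        # Active line whose second field is the mount point: prefix with "#".
        $2 == mp && $1 !~ /^#/ { print "#" $0; next }
        # Everything else, including existing comments, passes through.
        { print }
    ' "$1"
}

# Possible use, after "mount -u /" (review the copy before installing it):
#   comment_out_mount /etc/fstab /home > /etc/fstab.new
#   cp /etc/fstab /etc/fstab.orig && cp /etc/fstab.new /etc/fstab
```

Writing to a copy first leaves the original file untouched until the result
has been checked.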

P.S., you will have difficulty recovering the data on the disk, because
most programs terminate as soon as an unrecoverable error is generated.

**************
From: bouchard_l_at_decus.fr (Louis Bouchard - Bouygues Telecom)

        Once you're booted in "Single-User", the root partition is always mounted
"Read-Only". All you have to do is "mount -u /" to allow write access to it.

**************
From: atoalu2_at_ato.abb.se (Andreas Lundgren)

The hard disk is dead... :-(

#mount -u /
#ed /etc/fstab

**************
From: Hellebo Knut <Knut.Hellebo_at_nho.hydro.com>

Do a 'mount -u /' to get a writeable root partition in singleuser mode.

**************
From: "Alan B. Scott" <star_at_staru1.livjm.ac.uk>

        Believe it or not, we have just had a similar set of error messages on
our system due to a head crash on our /home disk. (Digital describe it as
a head kiss!) The comment "Non-recoverable medium error" is not a good
sign. I was able to boot the system into single user mode (>>> boot -fl s)
and run scu, the SCSI CAM Utility Program.
This utility implements the
  SCSI commands necessary for normal maintenance and diagnostics of SCSI
  peripherals and the CAM I/O subsystem.
 
At the # prompt type scu -f /dev/rrz<whichever device>
At the scu> prompt type show device
followed by scu> verify media

In our case scu reported several bad blocks before eventually crashing
out of the scu programme. Our disk was scrapped. Apparently if only one
or two blocks are faulty a workaround may be possible.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ Rasana Atreya Voice: (415) 476-3623 ~
~ System Administrator Fax: (415) 476-4653 ~
~ Library & Ctr for Knowledge Mgmt, Univ. of California at San Francisco ~
~ 530 Parnassus Ave, Box 0840, San Francisco, CA 94143-0840 ~
~ Rasana.Atreya_at_library.ucsf.edu ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Received on Thu Dec 19 1996 - 19:55:57 NZDT