o/s recovery dunix v Solaris

From: <anthony.miller_at_vf.vodafone.co.uk>
Date: Sat, 21 Apr 2001 10:34:27 +0100

All...

This is more Solaris related so please excuse the posting here. However,
the procedure is basically that used on dunix systems so I hope somebody may
be able to help out.


All...

I hope that you can help me with an o/s recovery problem that I have. Ages
ago I wrote a procedure for Digital UNIX which takes you through a full o/s
recovery from a set of dump tapes. The systems concerned all had mirrored
o/s disks via LSM (rebadged Veritas Volume Manager). The recovery was only
for the o/s (i.e., everything in the rootdg disk group). Applications
restore is covered separately. This works very well and reliably across all
of the V4.0 Digital UNIX streams.

We needed a Solaris equivalent of this document for systems with mirrored
VxVM o/s disks, so I did a basic conversion job for Solaris. I have put a
lot of effort into this and it works well - almost...

I have tested it on a SUN-6500 running Solaris 7 and also on an E10K domain
running Solaris 7. No problems at all, worked like a dream. However, I have
a weird problem testing on an E10K domain running Solaris 6.

The basic process is as follows (loads of detail left out for obvious
reasons):
0. Make sure the boot-device console variable is set properly.
1. "boot net -s"; use FORMAT to partition the relevant disk etc.
2. Make some temporary mount points for /, /opt etc. (/tmp/a/root,
    /tmp/a/opt, ...); NEWFS my just-partitioned slices and mount them.
3. UFSRESTORE the dump into each mount point; rm RESTORESYMTABLE; umount &
    fsck; then remount them all.
4. Install boot blocks.
5. Modify "/tmp/a/root/etc/system" and remove the
    "rootdev:/pseudo/vxio@0:0" and "set vxio:vol_rootdev_is_volume=1"
    lines to disable VxVM.
6. Modify "/tmp/a/root/etc/vfstab" and change all "/dev/vx/dsk/volume"
    lines to the conventional "/dev/dsk/c0t0d0sn" format.
7. Temporarily suppress starting of our applications by renaming the
    relevant rc3.d & rc2.d "S" scripts.
8. umount the recovered file systems.
9. "bringup -A off", "limit-ecache-size" & "boot -s"
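
Steps 5 and 6 are plain text edits, so they can be sketched as two small
shell functions. This is only a sketch under stated assumptions: the
function names, the ".prevxvm" backup suffix and the volume-to-slice
pairing in the usage lines are illustrative, not taken from the procedure
document, and a POSIX sed is assumed.

```shell
# Sketch only: helper names, the ".prevxvm" suffix and the volume/slice
# pairs in the usage note are illustrative assumptions.

# Step 5: drop the two VxVM root-volume lines from a restored etc/system,
# keeping the untouched file alongside as a backup.
disable_vxvm_root() {
    sysfile=$1
    cp "$sysfile" "$sysfile.prevxvm" || return 1
    sed -e '/^rootdev:\/pseudo\/vxio@0:0/d' \
        -e '/^set vxio:vol_rootdev_is_volume=1/d' \
        "$sysfile.prevxvm" > "$sysfile"
}

# Step 6: point one vfstab entry at a plain slice instead of a VxVM volume.
# Both the block (dsk) and raw (rdsk) device columns are rewritten.
# usage: vx_to_slice VFSTAB VOLUME SLICE
vx_to_slice() {
    vfstab=$1; vol=$2; slice=$3
    sed -e "s|/dev/vx/dsk/$vol\([[:space:]]\)|/dev/dsk/$slice\1|" \
        -e "s|/dev/vx/rdsk/$vol\([[:space:]]\)|/dev/rdsk/$slice\1|" \
        "$vfstab" > "$vfstab.new" && mv "$vfstab.new" "$vfstab"
}
```

Usage might look like this, with the volume name and slice being whatever
applies on the system in question:

    disable_vxvm_root /tmp/a/root/etc/system
    vx_to_slice /tmp/a/root/etc/vfstab rootvol c0t0d0s0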

What should happen at this point is that you give the root password before
continuing to:
A. delete the device trees (/dev/dsk, /dev/rdsk, /dev/rmt)
B. recreate device trees to match existing hardware ("devlinks -r .",
    "disks -r .", "tapes -r .")
C. delete the /etc/path_to_inst file
D. reboot using "reboot -- -ras"
E. re-encapsulate o/s disks, mirror up, recover secondary swap, etc., etc.
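
Steps A to D can be collected into one function. The sketch below only
prints the commands unless DRY_RUN=no is set. The names run(),
rebuild_device_tree() and DRY_RUN are illustrative assumptions; the
commands themselves are the ones listed above, and the function assumes
that "-r ." is meant relative to the root directory.

```shell
# Sketch only: run(), rebuild_device_tree() and DRY_RUN are illustrative
# names; the commands are those from steps A-D, run from the root directory.
run() {
    if [ "${DRY_RUN:-yes}" = no ]; then
        "$@"            # really execute the command
    else
        echo "+ $*"     # default: just show what would be run
    fi
}

rebuild_device_tree() {
    run cd / || return 1
    # A. delete the old device trees
    run rm -rf /dev/dsk /dev/rdsk /dev/rmt
    # B. recreate them to match the hardware actually present
    run devlinks -r .
    run disks -r .
    run tapes -r .
    # C. delete /etc/path_to_inst
    run rm /etc/path_to_inst
    # D. reboot with the -ras boot flags
    run reboot -- -ras
}
```

Calling rebuild_device_tree with DRY_RUN unset lists the seven commands
for review; nothing is changed until DRY_RUN=no is exported.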

As I say, on Solaris 7 (6500 and E10K domain) this works fine. However, on
Solaris 6 (E10K) I get to point 9 above just fine, but the system fails to
get me properly to single-user mode and leaves the root file system mounted
read-only, suggesting I fsck it. However, I can't fsck it because it's
read-only. I have tried mounting it rw with
"mount -F ufs -o rw /dev/dsk/c0t0d0s0" but the mount point is busy
(obviously).

On a Digital UNIX system, you can do a "mount -u /" which takes a read-only
root and mounts it rw, but I can't see an equivalent here for Solaris.

No matter what I do, I can't get past this point. I can shut down, "boot
net -s" again, mount the file systems, fsck them, dismount gracefully etc.,
but I always end up back here again.

Does anybody have any ideas please? I have attached a brief transcript of
part of the session in case it helps anybody.

Many thanks : Tony

PS: if anybody wants a copy of either the dunix or the Solaris document
then let me know.


blahblah-ssp1:pluto% bringup -A off
Trying to get bringup.lock lock... OK
Starting netcon_server -p 8 ... OK

netman-ssp1:pluto% netcon
trying to connect...

SUNW,Ultra-Enterprise-10000, using Network Console
OpenBoot 3.2.131, 4096 MB memory installed, Serial #blahblah.
Ethernet address blahblah, Host ID: blahblah.

<#8> ok printenv boot-device
boot-device = vx-rootdisk vx-rootmirr net
<#8> ok devalias
vx-rootmirr /sbus@4c,0/QLGC,isp@0,10000/sd@0,0:a
vx-rootdisk /sbus@48,0/QLGC,isp@0,10000/sd@0,0:a
net /sbus@49,0/SUNW,qfe@0,8c30000

<#8> ok limit-ecache-size
<#8> ok boot -s
Boot device: /sbus@48,0/QLGC,isp@0,10000/sd@0,0:a File and args: -s

SunOS Release 5.6 Version Generic_105181-22 [UNIX(R) System V Release 4.0]
Copyright (c) 1983-1997, Sun Microsystems, Inc.
/kernel/drv/fcaw symbol ddi_model_convert_from multiply defined
fcaw4: JNI Fibre Channel Adapter model FCW
fcaw4: 64-bit SBus 1: IRQ 3: FCODE Version 13.3.7 [18c932]: SCSI ID 125: AL_PA 01
fcaw4: Fibre Channel WWNN: 100000e0694126e7 WWPN: 200000e0694126e7
fcaw4: FCA SCSI/IP Driver Version 2.4.1.EMC, March 7, 2000 for Solaris 2.5,2.6
fcaw4: All Rights Reserved.
fcaw4: LINK DOWN: Check Connections...
fcaw1: JNI Fibre Channel Adapter model FCW
fcaw1: 64-bit SBus 1: IRQ 3: FCODE Version 13.3.7 [18c932]: SCSI ID 125: AL_PA 01
fcaw1: Fibre Channel WWNN: 100000e069412b45 WWPN: 200000e069412b45
fcaw1: FCA SCSI/IP Driver Version 2.4.1.EMC, March 7, 2000 for Solaris 2.5,2.6
fcaw1: All Rights Reserved.
fcaw1: LINK DOWN: Check Connections...
fcaw5: JNI Fibre Channel Adapter model FCW
fcaw5: 64-bit SBus 1: IRQ 3: FCODE Version 13.3.7 [18c932]: SCSI ID 125: AL_PA 01
fcaw5: Fibre Channel WWNN: 100000e069413b4f WWPN: 200000e069413b4f
fcaw5: FCA SCSI/IP Driver Version 2.4.1.EMC, March 7, 2000 for Solaris 2.5,2.6
fcaw5: All Rights Reserved.
fcaw5: LINK DOWN: Check Connections...
fcaw3: JNI Fibre Channel Adapter model FCW
fcaw3: 64-bit SBus 1: IRQ 3: FCODE Version 13.3.7 [18c932]: SCSI ID 125: AL_PA 01
fcaw3: Fibre Channel WWNN: 100000e0694137f9 WWPN: 200000e0694137f9
fcaw3: FCA SCSI/IP Driver Version 2.4.1.EMC, March 7, 2000 for Solaris 2.5,2.6
fcaw3: All Rights Reserved.
fcaw3: LINK DOWN: Check Connections...
WARNING: forceload of drv/ssd failed
WARNING: forceload of drv/sf failed
WARNING: forceload of drv/pln failed
WARNING: forceload of drv/soc failed
WARNING: forceload of drv/socal failed
starting Network Console
Hostname: nike
operation failed, Invalid argument
operation failed, Invalid argument
operation failed, Invalid argument
SUNW,qfe27: Transciever speed set incorrectly.
WARNING: SUNW,qfe27: Unable to reset transciever.
operation failed, Invalid argument
operation failed, Invalid argument
operation failed, Invalid argument
operation failed, Invalid argument
operation failed, Invalid argument
/dev/dsk/c0t0d0s1: No such device or address
The / file system (/dev/rdsk/c0t0d0s0) is being checked.
can't open /etc/mnttab
Can't open /dev/rdsk/c0t0d0s0
/dev/rdsk/c0t0d0s0: CAN'T CHECK FILE SYSTEM.
/dev/rdsk/c0t0d0s0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.

WARNING - Unable to repair the / filesystem. Run fsck
manually (fsck -F ufs /dev/rdsk/c0t0d0s0). Exit the shell when
done to continue the boot process.


Type Ctrl-d to proceed with normal startup,
(or give root password for system maintenance):
Entering System Maintenance Mode

blah@ROOT# df -k
df: failed to open /etc/mnttab: No such file or directory

blah@ROOT# format
Searching for disks...EMCPOWER: Multipathing Driver Loading...
done
No disks found!

nike@ROOT# fsck -F ufs /dev/rdsk/c0t0d0s0
can't open /etc/mnttab
Can't open /dev/rdsk/c0t0d0s0

nike@ROOT# rm -rf /dev/dsk
rm: Unable to remove directory /dev/dsk: Read-only file system

nike@ROOT# ^D
resuming system initialization
mount: /dev/dsk/c0t0d0s0 no such device
/sbin/rcS: /etc/dfs/sharetab: cannot create
The /var file system (/dev/rdsk/c0t0d0s3) is being checked.
can't open /etc/mnttab
Can't open /dev/rdsk/c0t0d0s3
/dev/rdsk/c0t0d0s3: CAN'T CHECK FILE SYSTEM.
/dev/rdsk/c0t0d0s3: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.

WARNING - Unable to repair the /var filesystem. Run fsck
manually (fsck -F ufs /dev/rdsk/c0t0d0s3). Exit the shell when
done to continue the boot process.


Type Ctrl-d to proceed with normal startup,
(or give root password for system maintenance):
resuming system initialization
mount: /dev/dsk/c0t0d0s3 no such device
setmnt: Cannot open /etc/mnttab for writing
INIT: Cannot create /var/adm/utmp or /var/adm/utmpx
INIT: failed write of utmpx entry:" "
INIT: failed write of utmpx entry:" "
INIT: SINGLE USER MODE

Type Ctrl-d to proceed with normal startup,
(or give root password for system maintenance):
Entering System Maintenance Mode

blah@ROOT# df -k
df: failed to open /etc/mnttab: No such file or directory
nike@ROOT# fsck -F ufs /dev/rdsk/c0t0d0s0
can't open /etc/mnttab
Can't open /dev/rdsk/c0t0d0s0
nike@ROOT# rm -rf /dev/dsk
rm: Unable to remove directory /dev/dsk: Read-only file system
Received on Sat Apr 21 2001 - 09:36:39 NZST
