SUMMARY: New cluster won't boot (ES40 / HSG60 / fibre channel)

From: Aldridge, Robert E. (Rob)
Date: Mon, 23 Apr 2001 08:55:53 -0500

Tru64 Managers:


The key to fixing this problem was to apply patch kit #2 BEFORE BUILDING
THE CLUSTER.
[Thanks to Larry Clegg and others.] [No thanks to Compaq Support; I
specifically mentioned that the system had not been patched, but this did
not "raise any flags" as it should have.]

Note: Patch kit 3 is now available on the Compaq web site.



OTHER suggestions that didn't help in my case, but might help in your
situation:

- Verify SCSI settings if 2 systems are connected to the shared SCSI bus.
- On the AlphaServer console, use wwidmgr and set the member boot device.
- Make sure firmware is at least 5.8. There are some reports of problems
with 5.9 firmware (but we are running 5.9-24 with NO problems).
- Wipe out StorageWorks shared disks with 'disklabel -z'
- Re-install Tru64 and try clu_create again.
- Boot the member with genvmunix and apply the licensing information.
- Boot the system with the original system disk; then:
        - mount cluster_root#root on /mnt
        - mount root1_domain#root on /mnt2
  Verify that:
        - /mnt has a CDSL for vmunix
        - /mnt2 has a real vmunix file
        - /mnt2/etc/sysconfigtab exists
- Check that the HSG connection type is set to 'TRU64 UNIX' and *not* NT.
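
The three file checks in the "original system disk" suggestion above can be
scripted. What follows is only a rough sketch, not something I ran on the
cluster: the function name check_boot_files is made up for illustration, and
it assumes the two filesets are already mounted on /mnt and /mnt2 as
described. It relies on a CDSL being a symbolic link, so a plain symlink test
is used for the cluster root's vmunix.

```shell
#!/bin/sh
# Sketch only: check_boot_files is a hypothetical helper, not a Tru64 tool.
# Assumes the filesets were mounted first, as in the list above, e.g.:
#   mount cluster_root#root /mnt
#   mount root1_domain#root /mnt2

# check_boot_files CLUSTER_ROOT MEMBER_ROOT
# Prints one line per check; returns non-zero if any check fails.
check_boot_files() {
    cluster_root=$1
    member_root=$2
    status=0

    # A CDSL is a symbolic link, so vmunix in the cluster root should be one.
    if [ -h "$cluster_root/vmunix" ]; then
        echo "ok:   $cluster_root/vmunix is a symbolic link (CDSL)"
    else
        echo "FAIL: $cluster_root/vmunix is not a symbolic link"
        status=1
    fi

    # The member root should hold a real kernel file, not a link.
    if [ -f "$member_root/vmunix" ] && [ ! -h "$member_root/vmunix" ]; then
        echo "ok:   $member_root/vmunix is a regular file"
    else
        echo "FAIL: $member_root/vmunix is missing or is a link"
        status=1
    fi

    # The member's sysconfigtab must exist.
    if [ -f "$member_root/etc/sysconfigtab" ]; then
        echo "ok:   $member_root/etc/sysconfigtab exists"
    else
        echo "FAIL: $member_root/etc/sysconfigtab is missing"
        status=1
    fi

    return $status
}
```

With both filesets mounted, it would be run as: check_boot_files /mnt /mnt2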



For what it's worth -- what "threw me off" in the first place was
mis-reading the cluster installation manual. I thought the manual said
something about not patching before building the cluster. What it said was
to: install layered products, apply patches, then create the cluster. (If
layered products aren't installed before patching, those products will not
be automatically patched when they are installed later on.)



-----Original Message-----
From: Aldridge, Robert E. (Rob) [mailto:realdridge_at_mcdermott.com]
Sent: Thursday, April 19, 2001 12:13 PM
To: 'tru64-unix-managers_at_ornl.gov'
Subject: New cluster won't boot (ES40 / HSG60 / fibre channel)



Tru64 Managers,

We have two brand-new ES40 systems, memory channel interconnect, with HSG60
controllers to disks, running Tru64 5.1.

We've installed TruCluster 5.1, and successfully went through clu_create
with no errors.

However, on the first reboot of the single-node cluster we have problems:

Problem 1:
The system can't find 'vmunix' and 'sysconfigtab.34' and prompts for another
kernel to use.

Problem 2:
When specifying 'genvmunix' for the alternate kernel, the cluster is built
(one member), then the quorum disk is mounted, but then we get this error
message:
        "cfs_mountroot_local failed to boot the cluster root fs with error = 22"

There is a subsequent message from CAM saying that the SCSI Bus has been
reset.



For what it's worth, here is our disk arrangement:
- Tru64 5.1 was installed "fresh" to the ES40 internal dsk1.
- Using the HSG partitioning, one 36 GB drive in the MA6000 array was carved
up into 4 disks for Tru64/TruCluster to use:
        dsk3 - cluster root
        dsk4 - member 1 root
        dsk5 - member 2 root
        dsk6 - quorum


Thanks for any insights you can provide. I also have logged a call with
Compaq TruCluster support.



Rob Aldridge
realdridge_at_att.com
AT&T Solutions
Received on Mon Apr 23 2001 - 14:03:50 NZST
