Tru64 Managers:
The key to fixing this problem was to apply patch kit #2 BEFORE BUILDING THE
CLUSTER.
[Thanks to Larry Clegg and others.] [No thanks to Compaq Support; I
specifically mentioned that the system had not been patched, but this did
not "raise any flags" as it should have.]
Note: Patch kit 3 is now available on the Compaq web site.
OTHER suggestions that didn't help in my case, but might help in your
situation:
- Verify SCSI settings if 2 systems are connected to the shared SCSI bus.
- On the AlphaServer console, use wwidmgr and set the member boot device.
- Make sure firmware is at least 5.8. There are some reports of problems with
5.9 firmware (but we are running 5.9-24 with NO problems).
- Wipe out StorageWorks shared disks with 'disklabel -z'
- Re-install Tru64 and try clu_create again.
- Boot the member with genvmunix and apply the licensing information.
- Boot the system with the original system disk; then:
    - mount cluster_root#root on /mnt
    - mount root1_domain#root on /mnt2
  Verify that:
    - /mnt has a CDSL for vmunix
    - /mnt2 has a real vmunix file
    - /mnt2/etc/sysconfigtab exists
- Check that the HSG connection type is set to 'TRU64 UNIX' and *not* NT.
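
The mount-and-verify suggestion in the list above could be scripted roughly
as follows. This is only a sketch: verify_cluster_roots is a made-up helper
name (not a Tru64 or TruCluster utility), the mount points /mnt and /mnt2 are
the ones used in the list, and it assumes that a CDSL shows up as a symbolic
link to the shell's -h test.

```sh
#!/bin/sh
# Sketch only: verify_cluster_roots is a hypothetical helper, not a Tru64 tool.
# Mount the two filesets first (domain#fileset names taken from the list above):
#   mount cluster_root#root /mnt
#   mount root1_domain#root /mnt2

verify_cluster_roots() {
    cluster_mnt=$1   # where cluster_root#root is mounted, e.g. /mnt
    member_mnt=$2    # where root1_domain#root is mounted, e.g. /mnt2
    status=0

    # A CDSL is a symbolic link, so the cluster root's vmunix should be a link.
    if [ ! -h "$cluster_mnt/vmunix" ]; then
        echo "FAIL: $cluster_mnt/vmunix is not a CDSL (symbolic link)"
        status=1
    fi

    # The member boot partition should hold a real kernel file, not a link.
    if [ -h "$member_mnt/vmunix" ] || [ ! -f "$member_mnt/vmunix" ]; then
        echo "FAIL: $member_mnt/vmunix is not a regular file"
        status=1
    fi

    # The member's sysconfigtab must be present.
    if [ ! -f "$member_mnt/etc/sysconfigtab" ]; then
        echo "FAIL: $member_mnt/etc/sysconfigtab is missing"
        status=1
    fi

    [ "$status" -eq 0 ] && echo "OK: all three checks passed"
    return "$status"
}
```

After mounting both filesets, run it as: verify_cluster_roots /mnt /mnt2
A non-zero exit status means at least one of the three checks failed.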
For what it's worth -- what "threw me off" in the first place was
misreading the cluster installation manual. I thought the manual said
something about not patching before building the cluster. What it actually
said was to install layered products, apply patches, and then create the
cluster. (If layered products aren't installed before patching, those
products will not be automatically patched when they are installed later on.)
-----Original Message-----
From: Aldridge, Robert E. (Rob) [mailto:realdridge_at_mcdermott.com]
Sent: Thursday, April 19, 2001 12:13 PM
To: 'tru64-unix-managers_at_ornl.gov'
Subject: New cluster won't boot (ES40 / HSG60 / fibre channel)
Tru64 Managers,
We have two brand-new ES40 systems, memory channel interconnect, with HSG60
controllers to disks, running Tru64 5.1.
We've installed TruCluster 5.1, and successfully went through clu_create
with no errors.
However, on the first reboot of the single-node cluster we have problems:
Problem 1:
The system can't find 'vmunix' and 'sysconfigtab.34' and prompts for another
kernel to use.
Problem 2:
When specifying 'genvmunix' for the alternate kernel, the cluster is built
(one member), then the quorum disk is mounted, but then we get this error
message:
    "cfs_mountroot_local failed to boot the cluster root fs with error = 22"
There is a subsequent message from CAM saying that the SCSI Bus has been
reset.
For what it's worth, here is our disk arrangement:
- Tru64 5.1 was installed "fresh" to the ES40 internal dsk1.
- Using the HSG partitioning, one 36 GB drive in the MA6000 array was carved
up into 4 disks for Tru64/TruCluster to use:
dsk3 - cluster root
dsk4 - member 1 root
dsk5 - member 2 root
dsk6 - quorum
Thanks for any insights you can provide. I also have logged a call with
Compaq TruCluster support.
Rob Aldridge
realdridge_at_att.com
AT&T Solutions
Received on Mon Apr 23 2001 - 14:03:50 NZST