I never got a definitive resolution to this, but at the suggestion of Compaq ("The
New HP") Support, I set the unit_offset on the HSG connections for the UNIX
systems to 30. This put those units well above the unit numbers used by the NT
systems, and wwidmgr then worked properly.
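For the archives, the HSG80 CLI commands were along these lines, run for each of
the DS20 connections (the connection names below are only examples; "SHOW
CONNECTIONS" lists the real ones):

    SHOW CONNECTIONS
    SET !NEWCON05 UNIT_OFFSET=30
    SET !NEWCON06 UNIT_OFFSET=30

My understanding is that a connection with an offset only sees units numbered at
or above that offset, so the UNIX units also have to be numbered accordingly
(D30 and up in this case).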
Support thinks the zoning on the fibre switches, or the host access settings on
the HSGs, was not configured properly for the NT systems in the first place,
because some of the WWIDs I was seeing with "wwidmgr -show wwid" belonged to
units that should only have been accessible to the NT systems. This was strange,
since UNIX did not see those units once I booted from the local UNIX disk to
build the cluster.
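If anyone wants to check for the same thing, the access settings can be verified
from the HSG80 CLI before suspecting the switch zoning. Unit and connection names
here are just examples:

    SHOW CONNECTIONS
    SHOW UNITS FULL
    SET D31 DISABLE_ACCESS_PATH=ALL
    SET D31 ENABLE_ACCESS_PATH=(!NEWCON05,!NEWCON06)

"SHOW UNITS FULL" lists, for each unit, which connections have access; anything
the NT hosts should not see can be locked down with the DISABLE/ENABLE pair as
shown.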
But it got me over the hump and the cluster is now built.
Original Question:
-----------------------------------------------------
Hello Managers,
While in the process of building a 5.1A cluster, I've encountered an odd
problem which appears to be related to wwidmgr.
Some background:
The cluster will eventually consist of two DS20s with two KGPSAs each, SRM
firmware 6.1 (the HBAs are in the two lowest PCI slots), and two HSG80s (ACS
firmware 8.6) in multibus failover mode behind two Compaq fabric switches.
Each HSG is connected to only one switch, via HSG port 1 (port 2 is not
connected to anything), and each HBA in the DS20s connects to an individual
switch. The connections on the HSG for the DS20 HBAs are set to the Tru64 UNIX
operating system type. The SAN is currently in use by some Windows NT systems,
so rebooting the SAN components is not an option.
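For reference, the host type on a connection is set at the HSG80 CLI with
something like the following, where the connection name is just an example:

    SET !NEWCON05 OPERATING_SYSTEM=TRU64_UNIX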
For the present, only one DS20 is connected, as the other is still a
non-cluster production system.
The switches are zoned: one zone is for the four Windows NT systems and includes
the HSG and an MDR; the other is the UNIX zone, which for now contains only the
one DS20, the HSG, and the MDR.
I've created 9 units on the HSG, set the identifiers to match the unit
numbers, and set the enable_access to allow only the two connections to the
DS20. Everything looks good at this point.
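For reference, the HSG80 commands involved are of this general form for each
unit (unit number, identifier, container, and connection names are examples
only):

    ADD UNIT D31 DISK10100
    SET D31 IDENTIFIER=31
    SET D31 DISABLE_ACCESS_PATH=ALL
    SET D31 ENABLE_ACCESS_PATH=(!NEWCON05,!NEWCON06)

The identifier is what should show up as the UDID on the UNIX side.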
On the DS20, I used wwidmgr to set each adapter's topology to fabric. A
"wwidmgr -show adapter" confirms the topology is set to fabric.
Here's the quirky part...
When I use wwidmgr, I get messages saying "pga0 not ready" and "pgb0 not
ready" (twice each), even though I've confirmed the LEDs, cables, etc. are good.
When I do a "wwidmgr -show wwid" I see some odd WWID numbers which do not
directly relate to the WWID indicated by a "show unit" on the HSG. Also,
the UDID only shows up for two of the units, the others all show 0.
I have tried "wwidmgr -clear all" with the same results.
I went ahead and loaded 5.1A on the internal disk to begin the cluster load
anyway, hoping things might clear up.
"hwmgr -view devices" showed me all the HSG units including the UDID. I
was able to disklabel them all. I went ahead and built the cluster and
everything worked fine as far as creating the cluster system disk, the
quorum disk, and the member boot disk.
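Nothing unusual at this stage; the checks and labeling were along these lines
(the device name is just an example):

    # hwmgr -view devices
    # disklabel -rw dsk10 HSG80

The identifier/UDID shows up in the hwmgr output for units that have one set.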
However, after the cluster install procedure indicated it had set the
console variables (bootdef_dev, etc.) and rebooted, the system stopped at
the SRM prompt.
The SRM variable bootdef_dev had been cleared (empty). I did a "wwidmgr
-show wwid", picked the WWID that most closely matched the WWID of the
unit I wanted as my member boot disk, and used "wwidmgr -quickset
-item # -unit #" to set it. A "show dev" then showed me two dga devices and
two dgb devices with long strings of numbers after them. I used "set
bootdef_dev" to set the boot devices to the dga and dgb devices.
When I went to boot, I received the usual message about doing an init,
which I did.
However, after the init, when I did a "show bootdef_dev", it was again
empty, and a "show dev" showed no dga or dgb devices.
I have repeatedly used wwidmgr to set and clear the WWIDs, and have set and
reset the boot device. Each time I do an init, I lose all the settings.
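One more data point for anyone diagnosing this: as I understand it, wwidmgr
stores its settings in the non-volatile console variables wwid0 through wwid3
and N1 through N4, which are supposed to survive an init. Checking them before
and after the init should show whether the settings are really being wiped:

    P00>>> show wwid*
    P00>>> show n*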
I have a call open to Compaq, but they are also baffled.
Has anyone experienced anything like this, or does anyone have any suggestions?
Thanks.
John