HSZ70 Trans Failover Config Q's

From: Tom Webster <webster_at_ssdpdc.lgb.cal.boeing.com>
Date: Thu, 04 Sep 2003 17:15:41 -0700

Hi,

I've been seeing some weird behavior and just wanted to make sure I'm
not missing something simple.....

What I'm trying to do is rebuild my TruCluster test cluster from
scratch. We were using an RA3000 array, but we needed more space, so
we are switching to a pair of HSZ70s in a single BA370. The BA370 was
originally equipped with HSZ70s and later upgraded to HSG80s. We have
upgraded to newer shelves with the HSG80s -- freeing the BA370 to use
with the old HSZ70s. Clear as mud, right?

Servers:
   AS4000 (6.3 FW)
      KZPSA-BB x 2
      KZPBA-CB x 1
      DE500 x 1
      MC 1.5 x 1
   AS4000 (6.3 FW)
      KZPSA-BB x 2
      KZPBA-CB x 1
      DE500 x 1
      MC 1.5 x 1

Storage:
   BA370
   HSZ70 (V77Z FW) x 2
   128MB HSZ Cache Module x 2
   Dual Cache Battery
   23 x 36GB Drives

One would think that this should be pretty straightforward. When we
plugged the HSZ70s in, they came up with their old configuration.
First, we cleared the units, sets, and disks and then added the new
disks in and built the back end storage up the way we wanted. This
seemed to work fine -- until we hooked it up to a server and tried to
work with it. The devices were visible from the SRM console, but when
we tried to boot from the Tru64 install CD (5.1 and 5.1B), we
started getting "device not ready" errors.

We have done a lot of debugging, which has included:

* Connecting to the array using both the KZPSA and KZPBA controllers.
* Connecting one or both servers to the bus.
* Replacing ALL of the SCSI cables.
* Replacing the controllers with another set off of the shelf.
* Reinitializing the controllers to defaults.
* Reinitializing and running with the 7.3 firmware for a time.

What we are seeing:

With one controller:
   SCSI bus is terminated on one side of the tri-link.
   The other side is connected to the AS4000's KZPBA-CB.
   Host mode is set to "A"
   Busses are 1,2,3,4

   Everything appears fine. The AS4000 can see all volumes and can
   install Tru64 5.1B without problems.

With two controllers:
   SCSI bus is as follows:
      Bottom controller: tri-link is terminated on one side, other
      side is connected by BN37-0E cable to the top controller.
      Top controller: tri-link is connected to the AS4000, other side
      is connected by BN37-0E cable to the bottom controller.
   Second controller and cache added using FRUTIL
   Failover set to transparent mode (FAIL COPY=THIS)
   Host mode is set to "A"
   Busses are 1,2,3,4
   Preferred busses are (1,3) (top) and (2,4) (bottom).

   Device access is messed up. Devices on the bottom controller can't
   be accessed. Devices are visible from the SRM console, but show
   hard errors at the OS level. Failing all drives to the top
   controller and disconnecting the bottom controller's tri-link seems
   to resolve the problem. It's like the two controllers on the bus
   stomp on each other.

What the heck am I doing wrong? The BN37-0E is the recommended
interconnect cable, and we have tried a couple -- along with two
different sets of VHDCI tri-links (not to mention two sets of
controllers).

We have also tried contiguous preferred IDs (1,2) and (3,4) with no
better luck.
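
For reference, the two preferred-ID layouts mentioned above can be
sanity-checked with a small Python sketch. It only verifies the
bookkeeping: that the two controllers' preferred IDs do not overlap,
that together they cover exactly the IDs listed (1,2,3,4), and that none
of them collide with a host adapter ID. The function name is
hypothetical, and the host adapter IDs (6 and 7) are an assumption, since
the post does not state them. It says nothing about cabling or
termination, so a clean result does not rule out a bus-level problem.

```python
def check_preferred_ids(controller_ids, top, bottom, host_ids):
    """Return a list of problems with a preferred-ID split (empty if none)."""
    controller_ids, top, bottom, host_ids = map(
        set, (controller_ids, top, bottom, host_ids)
    )
    problems = []
    # An ID must not be preferred by both controllers.
    if top & bottom:
        problems.append(f"preferred by both controllers: {sorted(top & bottom)}")
    # Together the two preferred sets should cover exactly the listed IDs.
    if (top | bottom) != controller_ids:
        odd = sorted((top | bottom) ^ controller_ids)
        problems.append(f"not split exactly across controllers: {odd}")
    # Controller IDs must not collide with host adapter IDs on the same bus.
    if controller_ids & host_ids:
        problems.append(f"collides with host adapters: {sorted(controller_ids & host_ids)}")
    return problems


ASSUMED_HOST_IDS = {6, 7}  # assumption: not given in the post

# The two layouts tried: (1,3)/(2,4) and (1,2)/(3,4).
for top, bottom in [((1, 3), (2, 4)), ((1, 2), (3, 4))]:
    result = check_preferred_ids((1, 2, 3, 4), top, bottom, ASSUMED_HOST_IDS)
    print(top, bottom, result or "OK")
```

Under the assumed host IDs, both layouts pass this check, which is
consistent with the ID split itself not being the cause.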

At this point, I haven't tried multi-bus failover, but I'd have a hard
time trusting it if I can't get transparent working.

It's been a couple years since I've had to configure HSZ*0's but I
remember the 70s as being fairly painless -- not having the license
code issues of the 40s and 50s.

Any thoughts?

Tom
-- 
+-----------------------------------+---------------------------------+
| Tom Webster                       |  "Funny, I've never seen it     |
| SysAdmin MDA-SSD ISS-IS-HB-S&O    |   do THAT before...."           |
| webster_at_ssdpdc.lgb.cal.boeing.com |   - Any user support person     |
+-----------------------------------+---------------------------------+
|      Unless clearly stated otherwise, all opinions are my own.      |
+---------------------------------------------------------------------+
Received on Fri Sep 05 2003 - 00:20:13 NZST
