Summary: Adding a TL896 in a cluster environment

From: <anthony.miller_at_vf.vodafone.co.uk>
Date: Thu, 10 Jun 1999 15:51:41 +0100

I posted a question the other day (original posting at the end) re adding
extra controllers and a TL896 in a cluster environment.

I received two replies. I have included both in full:
alan_at_nabeth.cxo.dec.com
Arnold Sutter [Arnold.Sutter_at_udt.ch]


The eventual solution I employed was somewhat different from my posting.
Slightly unsupported but it worked a dream.

On node P, I did a 'MAKEDEV tznn' 4 times (where 'nn' was the device special
file of the last existing tape drive). This ensured that node P had 6 rmt
devices (albeit with 5 of them pointing to the same physical device) and
node Z also had 6 devices (all different). This got me over the consistent
naming issue.
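
The numbering argument can be sketched in a few lines of Python. This is
only an illustration, assuming (as described in this thread) that each newly
created tape device simply takes the next free rmt number; the function
names are made up for the sketch and are not part of any Digital UNIX tool.

```python
def placeholders_needed(existing_per_node):
    """Dummy tape special files each node needs so all nodes hold the same count."""
    target = max(existing_per_node.values())
    return {node: target - count for node, count in existing_per_node.items()}


def next_units(existing_count, new_drives):
    """rmt unit numbers the new drives receive, assuming the next free number is used."""
    return list(range(existing_count, existing_count + new_drives))


existing = {"P": 2, "Z": 6}           # drive counts from the original posting
pad = placeholders_needed(existing)   # P needs 4 placeholders, Z needs none
for node, count in existing.items():
    units = next_units(count + pad[node], 6)
    print(f"node {node}: pad with {pad[node]}, new drives become rmt{units[0]}..rmt{units[-1]}")
```

With the padding in place, both nodes hand out units 6 to 11 for the six
library drives.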

I manually edited the '/sys/conf/node_name' files with the details of my new
controllers, in advance of installing them. I then rebuilt the kernel online
and copied it up to / just prior to the shutdown.

I knew that the SCSI IDs of the tape drives were all 0 (all 6 drives on
separate controllers), and, knowing the device names of each new controller,
I created the 6 new tz device special files (./MAKEDEV tznnn) on both
systems - again, in advance of the reboot. This gave me *mt6* thru *mt11*
in a consistent manner on both systems and pointing to the 6 new devices.

After the shutdown, the controllers were added to the predefined slots.
After the reboot, a 'file /dev/rmt*h' showed that all tape devices were
visible. A 'scu show edt | grep -i sequential' confirmed this.

The only thing which caught me out was that, at the >>> prompt, we could see
6 changers! This rather knocked us (me and 2 Compaq engineers) off course
for a few hours whilst we went down the blind alley of multiple drives and
multiple luns etc. Once UNIX was booted, however, scu showed only one Changer
visible. Once I had created the device special file for it, it worked a
treat. (It was suggested after the event that the reason the console sees 6
changers is a ghosting effect because each [of the 6 drives] also sees the
[one] Changer. If we had only 3 drives, we would have seen 3 changers at
the console).

I used the CLI version of MRU to load tapes from the robot into all 6
drives, did a tar of /tmp to each drive then confirmed I could read the
archive, then unloaded all drives back to the silo. I repeated this on both
nodes, one at a time (i.e., one node in single-user mode, the other
halted).

I created the tape service with the 6 devices and the Changer plus some
storage with no problems.

Now all we need to do is get the Legato problems sorted out - but that's a
different story.
Thanks everyone.


regards - Tony





===============================Alan nabeth================================
What software are you planning to use on a day to day basis?

It is worth noting that MRU never tested this configuration, in
large part because we knew the firmware of the TL896 that we had
at the time (the whole TL820 family actually) wasn't cluster
friendly. There may be a newer version that is, but we only
casually tested it and never in a multi-host environment. This
was around the time that the engineering team was laid off (me)
and development of MRU sent off-shore (India). The new group
doesn't have a TL820 family library to test this.

Networker may support the configuration with the right firmware,
but that group also got laid off and now Networker is all Legato's
responsibility.

The likely risk on the firmware side is that a device reset from
one of the hosts will leave the library in an unpredictable state.
If you're using MRU, this won't be too much trouble (aside from
the command failure) since the CLI can reverse nearly any condition
the library can get itself into.

You'll also want to have tape drive firmware that gracefully
deals with resets.

As for tape drive naming... You don't need to use mknod to create
the special files; /dev/MAKEDEV can do that. If you need them done
in a particular order, simply give the device names in the desired
order. MAKEDEV doesn't reorder the names. I believe current
versions only create the special files on boot if the devices don't
already exist, so you're probably ok for now.


=======================arnold sutter===============================================
Hi Tony,

# cd /dev
# ./MAKEDEV tz<xy>
where <xy> is SCSI-Bus Number * 8 + Target ID

This will get you the /dev/nrmt<z>h files, where <z> is usually
the next available mt number. So if you already have /dev/nrmt0h
and /dev/nrmt1h, it will create /dev/nrmt2h.
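
As a rough illustration of the rule above, the tz unit number can be worked
out like this. The bus numbers 17 to 22 below are purely hypothetical (the
thread does not say which buses the new controllers ended up on), and the
0-7 target range is an assumption based on the multiplier of 8.

```python
def tz_unit(bus, target):
    """Unit number for './MAKEDEV tz<xy>': SCSI bus number * 8 + target ID."""
    if not 0 <= target <= 7:
        # Assumption: the "* 8" rule only covers target IDs 0-7
        raise ValueError("target ID expected in the range 0-7")
    return bus * 8 + target


# Hypothetical layout: six new controllers on buses 17-22, each drive at target 0
for bus in range(17, 23):
    print(f"./MAKEDEV tz{tz_unit(bus, 0)}")
```

Giving the names to MAKEDEV in the desired order should then yield the mt
numbers in that order.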

You may see the tz<xx> entries during startup of the cluster
members once everything is hooked up.

Good luck & Regards,

Arnold Sutter, UDT


=======================original posting=============================================
Sorry, this is a long one.

Here is what I want to do
-------------------------
Tonight we are adding a TL896 to an existing 2 node cluster (8400's - DUNIX
V4.0D + patch 3) - I'll refer to them as nodes P & Z. The silo will
ultimately belong to a tape service - but that's not until after a long
night! Each 8400 has 6 KZPSAs to be added, with appropriate Y cables etc.
to connect the whole thing up. Each KZPSA is to go to its own tape drive on
the silo.

The controllers will be added to both nodes 'at the end' of the existing
controllers to avoid device name changes etc.

What I need to understand is tape special file names, how to create them in
a consistent manner and how to test the silo. What I want to do tonight is
to show that the tapes are visible and work on both sides and are consistent
on both sides. The service creation will be done tomorrow during the day.

If you have any comments, please pass them back.

best regards - Tony

Quotation: "Is the glass half full or half empty?? ...
               Well, drink it anyhow, that's what I say".
  Pete Goss.

+-----------------------------------------------------------------+
| TONY MILLER - Systems Projects - VODAFONE LTD, Derby House, |
| Newbury Business Park, Newbury, Berkshire. |
+-------------+---------------------------------------------------+
| Phone | 01635-507687(local) |
| Work email | ANTHONY.MILLER_at_VF.VODAFONE.CO.UK |
| FAX | 01635-233517 |
+-------------+---------------------------------------------------+

Disclaimer: Opinions expressed in this mail are my own and do not
reflect the company view unless explicitly stated. The information
is provided on an 'as is' basis and no responsibility is accepted for
any system damage howsoever caused.





Here is the relevant config on both nodes
-----------------------------------------
Node P already has 2 locally attached tape drives:

p>ls -l /dev/rmt*h
crw-rw-rw- 1 root system 9,199682 Sep 15 1998 /dev/rmt0h
crw-rw-rw- 1 root system 9,201730 Jan 28 13:50 /dev/rmt1h

p>file /dev/rmt*h
/dev/rmt0h: character special (9/199682) SCSI #12 TZ88 tape #327 (SCSI ID #3) (SCSI LUN #0) offline
/dev/rmt1h: character special (9/201730) SCSI #12 TZ887 tape #328 (SCSI ID #5) (SCSI LUN #0) loader 81630_bpi

p>scu show edt | grep -i sequential
    Device: TZ88 Bus: 12, Target: 3, Lun: 0, Type: Sequential Access
    Device: TZ887 Bus: 12, Target: 5, Lun: 0, Type: Sequential Access


Node Z already has 6 locally attached tape drives:
z>ls -l /dev/rmt*h
crw-rw-rw- 1 root system 9, 35842 Sep 14 1998 /dev/rmt0h
crw-rw-rw- 1 root system 9, 37890 Sep 14 1998 /dev/rmt1h
crw-rw-rw- 1 root system 9,199682 Feb 27 14:14 /dev/rmt2h
crw-rw-rw- 1 root system 9,201730 Sep 14 1998 /dev/rmt3h
crw-rw-rw- 1 root system 9,265218 Sep 14 1998 /dev/rmt4h
crw-rw-rw- 1 root system 9,267266 Feb 27 14:08 /dev/rmt5h

z>file /dev/rmt*h
/dev/rmt0h: character special (9/35842) SCSI #2 TZ88 tape #6 (SCSI ID #3) (SCSI LUN #0) offline
/dev/rmt1h: character special (9/37890) SCSI #2 TZ887 tape #7 (SCSI ID #5) (SCSI LUN #0) loader 81630_bpi
/dev/rmt2h: character special (9/199682) SCSI #12 TZ88 tape #330 (SCSI ID #3) (SCSI LUN #0) offline
/dev/rmt3h: character special (9/201730) SCSI #12 TZ887 tape #331 (SCSI ID #5) (SCSI LUN #0) loader 81630_bpi
/dev/rmt4h: character special (9/265218) SCSI #16 TZ88 tape #453 (SCSI ID #3) (SCSI LUN #0) offline
/dev/rmt5h: character special (9/267266) SCSI #16 TZ887 tape #454 (SCSI ID #5) (SCSI LUN #0) loader 81630_bpi

z>scu show edt | grep -i sequential
    Device: TZ88 Bus: 2, Target: 3, Lun: 0, Type: Sequential Access
    Device: TZ887 Bus: 2, Target: 5, Lun: 0, Type: Sequential Access
    Device: TZ88 Bus: 12, Target: 3, Lun: 0, Type: Sequential Access
    Device: TZ887 Bus: 12, Target: 5, Lun: 0, Type: Sequential Access
    Device: TZ88 Bus: 16, Target: 3, Lun: 0, Type: Sequential Access
    Device: TZ887 Bus: 16, Target: 5, Lun: 0, Type: Sequential Access
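
The minor numbers in the 'ls -l' listings can be cross-checked against the
'scu show edt' output by running the minor-number formula from step 3 of the
plan further down in reverse. A small sketch, assuming that formula holds
and that target and LUN values stay below 16 (the helper name is made up):

```python
def decode_minor(minor):
    """Split a tape minor number into (bus, target, lun, offset)."""
    bus, rest = divmod(minor, 16384)
    target, rest = divmod(rest, 1024)
    lun, offset = divmod(rest, 64)
    return bus, target, lun, offset


# Minor numbers of rmt0h..rmt5h on node Z, taken from the listing above
for unit, minor in enumerate((35842, 37890, 199682, 201730, 265218, 267266)):
    bus, target, lun, offset = decode_minor(minor)
    print(f"rmt{unit}h: bus {bus}, target {target}, lun {lun}, offset {offset}")
```

Each entry decodes to the bus and target shown by scu, with an offset of 2
for the rmt*h file.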


For service reasons, node P will be left online whilst node Z is taken down
and has its controllers installed, the silo connected and tested etc. When
we are happy, node P will also be taken down and connected to the silo and
the tape drives tested again. Then both nodes will be brought online.

Node Z will be booted single user from genvmunix after the installation of
the controllers, a new kernel will be built, and the node will then be
rebooted from the new kernel. I expect UNIX will automatically create 6 new
device special files (rmt6* thru rmt11*).


Now, when we do node P and reboot after the kernel rebuild, I'll have 6 new
device special files created but they will be rmt2* thru rmt7* - i.e., not
the same as node Z. This will stop me creating the tape service as the
device special files will NOT be the same on both nodes. To be consistent on
both sides, here is what I plan to do:


Node Z
------
1. Remove the new device special files (nrmt6*/rmt6* thru nrmt11*/rmt11*)
which were automatically created.

2. do a 'scu show edt | grep -i sequential' and note the bus, target and
LUN of each new tape drive.

3. The major number (I assume) is 9. Calculate the minor number for each
drive as follows:
Minor=(16384*bus)+(1024*target)+(64*lun).

4. Create the device special files:
   mknod /dev/rmt6l c 9 minor_number
   mknod /dev/nrmt6l c 9 minor_number+1
   mknod /dev/rmt6h c 9 minor_number+2
   mknod /dev/nrmt6h c 9 minor_number+3
   mknod /dev/rmt6n c 9 minor_number+4
   mknod /dev/nrmt6n c 9 minor_number+5
   mknod /dev/rmt6a c 9 minor_number+6
   mknod /dev/nrmt6a c 9 minor_number+7

5. Do a 'file /dev/rmt*h' to verify the drives are visible etc.

6. Load MRU (media robot utility) to permit me to manipulate the robot to
load tapes into each drive.

7. Test each drive works via a simple 'tar' command.
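
Steps 3 and 4 can be sketched as follows. The special-file names and offsets
are copied from step 4 as written and are not independently verified; only
the +2 offset for rmt*h can be checked against the listings earlier in this
post. The bus number 17 is hypothetical, chosen purely for illustration.

```python
# (name prefix, density suffix, offset), copied from step 4 as written
SPECIAL_FILES = [("rmt", "l", 0), ("nrmt", "l", 1), ("rmt", "h", 2), ("nrmt", "h", 3),
                 ("rmt", "n", 4), ("nrmt", "n", 5), ("rmt", "a", 6), ("nrmt", "a", 7)]


def base_minor(bus, target, lun=0):
    """Minor = (16384 * bus) + (1024 * target) + (64 * lun), as in step 3."""
    return 16384 * bus + 1024 * target + 64 * lun


def mknod_lines(unit, bus, target, lun=0, major=9):
    """mknod commands for one drive; major 9 is the value assumed in step 3."""
    base = base_minor(bus, target, lun)
    return [f"mknod /dev/{prefix}{unit}{suffix} c {major} {base + offset}"
            for prefix, suffix, offset in SPECIAL_FILES]


# Sanity check against the existing listing: bus 12, target 3, rmt*h is 199682
assert base_minor(12, 3) + 2 == 199682

# Hypothetical first new drive: unit 6 on bus 17, target 0, LUN 0
print("\n".join(mknod_lines(6, 17, 0)))
```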


Node P
------
1. Remove the new device special files (nrmt2*/rmt2* thru nrmt7*/rmt7*)
which were automatically created.

2. do a 'scu show edt | grep -i sequential' and note the bus, target and
LUN of each new tape drive (should be the same as for node Z).

3. The major number (I assume) is 9. Calculate the minor number for each
drive as above.

4. Create the device special files as above.

5. Do a 'file /dev/rmt*h' to verify the drives are visible etc.

6. Load MRU (media robot utility) to permit me to manipulate the robot to
load tapes into each drive.

7. Test each drive works via a simple 'tar' command.
Received on Thu Jun 10 1999 - 14:54:29 NZST
