Summary: Weird LSM/HSZ70 problem

From: <anthony.miller_at_vf.vodafone.co.uk>
Date: Thu, 20 May 1999 08:39:30 +0100

All...

Regarding my query yesterday about the weird LSM/HSZ situation (original
posting at the end), I received replies from:

rye_at_jwfc.acom.mil
Andrew Busch [a.busch_at_qut.edu.au]
Alan Davis [Davis_at_Tessco.Com]
alan_at_nabeth.cxo.dec.com


Many thanks to everybody who replied.

The suggestion about checking writeback cache on the units was a good one
(it has caught me out in the past), but it wasn't the problem in this case.
All units were set to WRITEBACK_CACHE.

I had deleted all units, stripe sets and all disks before swapping them (and
running config).

The reason we are using LSM mirroring rather than hardware mirroring is so
that we can mirror across controllers (rather than mirror on the same
controller) to gain that bit of extra functionality/resilience. Also, my
stripe sets are created across different channels.
The information from Alan Davis and alan_at_nabeth.cxo.dec.com seems to (I
hope) hit the nail on the head. I did swap the disks 'on line' as it were.
We are going to go for a reboot. It's a production machine so I have to
schedule it. If the reboot resolves the issue I will post an updated
summary.

I hadn't actually checked the error log (obvious that I should have, I
know). I'm doing this now.


---alan_at_nabeth.cxo.dec.com-----------------------------------------------
Looking around I notice that the symptom was observed at least
once before internally, but no cause or solution was ever
given. My own experience with LSM has been that it seems to
keep more data structures internally than it has a right to and
it is hard to get it to forget what it used to know. For example,
if you use a device name with LSM and a smaller device, it
may keep around that old disk partition table. When you try
to reuse the name with a larger device it may still be using the
smaller size and get the write error while doing its own boundary
checking. A reboot fixes it because the information is in
memory. I think there is also a command to completely delete
the disk from the configuration, which clears the old information.
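
The stale-size behaviour described above can be illustrated with a toy
model. This is only a sketch of the general failure mode, not LSM's code or
data structures: the class NameSizeCache, the name "diskA" and the block
counts are all invented for the illustration.

```python
class NameSizeCache:
    """Toy in-memory cache: device name -> size in blocks (hypothetical)."""

    def __init__(self):
        self._size = {}

    def define(self, name, size_blocks):
        # Modelled flaw: an existing entry is kept, so a reused name
        # retains the size of the device that used to carry it.
        self._size.setdefault(name, size_blocks)

    def forget(self, name):
        # Stands in for completely deleting the disk, or for a reboot.
        self._size.pop(name, None)

    def write(self, name, block):
        # Boundary check against the cached (possibly stale) size.
        if block >= self._size[name]:
            raise OSError(f"{name}: write past cached size at block {block}")
        return True


cache = NameSizeCache()
cache.define("diskA", 1000)      # old, smaller device
cache.define("diskA", 2000)      # same name reused for a larger device

try:
    cache.write("diskA", 1500)   # valid on the new device, rejected anyway
    stale_write_failed = False
except OSError:
    stale_write_failed = True

cache.forget("diskA")            # clear the old information
cache.define("diskA", 2000)
fresh_write_ok = cache.write("diskA", 1500)
print(stale_write_failed, fresh_write_ok)
```

In the toy model the write only succeeds once the old entry has been
dropped, which is the effect a reboot would have on information held only
in memory.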

Are there any errors in the error log? The speed problem could
just be the system being busy or it could be a sign of something
wrong on that particular SCSI bus.
----------------------------------------------------------------------------


---Davis [Davis_at_Tessco.Com]----------------------------------------------
Have you rebooted? There is/was a problem last year that had similar,
although not identical symptoms. It was very hard to reproduce and we never
did get a definitive answer from engineering on it. I gave a vfs locking
patch to one customer who was trying to reproduce it, but I didn't hear
back from them, so I'm not sure if it worked or if they just didn't hit the
conditions that triggered the problem again before I left Compaq. To give
you an idea of how seldom this occurred, I know of 4 cases in all of 1998;
none were reproducible either by the customer or in the support lab.

 If the problem goes away after reboot, you may find that it doesn't
re-occur unless you re-organize the disks again while they are all online.
We think there was a race condition in an IOCTL call that would make one
partition, usually G, read-only. This caused the diskadd to fail. Accessing
the G partition for write outside of LSM would also fail, although the rest
of the disk was accessible. A reboot was the only way we found to clear it.

 This may not be your case, but it's an idea.

Alan Davis
----------------------------------------------------------------------------


---Andrew Busch [a.busch_at_qut.edu.au]-------------------------------------
The only thing I can think of is this - did you delete all the drives from
the HSZ70 before using the "run config" command (on the HSZ70's, that is)
to redo the configuration? It took me a while to work out that the config
utility only scans the cabinet positions that HAVEN'T already been given a
name.. my advice - delete EVERYTHING from the HSZ70 and start again.

Another thing - is there a reason you're running LSM on top of the
hardware raid supplied by the HSZ70's? It seems a little strange to add
an extra layer when the hardware will do everything you want and more..
One more point - if you're creating hardware sets (or software for that
matter), you're better off using disks on different channels in the
cabinet, ie in different columns of the array, to maximise performance.
This way you're not flooding a single channel with requests to the same
disk set. The HSZ70 has 6 internal scsi busses (across the cabinet), with
4 devices on each (up the cabinet).
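
The channel advice above can be sketched as a small layout calculation. It
assumes the 6 x 4 grid just described, with positions written as
hypothetical (channel, slot) pairs rather than real HSZ70 device names, and
a stripe width of 3 chosen only as an example.

```python
def stripe_sets(channels=6, slots=4, width=3):
    """Group (channel, slot) positions into sets with distinct channels."""
    if channels % width != 0:
        raise ValueError("width must divide the number of channels")
    sets = []
    for slot in range(slots):
        # Walk across one shelf, taking `width` neighbouring channels
        # per set, so no two members of a set share a channel.
        for start in range(0, channels, width):
            sets.append([(ch, slot) for ch in range(start, start + width)])
    return sets


layout = stripe_sets()
for members in layout:
    print(members)
```

With the defaults this gives 8 sets, two per shelf, and every member of a
set sits on a different channel.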

cheers,
Andrew
----------------------------------------------------------------------------


---rye_at_jwfc.acom.mil----------------------------------------------------
Hello..

From the speed description I would check to see if writeback cache is
enabled where possible. The default state of new sets is disabled.

Don
------------------------------------------------------------------------



+-----------------------------------------------------------------+
| TONY MILLER - Systems Projects - VODAFONE LTD, Derby House, |
| Newbury Business Park, Newbury, Berkshire. |
+-------------+---------------------------------------------------+
| Phone | 01635-507687(local) |
| Work email | ANTHONY.MILLER_at_VF.VODAFONE.CO.UK |
| X.400 | G=ANTHONY; S=MILLER; C=GB; A=GOLD 400; P=VODAFONE |
| FAX | 01635-583856 |
+-------------+---------------------------------------------------+

Quotation: "Is the glass half full or half empty?? ...
               Well, drink it anyhow, that's what I say".
  Pete Goss.


Disclaimer: Opinions expressed in this mail are my own and do not
reflect the company view unless explicitly stated. The information
is provided on an 'as is' basis and no responsibility is accepted for
any system damage howsoever caused.

======Original Posting======================================================

All...

Here is a weird problem - Digital UNIX V4.0D on an AS8400 with 5 CPUs.

1 x ESA10000 cabinet. Each zone has a pair of dual-redundant HSZ70s. It
WAS populated with 9 GB disks (the exact configuration is not important) and
all was working OK.

I have replaced the 9's with 18's. The top zone is configured (stripe sets
etc) as an identical copy of the lower zone. The LSM volumes will be
mirrored between the two zones.


Now...

I have had no problems with the top 3 (of 4) shelves in each zone. The
stripe sets (2 x 3 disk stripe sets per shelf) are created, the units are
created, the device special files are created, the disk labels are written,
the disks are added to LSM and the volumes are created and mirror sets set
up.

The only issue was that voldiskadd took a long time (say 3-4 minutes but I
didn't time it exactly) to add the disks to the disk group. I put this down
to the system being very busy (load average around 5-7 all the time).


However...

The bottom shelf in each zone is causing me problems. The disks initialised
OK but took, say, 30 seconds each to complete, which was odd because
initialisation has always responded immediately in the past. The 2 stripe
sets were created OK, initialised OK, and the units were added OK.

I did a 'scu scan edt bus 17' followed by a 'scu show edt bus 17 | wc -l'.
The 'scu show...' showed 2 more than the same command before the 'scu
scan...' - all just as expected.
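
The before/after line-count check above can be taken one step further by
listing which lines are new. This is a minimal sketch that assumes the two
listings were saved as text beforehand; the sample strings are placeholders
and do not imitate real scu output, and new_lines is a name made up here.

```python
from collections import Counter


def new_lines(before: str, after: str) -> list[str]:
    """Return lines that appear in `after` more often than in `before`."""
    seen = Counter(before.splitlines())
    added = []
    for line in after.splitlines():
        if seen[line] > 0:
            seen[line] -= 1      # line already accounted for in `before`
        else:
            added.append(line)   # line only present after the scan
    return added


# Placeholder text only, standing in for two saved listings.
before = "header\nentry one\nentry two\n"
after = "header\nentry one\nentry two\nentry three\nentry four\n"

added = new_lines(before, after)
print(len(added))
for line in added:
    print(line)
```

Seeing the added lines themselves, rather than only the count, confirms
that the two extra entries are the units just created.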

I created the device special files ok. A 'disklabel -z' followed by a
'disklabel -wr rz??? HSZ70' created the disk label. A 'disklabel -r rz???'
showed the c partition with the correct number of sectors.

Now, the voldiskadd takes say 5 minutes then fails with the following
message:

  Initialization of disk device rzb136 failed.
  Error: voldisk: Device rzb136: define failed:
        Disk write failure


I originally put this down to some problem with one of the disks. However
this seems not to be the case. I got the same problem with the 2nd stripe
set (top zone bottom shelf).

I then got exactly the same problem with BOTH stripe sets in the lower zone.
This is weird as the lower zone is connected to its own pair of HSZ's.


So...

I deleted the device special files for the (4) offending disks. Did a 'scu
scan edt bus 17' & '...bus 18' to blow away any references to these devices.
Deleted the units, stripe sets and disks in question from the two zones.

Then...did the whole thing again... with exactly the same problem.

I did a 'newfs' of the 'a' partition (131072 blocks). This does complete but
is very slow (say 2 minutes). It does not respond with the usual zippy speed
you expect. Same for the 'b' partition of 262144 blocks.

I repeated the undo procedure documented above after restarting both HSZs
of each redundant pair. Same problem.

I can write to these disks but it seems VERY slow. Any ideas what to do
next? Has anybody seen this before?


Thanks - Tony


PS... By the way, if I don't know what jumbo patch (1, 2, 3 etc) has been
installed, what's the easy way to find out?
Received on Thu May 20 1999 - 07:42:32 NZST
