(SUMMARY) Trouble with Dev.Sp.Files on V5.1 from ^IT_MPG_UNIX_ADMIN on 2001-03-03 (tru64-unix-managers)

From: ^IT_MPG_UNIX_ADMIN <^IT__MPG__UNIX__ADMIN_at_ccmail.allegiance.net>
Date: Fri, 02 Mar 2001 15:01:47 -0600

     Thanks to -

     Dr. Tom Blinn
     Colin Walters
     Alan
     Irving Waldo

     Responses

     1.

     What happened is exactly what is supposed to happen. The HSZ70 tells
     the operating system some set of "WWID" values for the logical units
     on the controller. If you replace the HSZ70 with a different unit,
     you have to configure the new unit to export exactly the same WWID
     values for the equivalent logical units, or else from the perspective
     of the UNIX operating system, it's just as if the old unit was just
     off-line and the new unit was totally unknown. In fact, it sure
     sounds like as far as UNIX is concerned, the old unit is still active
     (do you have any mounted file systems or open database files or
     whatever on those units?).

     I'm sure this is all documented somewhere, probably in the most up to
     date HSZ70 hardware manuals, since it's not really a UNIX issue, it's
     an HSZ configuration issue. There MIGHT be information on it in the
     TruCluster documentation, I'm not sure.

     And yes, what happened is exactly what is supposed to happen; if these
     were direct attached disks instead of being behind a SCSI attached
     RAID controller, UNIX would have seen that the WWIDs in the disks
     themselves (if they are new enough to have WWIDs) haven't changed, and
     it would've preserved the "dsk" names. That works with direct
     attached disks even if you replace the SCSI controller with a
     different model, or move it to a different I/O slot. But with the
     RAID attached disks, the thing that UNIX sees is a software construct
     provided by the RAID controller, and you have to tell the RAID
     controller to export the same values for the logical unit WWIDs that
     were exported by the old unit.

     Tom Blinn
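
     The naming behaviour described above can be illustrated with a toy
     model. This is only a sketch under stated assumptions: the flat
     file, the function name assign_name, and the "dskN" numbering rule
     are all hypothetical stand-ins, not how Tru64 actually stores or
     allocates device names. It only shows why an unchanged WWID keeps
     its name while an unknown WWID gets a new one.

```shell
#!/bin/sh
# Toy model only: a flat file stands in for the OS's device database.
# Each line is "<wwid> <dsk-name>". This is NOT how Tru64 stores it.

# assign_name DBFILE WWID
# Prints the dsk name for WWID, allocating the next free number if
# the WWID has never been seen before.
assign_name() {
    db=$1
    wwid=$2
    [ -f "$db" ] || : > "$db"

    # Known WWID: reuse the name already on record.
    name=$(awk -v w="$wwid" '$1 == w { print $2; exit }' "$db")
    if [ -n "$name" ]; then
        echo "$name"
        return 0
    fi

    # Unknown WWID: old entries stay on record, so numbering moves on.
    count=$(awk 'END { print NR }' "$db")
    name="dsk$((count + 1))"
    echo "$wwid $name" >> "$db"
    echo "$name"
}
```

     In this model, asking twice for the same WWID returns the same
     name, while a replacement unit that exports different WWIDs is
     given fresh names and the old entries remain on record. That
     mirrors dsk1 reappearing as dsk18 in the original report.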


     2.

     Alay,

     Use hwmgr -show scsi to determine the SCSI DEVICEID (did) for the
     devices, and then use the redirect option:

     # hwmgr -redirect scsi -src <source_did> -dest <destination_did>

     For example:

     # hwmgr -redirect scsi -src 18 -dest 1

     Note that if the output from -show scsi indicates that there are no
     valid paths for either disk, you might have to use -delete scsi to
     remove the pathless disk. You can then use dsfmgr.

     Regards,

     Colin

     Colin Walters
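
     Since a redirect is easy to get backwards, it may help to build
     the command and read it before running it. The helper below is a
     hypothetical sketch (build_redirect is not part of hwmgr); it only
     prints the command from the response above and assumes the device
     IDs reported by "hwmgr -show scsi" are plain numbers.

```shell
#!/bin/sh
# build_redirect SRC_DID DEST_DID
# Prints (does not run) the hwmgr redirect command for two SCSI
# device IDs, refusing anything that is not a plain number.
# ASSUMPTION: device IDs are plain decimal numbers.
build_redirect() {
    for did in "$1" "$2"; do
        case $did in
            ''|*[!0-9]*)
                echo "build_redirect: bad device id '$did'" >&2
                return 1
                ;;
        esac
    done
    if [ "$1" = "$2" ]; then
        echo "build_redirect: source and destination are the same" >&2
        return 1
    fi
    echo "hwmgr -redirect scsi -src $1 -dest $2"
}
```

     For the example above, "build_redirect 18 1" prints the same
     command line, which can then be checked and run by hand as root.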


     3.

     For parallel SCSI devices, V5 constructs a unique ID for the device by
     using the serial number from one of the Inquiry mode pages. Since an
     HSZ70 presents multiple logical units, it probably uses its own serial
     number and something unique for each LUN as the serial number for the
     LUN. Odds are, when you changed the controller, the serial number
     changed, which changed each device. With the new serial number, the
     operating system had no choice but to treat them as new devices.

     I have a recollection of some internal commentary on this subject and
     it may have made it into the "Best Practices" document. I don't have a
     clue where to look for this though.

     Unrelated to this, I have to point out that, as far as I remember,
     Storage doesn't support running pairs of redundant controllers except
     as a redundant pair.


     Alan


     What did I do-


     I realised that the problem I was facing was caused by the change of
     the HSZ70 disk controller. The serial numbers changed and the OS gave
     the controller a new HWID, and this started all the problems. In the
     end I did the following:

     (Remember dsk1 was my original device and dsk18 was what the system
     was calling it now)

     #dsfmgr -Z rm_cluster_hwid <dsk1's hwidno> 0
     #dsfmgr -Z rm_cluster_hwid <dsk1's hwidno> 0

     Now, if you cannot find the HWID in the "hwmgr -show devices" output,
     look in /etc/dfsc.dat; the value in the fourth column is the HWID
     assigned to your dskxxx.
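
     The lookup in /etc/dfsc.dat can be sketched as below. The only
     fact taken from this post is that the fourth column holds the
     HWID; the rest is assumed. In particular, hwid_for is a
     hypothetical helper, and it assumes whitespace-separated records
     in which the dsk name appears as a field of its own, so check it
     against the real file before trusting the result.

```shell
#!/bin/sh
# hwid_for FILE DSKNAME
# Prints the fourth whitespace-separated field of every line in FILE
# that has DSKNAME as one of its fields.
# ASSUMPTION: the record layout is not documented here; only "fourth
# column is the HWID" comes from the post above.
hwid_for() {
    awk -v name="$2" '
        {
            for (i = 1; i <= NF; i++) {
                if ($i == name) { print $4; break }
            }
        }
    ' "$1"
}
```

     A call would look like "hwid_for /etc/dfsc.dat dsk1". Matching on
     whole fields keeps dsk1 from also matching dsk18.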

     Then a simple
     #dsfmgr -m dsk18 dsk1


     A reboot to ensure everything was okay !!!

     Once again, Thanks to all who helped!!!

     Tks & Rgds,

     Alay Shah.








     ______________________________ Forward Header __________________________________
     Subject: Trouble with Device Special Files on V5.1
     Author: ^IT_MPG_UNIX_ADMIN at MPG001
     Date: 3/2/01 1:17 PM


     We just replaced a disk controller (an HSZ70). This is a cluster of 2
     HSZ70s (not configured for dual redundancy) and 2 4100s running V5.1
     Patch Kit 2. Before the replacement, the units on this disk
     controller were dsk1...dsk7; afterwards, the system built new device
     special file names for them, dsk18...dsk24. Why is this? Is this
     supposed to happen when you replace a disk controller, even if the
     bus stays the same? To fix this I tried "dsfmgr -m dsk18 dsk1", but I
     get a "dsk1a is active" message.

     Help!!

     Tks & Rgds,

     Alay.
     Allegiance Healthcare Corporation
     Email - shahal_at_allegiance.net
Received on Fri Mar 02 2001 - 21:05:32 NZDT
