HP OpenVMS Systems: Ask the Wizard
The Question is:

I have a client running OpenVMS 7.1 on a CI cluster with AXP and VAX systems and HSJ40 and HSJ50 controllers. They have implemented Phase II volume shadowing. We found this statement in "DIGITAL StorageWorks HSJ30/40 Array Controller Operating Software, HSOF Version 3.2, Release Notes":

  Fast Shadow Member Eviction

  An MSCP flag is provided to enable rapid shadow member eviction when a device error is detected. OpenVMS can set this flag based on the SYSGEN parameter SHADOW_SYS_DISK. The MSCP flag is called MD.SER. When set, and an I/O encounters a device error, the I/O is returned as failed without further error recovery. OpenVMS can then evict a shadow set member, as appropriate.

But when we look up the SYSGEN parameter SHADOW_SYS_DISK in the "Volume Shadowing for OpenVMS" manual, the only values described are 0 and 1: "All values other than the value of 1 are reserved for use by Digital."

So what was the author of the HSJ30/40 manual describing? Is Fast Shadow Member Eviction available in OpenVMS? And if so, what value should SHADOW_SYS_DISK be set to?

The Answer is:

Certain customer applications have critical response-time requirements. When performing READ I/O operations to a multiple-member shadow set, these applications prefer that minimal error recovery be done for certain I/O operations to certain volumes. The application assumes that if one member of a multiple-member shadow set develops problems, the READ I/O operation can simply be satisfied from another "good" member.

Most volume shadowing customers rely on exactly this recovery behaviour: the READ of one member comes back to the SHDRIVER with an error, the driver then reads another member of the set successfully, and then writes that data to the "bad" member, making it good again. (This is a very simplified description of SHDRIVER error recovery processing.)

To accommodate the critical response-time requirement, HSOF was modified to honor the QIO function modifier IO$M_INHRETRY. Customer applications would then have to modify all pertinent READ I/O operations to include that modifier. To make this available without requiring application modification, the SHDRIVER was also modified: if SHADOW_SYS_DISK bit 15 (ie: %X00008000) is set, the SHDRIVER behaves as though that modifier were set on every READ I/O it receives -- retry is inhibited for every virtual unit, for every READ I/O operation. (A minimal code sketch of such a READ appears below.)

Consider taking this one step further, where the application does not want the SHDRIVER to do ANY error recovery: once the SHDRIVER receives the failed I/O back from the controller, the application wants that member expelled immediately and the READ I/O sent to another member. That was accommodated as well, with another SHADOW_SYS_DISK flag: bit 13 (ie: %X00002000).

It should be noted that if a single-member shadow set contains a "bad" LBN -- "bad" here meaning an LBN that returns SS$_PARITY or SS$_FORCEDERROR for a READ I/O operation -- then once a new member is added to that set, the "bad" LBN(s) will be replicated to the new member. A READ I/O operation to such a "bad" LBN -- with SHADOW_SYS_DISK bit 13 clear -- will NOT expel members: the SHDRIVER will read all of the members, find them all bad, and return an error to the application, as it cannot "repair" the shadow set. (With bit 13 set, this repair and recovery logic is disabled.)
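For readers who do want to set the modifier explicitly per-request rather than system-wide, the following is a minimal, untested C sketch of a one-block logical READ issued with IO$M_INHRETRY. The device name DSA1:, the LBN, and the transfer size are assumptions chosen for illustration; logical I/O of this kind also requires suitable privilege (eg: LOG_IO) or ownership of the mounted volume.

```c
/* Sketch: a logical-block READ with driver retry/recovery inhibited.
   Compile on OpenVMS with DEC C; names DSA1:, LBN 0, and 512 bytes
   are illustrative assumptions, not a supported configuration. */
#include <descrip.h>
#include <iodef.h>
#include <iosbdef.h>
#include <ssdef.h>
#include <starlet.h>
#include <stdio.h>

int main(void)
{
    $DESCRIPTOR(dev, "DSA1:");     /* hypothetical shadow set virtual unit */
    unsigned short chan;
    IOSB iosb;
    static char buf[512];
    unsigned int status;

    status = sys$assign(&dev, &chan, 0, 0);
    if (!(status & 1))
        return status;

    /* IO$M_INHRETRY asks the driver to skip retry/error recovery.
       P1 = buffer, P2 = byte count, P3 = starting LBN (0 here). */
    status = sys$qiow(0, chan, IO$_READLBLK | IO$M_INHRETRY,
                      &iosb, 0, 0,
                      buf, sizeof buf, 0, 0, 0, 0);
    if (status & 1)
        status = iosb.iosb$w_status;   /* final I/O completion status */

    printf("READ completion status = %u\n", status);
    return status;
}
```

With SHADOW_SYS_DISK bit 15 set, the SHDRIVER treats every READ as though IO$M_INHRETRY were specified, so the application-side modifier above becomes unnecessary.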
Enabling these SHADOW_SYS_DISK bits indicates that the application is willing to tolerate the occasional "dismemberment" of a shadow set, and that it is also willing to tolerate any errors (eg: parity errors) that might be returned. With bit 15 enabled, the controller will not perform any particular error recovery; with the bit disabled, there might well be no error returned to the application at all, because the normal recovery masks the failure. With a single-member shadow set -- either as originally configured, or as a result of "dismemberment" -- there is no error recovery, and all errors are returned to the application.

The following kits (or later) are required:

  ALPSHAD04_071 or ALPSHAD07_062
  VAXSHAD04_071 or VAXSHAD07_062

The following kits (or later) are recommended:

  ALPMOUN04_071 or ALPMOUN03_062
  VAXMOUN03_071 or VAXMOUN02_062
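For reference, a small sketch summarizing the SHADOW_SYS_DISK bit values discussed in this answer. The macro names are invented for illustration; only the bit positions come from the text above. The combined value would then be applied with SYSGEN (which accepts hexadecimal values via the %X prefix), typically OR-ed with the documented value 1 on systems booting from a shadowed system disk.

```c
/* Sketch: composing a SHADOW_SYS_DISK value from the bits described
   above. Macro names are hypothetical; bit positions are per the answer. */
#include <stdio.h>

#define SHADOW_SYS_DISK_SHADOWED  0x00000001u  /* documented value 1: shadowed system disk     */
#define SHADOW_FAST_EVICTION      0x00002000u  /* bit 13: expel member on READ error, no repair */
#define SHADOW_INHIBIT_RETRY      0x00008000u  /* bit 15: treat every READ as IO$M_INHRETRY     */

int main(void)
{
    /* Example: shadowed system disk plus inhibit-retry behaviour. */
    unsigned int value = SHADOW_SYS_DISK_SHADOWED | SHADOW_INHIBIT_RETRY;

    /* Prints the command one would issue in SYSGEN: SET SHADOW_SYS_DISK %X8001 */
    printf("SYSGEN> SET SHADOW_SYS_DISK %%X%04X\n", value);
    return 0;
}
```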