HP Volume Shadowing for OpenVMS: OpenVMS Version 8.4 > Chapter 7 Using Minicopy for Backing Up Data (Integrity servers and Alpha)

Guidelines for Using a Shadow Set Member for Backup

Volume Shadowing for OpenVMS can be used as an online backup mechanism. With proper application design and proper operating procedures, shadow set members removed from mounted shadow sets constitute a valid backup.

To obtain a copy of a file system or application database for backup purposes using Volume Shadowing for OpenVMS, the standard recommendation has been to determine that the virtual unit is not in a merge state, to dismount the virtual unit, and then to remount the virtual unit minus one member.

Prior to OpenVMS Version 7.3, there was a documented general restriction on dismounting an individual shadow set member for backup purposes from a virtual unit that is mounted and in active use. This restriction relates to the data consistency of the file system, application data, or database located on that virtual unit at the time the member is removed. However, HP recognizes that this restriction is unacceptable when true 24x7 application availability is a requirement, and that it is unnecessary if appropriate data-consistency measures can be ensured through a combination of application software and system management practice.

With currently supported OpenVMS releases, DISMOUNT can be used to remove members from shadow sets for the purpose of backing up data, provided that the following requirements are met:
Follow these steps to remove the member:
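As a hedged sketch only (DSA1:, $4$DUA1:, and the volume label BACKUP_VOL are placeholder names, and the exact qualifiers available depend on your OpenVMS version), the removal and later return of a member typically look like this in DCL:

```
$ ! Confirm the virtual unit is not in a merge or copy state
$ SHOW DEVICE DSA1:
$ ! Remove one member; /POLICY=MINICOPY keeps a write bitmap so the
$ ! member can later be returned with a minicopy instead of a full copy
$ DISMOUNT /POLICY=MINICOPY $4$DUA1:
$ ! ... back up the contents of the removed member ...
$ ! Return the member to the shadow set
$ MOUNT /SYSTEM DSA1: /SHADOW=($4$DUA1:) BACKUP_VOL
```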
Removal of a shadow set member results in what is called a crash-consistent copy. That is, the copy of the data on the removed member is of the same level of consistency as what would result if the system had failed at that instant. The ability to recover from a crash-consistent copy is ensured by a combination of application design, system and database design, and operational procedures. The procedures needed to ensure recoverability depend on application and system design and are different for each site.

The conditions that might exist at the time of a system failure range from no data having been written, to writes that occurred but were not yet written to disk, to all data having been written. The following sections describe components and actions of the operating system that may be involved if a failure occurs while there are outstanding writes, that is, writes that occurred but were not yet written to disk. You must consider these issues when establishing procedures to ensure data consistency in your environment.

To achieve data consistency, application activity should be suspended and no operations should be in progress. Operations in progress can result in inconsistencies in the backed-up application data. While many interactive applications tend to become quiet if there is no user activity, reliably suspending application activity requires cooperation from the application itself. Journaling and transaction techniques can be used to address in-progress inconsistencies but must be used with extreme care. In addition to specific applications, miscellaneous interactive use of the system that might affect the data to be backed up must also be suspended.

Applications that use RMS file access must be aware of the following issues. RMS can, at the application's option, defer disk writes until some time after it has reported completion of an update to the application.
The data on disk is updated in response to other demands on the RMS buffer cache and to references to the same or nearby data by cooperating processes in a shared-file environment. Writes to sequential files are always buffered in memory and are not written to disk until the buffer is full. The update of a single record in an indexed file may result in multiple index updates. Any of these updates can be cached at the application's option. Splitting a shadow set with an incomplete index update results in inconsistencies between the indexes and data records. If deferred writes are disabled, RMS orders writes so that an incomplete index update may result in a missing update but never in a corrupt index. However, if deferred writes are enabled, the order in which index updates are written is unpredictable.

The I/O libraries of various languages use a variety of RMS buffering and deferred-write options. Some languages allow application control over the RMS options.

Applications can use the $FLUSH service to guarantee data consistency. The $FLUSH service guarantees that all updates completed by the application (including end of file for sequential files) have been recorded on the disk.

RMS provides optional roll-forward, roll-back, and recovery unit journals, and supports transaction recovery using the OpenVMS transaction services. These features can be used to back out in-progress updates from a removed shadow set member. Using such techniques requires careful data and application design. It is critical that virtual units containing journals be backed up along with the base data files.

OpenVMS allows access to files as backing store for virtual memory through the process and global section services. In this mode of access, the virtual address space of the process acts as a cache on the file data. OpenVMS provides the $UPDSEC service to force updates to the backing file.
Database management systems, such as those from Oracle®, are well suited to backup by splitting shadow sets, since they have full journaling and transaction recovery built in. Before dismounting shadow set members, an Oracle database should be put into “backup mode” using SQL commands of the following form:
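As a sketch, Oracle's backup mode is entered per tablespace with an ALTER TABLESPACE statement (USERS is a placeholder tablespace name):

```sql
ALTER TABLESPACE users BEGIN BACKUP;
```

Repeat the command for each tablespace whose files reside on the shadow sets being split.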
This command establishes a recovery point for each component file of the tablespace. The recovery point ensures that the backup copy of the database can subsequently be recovered to a consistent state. Backup mode is terminated with commands of the following form:
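As a sketch, again with USERS as a placeholder tablespace name, backup mode is ended with:

```sql
ALTER TABLESPACE users END BACKUP;
```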
It is critical to back up the database logs and control files as well as the database data files.

The base OpenVMS file system caches free space. However, all file metadata operations (such as create and delete) are made with a “careful write-through” strategy so that the results are stable on disk before completion is reported to the application. Some free space may be lost, which can be recovered with an ordinary disk rebuild. If file operations are in progress at the instant the shadow member is dismounted, minor inconsistencies may result that can be repaired with ANALYZE/DISK. The careful write ordering ensures that any inconsistencies do not jeopardize file integrity before the disk is repaired.

OpenVMS maintains a virtual I/O cache (VIOC) to cache file data; however, this cache is write-through. OpenVMS Version 7.3 introduced the extended file cache (XFC), which is also write-through. File writes using the $QIO service are completed to disk before completion is reported to the caller.

Multiple shadow sets present the biggest challenge to splitting shadow sets for backup. While the removal of a single shadow set member is instantaneous, there is no way to remove members of multiple shadow sets simultaneously. If the data that must be backed up consistently spans multiple shadow sets, application activity must be suspended while all of the shadow set members are being dismounted. Otherwise, the data is not crash-consistent across the multiple volumes. Command procedures or other automated techniques are recommended to speed the dismount of related shadow sets. If multiple shadow sets contain portions of an Oracle database, putting the database into backup mode ensures recoverability of the database.

The OpenVMS software RAID driver presents a special case for multiple shadow sets. A software RAID set may be constructed of multiple shadow sets, each consisting of multiple members.
With the management functions of the software RAID driver, it is possible to dismount one member of each of the constituent shadow sets in an atomic operation. Management of shadow sets used under the RAID software must always be done using the RAID management commands to ensure consistency.

All management operations used to attain data consistency must be performed on all members of an OpenVMS Cluster system on which the affected applications are running.

Testing alone cannot guarantee the correctness of a backup procedure. However, testing is a critical component of designing any backup and recovery process. Too often, organizations concentrate on the backup process with little thought to how their data is restored. Remember that the ultimate goal of any backup strategy is to recover data in the event of a disaster. Restore and recovery procedures must be designed and tested as carefully as the backup procedures.

The discussion in this section is based on the features and behavior of OpenVMS Version 7.3 and higher, and applies to prior versions as well. Future versions of OpenVMS may have additional features or different behavior that affect the procedures necessary for data consistency. Sites that upgrade to future versions of OpenVMS must reevaluate their procedures and be prepared to make changes or to employ nonstandard settings in OpenVMS to ensure that their backups remain consistent.