Guide to OpenVMS File Applications
3.2.1.3.2 Establishing Auto Extend Default Quantities

This section describes how to establish the auto extend default quantities for an RMS file.
Area and File Default Extension Quantities
You can establish a file-specific default, called the file default extension quantity, for all file organizations. In the case of an indexed file with multiple areas, you can also establish a separate area default extension quantity for each area of the indexed file. The following list describes the methods for establishing file default extension quantities and, where applicable, area default extension quantities, together with the process, system, and volume default quantities that RMS can use when no file default is available:

File Default Extension Quantity

You can establish the file default extension quantity when the file is created, either with the FDL FILE attribute EXTENSION or, from a low-level language, by setting the FAB$W_DEQ field (use the XAB$W_DEQ field of an allocation XAB to set an area default extension quantity for each area of an indexed file). For an existing file, you can set or change this default with the DCL command SET FILE/EXTENSION=n.
Process Default Extension Quantity
The process default extension quantity is established by the DCL command SET RMS_DEFAULT/EXTEND_QUANTITY=n, where n is the extension quantity. This default applies only to the process issuing this DCL command and remains in effect only until the process is deleted.
System Default Extension Quantity
The system default extension quantity is established by the SET RMS_DEFAULT/SYSTEM/EXTEND_QUANTITY=n command. Note that you need the CMKRNL privilege to use the /SYSTEM qualifier. This default applies to all processes on a node in the cluster. When you use this DCL command to establish the system default extension quantity, it remains in effect until the node is rebooted. You can also establish the system default extension quantity in a temporary or permanent fashion by appropriately setting the SYSGEN system parameter RMS_EXTEND_SIZE.
Volume Default Extension Quantity
The volume default extension quantity can be permanently established when the volume is initialized with the INITIALIZE/EXTENSION=n command. This default quantity is used whenever the volume is mounted. To permanently change the volume default extension quantity, you can use the SET VOLUME/EXTENSION=n command on a mounted disk. To temporarily establish a volume default extension quantity, or to temporarily override the permanent volume default extension quantity, use the MOUNT/EXTENSION=n command. The new default remains in effect until the volume is dismounted. Unlike the other default quantities, which default to zero if not specified, the volume default extension quantity defaults to 5.
Placement and Contiguity Options

In addition to specifying the size of an extend, you can specify other characteristics that affect the placement and contiguity of the extend. When an application extends a file by calling the $EXTEND service, an Allocation XAB (XABALL) can be used to place an extend on a particular disk block or disk cylinder. If no allocation XAB is present on the $EXTEND and the FAB contiguity options (described later in this section) are not selected, RMS automatically places the extend near the last allocated disk block in the file. If the file being extended in this fashion is an indexed file opened for record I/O access, RMS adds the new disk space as near as possible to the last allocated disk block in the area being extended. This technique groups together disk blocks belonging to the same area of the indexed file.

When RMS automatically extends a file, the application cannot control placement; RMS applies these placement controls itself, according to the file organization, as described above.
When RMS automatically extends a file, the application can only indirectly control contiguity by setting the FAB or XABALL contiguity bits on input to the $CREATE service. Once set on file creation, these options are available for subsequent extends done automatically by RMS.
Note that setting the FAB$V_CTG bit could cause an extend to fail on a sufficiently fragmented disk. Note, too, that the FAB$V_CBT option is disabled after several failures to allocate contiguous disk space, to avoid the expensive overhead of contiguous-best-try processing on a badly fragmented disk.
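From a low-level language, the contiguity options are set before the $CREATE call. The following is a minimal sketch in OpenVMS C; the file name, allocation size, and choice of the best-try option are illustrative assumptions, not prescriptions:

    #include <rms.h>
    #include <starlet.h>
    #include <string.h>

    int main(void)
    {
        struct FAB fab = cc$rms_fab;             /* FAB preset to RMS defaults */
        char name[] = "DISK$DATA:[APP]LOG.DAT";  /* illustrative file name */
        int status;

        fab.fab$l_fna = name;
        fab.fab$b_fns = strlen(name);
        fab.fab$b_fac = FAB$M_PUT;               /* write access */
        fab.fab$l_alq = 1000;                    /* initial allocation, blocks */
        fab.fab$l_fop |= FAB$M_CBT;              /* contiguous best try;
                                                    FAB$M_CTG would demand
                                                    strictly contiguous space */

        status = sys$create(&fab);               /* option set here is reused
                                                    by later automatic extends */
        if (status & 1)
            status = sys$close(&fab);
        return status;
    }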
Truncating Files (Sequential Files Only)

Only RMS sequential disk files that have been opened for write access (FAB$V_PUT, FAB$V_UPD, FAB$V_DEL, or FAB$V_TRN) can be truncated. This applies to both unshared and shared sequential files. Two types of truncation can occur on RMS sequential files: RMS truncation and Ancillary Control Process (ACP) truncation.

RMS truncation involves resetting the end-of-file (EOF) pointer back to a previous position (possibly the beginning) of a sequential file so that the allocated space in the file can be reused. RMS truncation is described in the OpenVMS Record Management Services Reference Manual under the $TRUNCATE service.

ACP truncation occurs when RMS closes a sequential file and requests that the ACP deallocate all disk blocks allocated beyond the EOF of the file. The primary use of ACP truncation is to conserve disk space. The remainder of this section deals with ACP truncation.

You can also use ACP truncation in conjunction with large extend sizes to reduce disk fragmentation. If a file is growing slowly over time, the application can allocate the largest possible extend and, when finished, use ACP truncation to deallocate any unused space at the end of the sequential file. However, if a sequential file is continually growing, excessive ACP truncation can lead to an increase in disk fragmentation, resulting in more CPU and I/O overhead.

ACP truncation can be requested directly by way of the programming interface by setting the FAB$V_TEF bit on input to the $OPEN, $CREATE, or $CLOSE service. The ACP truncation occurs when the sequential file is closed.

Note that ACP truncation can occur on shared as well as unshared sequential files. If there are shared readers of the file, ACP truncation is postponed until the last reader closes the file. If there are other writers of a shared sequential file, ACP truncation requests are ignored; however, the ACP truncation request of the last writer to close the file is honored.
ACP truncation of a sequential file can be automatically requested by
RMS if an auto extend has been done during this file access and no file
default extend quantity exists to be used for the auto extend. Using
ACP truncation in this instance avoids wasting space when auto
extending with a less precise extend quantity default, such as the
system default extend quantity.
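For example, a program might request ACP truncation when it opens a file for writing. This fragment is a minimal sketch in OpenVMS C, assuming a FAB prepared as in the earlier example:

    fab.fab$b_fac = FAB$M_PUT;     /* truncation requires write access */
    fab.fab$l_fop |= FAB$M_TEF;    /* truncate at EOF when file is closed */

    status = sys$open(&fab);
    /* ... write records through a connected RAB ... */
    status = sys$close(&fab);      /* unused blocks beyond the EOF are
                                      deallocated here */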
Units of I/O

Another file design consideration is to reduce the number of times that RMS must transfer data from disk to memory by making the I/O units as large as possible. Each time RMS fetches data from the disk, it stores the data in an I/O memory buffer whose capacity is equal to the size of one I/O unit. A larger I/O unit makes more records immediately accessible to your program from the I/O buffers.

In general, using larger units of I/O for disk transfers improves performance, as long as the data does not have to be swapped out too frequently. However, as the total space used for I/O buffers in the system approaches a point that results in excessive paging and swapping, increasing I/O unit size degrades system performance.

RMS performs I/O operations using one of the following I/O unit types:

A block, the basic unit of disk I/O, contains 512 bytes. The other I/O units, multiblocks and buckets, are made up of multiples of blocks.

A multiblock is an I/O unit that includes up to 127 blocks, but its use is restricted to sequential files. See Section 3.3.2 for details.

A bucket is the I/O unit for relative and indexed files; it may include up to 63 blocks. See Section 3.4 and Section 3.5 for details.
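As an illustration, a program processing a sequential file can enlarge its I/O unit by setting a multiblock count on the record stream. A minimal sketch in OpenVMS C, assuming a FAB that has already been opened; the count of 32 is illustrative:

    struct RAB rab = cc$rms_rab;   /* RAB preset to RMS defaults */

    rab.rab$l_fab = &fab;          /* FAB already opened with sys$open */
    rab.rab$b_mbc = 32;            /* each I/O transfers 32 blocks
                                      (the maximum is 127) */
    status = sys$connect(&rab);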
Multiple Areas (Indexed Files Only)

For indexed files, another design strategy is to separate the file into multiple areas. Each area has its own extension size, initial allocation size, contiguity options, and bucket size. You can minimize access times by precisely positioning each area on a specific volume, cylinder, or block.

For instance, you can place the data area on one volume of a volume set and place the index area on another volume. If your application is I/O bound, this strategy could increase its throughput. You can also achieve data bucket contiguity by allocating extra space for future extensions of the data area.
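The following sketch in OpenVMS C shows one way to create a two-area indexed file, with the data buckets in area 0 and the index buckets in area 1. All sizes, the key layout, and the file name are illustrative assumptions:

    #include <rms.h>
    #include <starlet.h>
    #include <string.h>

    int main(void)
    {
        struct FAB    fab   = cc$rms_fab;
        struct XABALL area0 = cc$rms_xaball;   /* data area */
        struct XABALL area1 = cc$rms_xaball;   /* index area */
        struct XABKEY key0  = cc$rms_xabkey;   /* primary key */
        char name[] = "DISK$DATA:[APP]CUST.IDX";
        int status;

        /* Area 0: generous allocation and extension quantities help
           keep the data buckets contiguous. */
        area0.xab$b_aid = 0;
        area0.xab$l_alq = 5000;                /* initial allocation */
        area0.xab$w_deq = 1000;                /* area extension quantity */
        area0.xab$b_bkz = 12;                  /* bucket size, in blocks */
        area0.xab$l_nxt = (void *) &area1;

        /* Area 1: the index structure. */
        area1.xab$b_aid = 1;
        area1.xab$l_alq = 500;
        area1.xab$b_bkz = 4;
        area1.xab$l_nxt = (void *) &key0;      /* allocation XABs chained in
                                                  ascending area order, then
                                                  the key XAB */

        /* Primary key: bytes 0-9 of the record; data level in area 0,
           index levels in area 1. */
        key0.xab$b_ref  = 0;
        key0.xab$w_pos0 = 0;
        key0.xab$b_siz0 = 10;
        key0.xab$b_dtp  = XAB$C_STG;           /* string key */
        key0.xab$b_dan  = 0;                   /* data buckets: area 0 */
        key0.xab$b_lan  = 1;                   /* lowest index level: area 1 */
        key0.xab$b_ian  = 1;                   /* other index levels: area 1 */

        fab.fab$l_fna = name;
        fab.fab$b_fns = strlen(name);
        fab.fab$b_org = FAB$C_IDX;
        fab.fab$b_rfm = FAB$C_VAR;
        fab.fab$w_mrs = 80;                    /* maximum record size */
        fab.fab$b_fac = FAB$M_PUT;
        fab.fab$l_xab = (void *) &area0;

        status = sys$create(&fab);
        if (status & 1)
            status = sys$close(&fab);
        return status;
    }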
Bucket Splits

Any attempt to insert a record into a filled bucket results in a bucket split. When a bucket split occurs, RMS tries to keep half of the records (including the new record, if applicable) in the original bucket and moves the remaining records to a newly created bucket.

Excessive bucket splitting can degrade system performance through wasted space, excessive processing overhead, and file fragmentation. For example, each record that moves to a new bucket must maintain a forward pointer in the original bucket. The forward pointer indicates the record's new location. At the new bucket, the record must maintain a backward pointer to its original bucket. RMS uses the backward pointer to update the forward pointer in the original bucket if the record later moves to yet another bucket. When a program attempts to access a record either by alternate key or by RFA, it must first go to the bucket where the record originally resided, read the pointer to the record's current bucket residence, and then access the record.

To avoid bucket splits, you should permit buckets to be only partially filled when records are initially loaded. This provides your application with space to make additional random inserts without overfilling the affected bucket.
Section 3.5.2.2 describes fill factors in more detail.
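Programmatically, fill factors can be sketched as follows (OpenVMS C, continuing the key XAB from the multiple-area example above; the byte counts, roughly 75 percent of each bucket, are illustrative, and the loading program must also set the load option on its record stream):

    key0.xab$w_dfl = 4608;         /* fill data buckets to ~75% of 12 blocks */
    key0.xab$w_ifl = 1536;         /* fill index buckets to ~75% of 4 blocks */

    /* At load time, enable the fill factors on the record stream: */
    rab.rab$l_rop |= RAB$M_LOA;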
Processing Options

Five processing options can be used to improve I/O operations: two file-processing options and three record-processing options. The file-processing options include the deferred-write option and the global buffer option. The global buffer option may be used with all three file organizations, but the deferred-write option is restricted to use with relative and indexed files. The record-processing options include the multiple buffer option, the read-ahead option, and the write-behind option. The multiple buffer option may be used with all three file organizations, but the read-ahead option and the write-behind option may be used only with sequential files.
This section summarizes the options. Section 3.3 through
Section 3.5 describe the options in the context of tuning files. For
additional information about buffering, see Chapter 7.
Multibuffer Count Option

When a program accesses a data file, it transfers the file from disk into memory using I/O units of blocks, multiblocks, or buckets. The I/O units are subsequently placed in memory I/O buffers sized to be compatible with the I/O units. If you do not have enough buffers, excessive I/O transfers may degrade the performance of your application. On the other hand, if you have too many buffers, performance may degrade because of an overly large working set.

As a rule, however, increasing the size and number of buffers can improve performance if the data read into the buffers will soon be processed and if your working set can efficiently maintain the buffers. In fact, changing the size and number of buffers is the quickest way to improve the performance of your application when you are processing localized data. The optimum number of buffers depends on the organization and use of your data files.

The recommended way to determine the optimum number of buffers for your application is to use the Edit/FDL utility. The number of I/O buffers is a run-time parameter you set with the FDL editor by adding the CONNECT secondary attribute MULTIBUFFER_COUNT to the definition file. (See Chapter 9.) With a low-level language, you can set the value directly in the RAB$B_MBF field of the record access block, or you can set the count using the XAB$_MULTIBUFFER_COUNT XABITM if you want to specify more than 127 buffers.

Alternatively, the number of buffers may be specified for a process using the DCL command SET RMS_DEFAULT/BUFFER_COUNT=n, where the variable n represents the desired number of buffers. With this command, you may set distinct values for your sequential, relative, and indexed files using the appropriate file organization qualifier. If you omit the file organization qualifier, sequential organization is assumed. To examine the current settings for the process and system default multibuffer count, use the DCL command SHOW RMS_DEFAULT.

If none of the above methods is used, RMS uses the system-wide default value established by the system manager. If the system-wide default is either omitted or set to 0, RMS uses a value of 1 for sequential and relative files and a value of 2 for indexed files.

For more details about using multiple buffers with sequential files, see Section 3.3.3. For more details about using multiple buffers with relative files, see Section 3.4.2. For more details about using multiple buffers with indexed files, see Section 3.5.2.3.
Chapter 7 describes the use of multiple buffers in the context of
shared files.
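From a low-level language, the multibuffer count might be set as in this minimal sketch (OpenVMS C, assuming an already opened FAB; the count of 8 is illustrative):

    struct RAB rab = cc$rms_rab;

    rab.rab$l_fab = &fab;          /* FAB already opened */
    rab.rab$b_mbf = 8;             /* multibuffer count; for more than 127
                                      buffers, use the XAB$_MULTIBUFFER_COUNT
                                      XABITM instead */
    status = sys$connect(&rab);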
Deferred-Write Option

One way to improve performance through minimized I/O is to use the deferred-write option to keep data in memory as long as practicable. However, you must determine if this added performance benefit is worth the increased risk of losing data if the system crashes before a buffer is transferred to disk.

With indexed files and relative files, you may use the deferred-write option to defer writing modified buckets to disk until the buffer is needed for another purpose or until the file is closed. Typically, the largest gains in performance come from using the deferred-write option with sequential access. Retrieving and modifying records one after the other permits you to access all of the records from one bucket while the bucket is in memory.

You may also improve performance by using the deferred-write option to prevent writing a shared sequential file to disk on each modification. However, this increases the risk of losing data if the system crashes before the full buffer is transferred to disk. Note that nonshared sequential files behave as if the deferred-write option is always specified, because a buffer is only written to disk after it is completely filled.
Deferred-write processing is a default run-time option for some high-level languages and can be specified by using clauses in other languages. You can activate this option through FDL by adding the FILE attribute DEFERRED_WRITE. From a low-level language, you can activate the deferred-write option by setting the FAB$V_DFW bit in the FAB$L_FOP field.
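For example, a minimal sketch in OpenVMS C, assuming a FAB prepared as in the earlier examples:

    fab.fab$l_fop |= FAB$M_DFW;    /* defer writing modified buckets until
                                      the buffer is needed for another
                                      purpose or the file is closed */
    status = sys$open(&fab);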
Global Buffer Option

If several processes are to share a file, you may want to provide the file with global buffers---I/O buffers that two or more processes can access. With global buffers, processes may access file information without allocating dedicated buffers. If you do not allocate dedicated buffers, you can conserve buffer space and buffer management overhead. You gain this benefit at the cost of additional system resources, as described in the OpenVMS Record Management Services Reference Manual.

When you create a file, you can assign the desired number of global buffers by using the FDL editor to set the value of the FILE secondary attribute GLOBAL_BUFFER_COUNT. From a low-level language, you can set the value directly in the FAB$W_GBC field. Alternatively, you may use the DCL command SET FILE/GLOBAL_BUFFERS to set the global buffer count.

Global buffers are not used directly to retain modified information when the deferred-write option is enabled. If a global buffer is modified and the deferred-write option is enabled, the contents of the global buffer are copied to a process-local buffer before other processes are allowed to access the global buffer contents. Subsequent use of the modified buffer by the process that deferred the writeback refers to the process-local buffer while it remains in the process-local cache. A reference to the global buffer by another process causes the contents of the process-local buffer to be written back to disk. If a global buffer is modified and the deferred-write option is not enabled, the contents are written to disk from the global buffer.

Therefore, using global buffers along with the deferred-write option may cause a slight increase in processing overhead if no further references to the modified buffer occur before it is flushed from the cache. For that reason, you may want to disable the deferred-write option for processes that do not reaccess buffers after records have been written to them.
Section 3.3, Section 3.4, and Section 3.5 discuss the use of
global buffers in tuning the various file types.
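For example, a minimal sketch in OpenVMS C; the count of 50 is illustrative, and the DCL command SET FILE/GLOBAL_BUFFERS would have a similar effect on an existing file:

    fab.fab$w_gbc = 50;            /* request 50 global buffers */
    status = sys$create(&fab);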
Read-Ahead and Write-Behind Options

The operation of sequentially organized files can be improved by implementing read-ahead and write-behind processing. These features improve performance by permitting record processing and I/O operations to occur simultaneously. The read-ahead and write-behind features are default run-time attributes in some languages, but they must be explicitly specified in others.

You implement read-ahead and write-behind processing by using two buffers. The processing program uses one buffer, and the I/O subsystem uses the other. In read-ahead processing, the program reads data from one buffer while the second buffer inputs data from the disk. In write-behind processing, one buffer accepts output data from the program while the second buffer writes program data to the disk. The next section provides additional information about read-ahead and write-behind processing.
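From a low-level language, both options might be enabled on the record stream as in this minimal sketch (OpenVMS C, assuming an opened sequential-file FAB):

    struct RAB rab = cc$rms_rab;

    rab.rab$l_fab = &fab;
    rab.rab$b_mbf = 2;                         /* two buffers, so record
                                                  processing and I/O overlap */
    rab.rab$l_rop |= RAB$M_RAH | RAB$M_WBH;    /* read-ahead, write-behind */
    status = sys$connect(&rab);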