OpenVMS User's Manual
 
 
11.8.2 Omitting Records and Fields
In a specification file, you can improve Sort efficiency by using the
/CONDITION, /INCLUDE, and /OMIT qualifiers to process only those
records needed in the output file. (The high-performance Sort/Merge
utility does not support specification files. Implementation of this
feature is deferred to a future OpenVMS Alpha release.) You can also
use specification file qualifiers to reformat records, omitting
unnecessary fields from the output file. These qualifiers are not
available as command line qualifiers.
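
As a minimal sketch of record selection, the following command procedure fragment creates a specification file and then uses it in a Sort operation. The file names (ACTIVE.SRT, ACCOUNTS.DAT, ACTIVE.LIS), the field name STATUS, its position and size, the condition name ACTIVE, and the test value "A" are all hypothetical; see Section 11.9.3 for the authoritative qualifier syntax. This approach does not apply to the high-performance Sort/Merge utility, which does not support specification files.

```dcl
$ ! Hypothetical example: keep only records whose first byte is "A".
$ ! In a command procedure, CREATE reads the lines up to the next "$".
$ CREATE ACTIVE.SRT
/FIELD=(NAME=STATUS,POSITION:1,SIZE:1)
/CONDITION=(NAME=ACTIVE,TEST=(STATUS EQ "A"))
/INCLUDE=(CONDITION=ACTIVE)
$ SORT/SPECIFICATION=ACTIVE.SRT ACCOUNTS.DAT ACTIVE.LIS
```

Replacing /INCLUDE with /OMIT in this sketch would instead drop the matching records and keep all others.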
11.8.3 Assigning Work Files
 
During a Sort operation, records from the input file are read into
memory. If the allocated memory cannot hold all the records, Sort
transfers the sorted data to one or more temporary work files. Merge
does not use work files.
 
You can increase sort efficiency by changing the number of work files
and by assigning them to specific devices:
 
  - The Sort command line qualifier /WORK_FILES=n overrides
  the number of work files allocated.
  
  - Normally, Sort places work files on the device SYS$SCRATCH and
  accesses them in an arbitrary order. You can assign work files to
  specific devices in two ways:

    - In a specification file, the /WORK_FILES=(device,...)
    qualifier places the work files on the specified devices. See
    Section 11.9.3 for more information about using the /WORK_FILES
    qualifier in a specification file.

    - If you are not using a specification file, you can use the DCL
    command ASSIGN to assign the work files to specific devices. Sort
    uses the SORTWORKn logical names to identify user-specified
    device names for the work files, where n is a value from 0
    through 9. (For the high-performance Sort/Merge utility, n is
    a value from 0 through 254.) For example:

$ ASSIGN WORK$2: SORTWORK1
$ ASSIGN WORK$3: SORTWORK2

    This example defines SORTWORK1 as the device WORK$2: and SORTWORK2
    as the device WORK$3:. For more information on logical names, see
    Chapter 13.
  
Consider the following when you assign work files to devices:
 
  - Assign work files to the fastest devices available, for example,
  random-access mass storage devices such as disks.
  
 - Choose devices with the least activity and the most space available.
  
 - Assign each work file to a different physical device to maximize
  overlapping input and output.
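
Before running a Sort operation, it can help to confirm which SORTWORKn logical names are currently in effect. The following is a minimal sketch that assumes work file logical names such as SORTWORK1 and SORTWORK2 have already been assigned; SHOW LOGICAL and DEASSIGN are standard DCL commands, and the choice of which name to remove is purely illustrative.

```dcl
$ ! Display every SORTWORKn logical name that is currently defined
$ SHOW LOGICAL SORTWORK*
$ ! Remove an assignment that is no longer wanted (illustrative choice)
$ DEASSIGN SORTWORK2
```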
  
11.8.4 Modifying the Working Set Extent
If Sort requires work files (for example, if you are sorting a large
file), a larger working set can increase sort efficiency. However, if
your system is used heavily, it might be unable to allocate all the
pages in the working set extent to your process. This can result in
paging, which occurs when the operating system transfers parts of a
process between physical memory and memory on a paging device; only the
active part of the process remains in the physical memory. To avoid
excessive paging, you can decrease the working set extent for your
process. (Use the SET WORKING_SET command to decrease the working set
extent.)
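
A minimal sketch of inspecting and then lowering the working set extent follows. The value 2000 is hypothetical; an appropriate value depends on your system load and on the limits authorized for your account.

```dcl
$ ! Display the current working set limit, quota, and extent
$ SHOW WORKING_SET
$ ! Lower the working set extent for this process (2000 is hypothetical)
$ SET WORKING_SET/EXTENT=2000
```

Running SORT/STATISTICS before and after the change is one way to compare page faults and elapsed time for the same input.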
11.9 Summary of Sort/Merge Qualifiers
 
The following list describes command qualifiers used with the SORT and
MERGE commands. To use a command qualifier, include the qualifier
immediately after the SORT or MERGE command.
 
/[NO]CHECK_SEQUENCE
 
  This qualifier applies to the MERGE command only. It verifies the
  sequence of the records in MERGE input files. Merge checks the sequence
  of records by default.

  The /CHECK_SEQUENCE qualifier checks whether the records of one or
  more files (up to 10; the high-performance Sort/Merge utility supports
  up to 12) have been sorted. (The records are still directed to an
  output file, which you must specify.) If you are checking whether
  records are sorted on a key field other than the entire record, you
  must specify the key information when you request the sequence check.

  Use the /NOCHECK_SEQUENCE qualifier to prevent Merge from checking the
  sequence of records.
   Example
 
  
    
       
      
$ MERGE/KEY=(SIZE:4,POSITION:3)/NOCHECK_SEQUENCE -
_$ PRICE1.DAT,PRICE2.DAT PRICE.LIS
 
     In this example, the /NOCHECK_SEQUENCE qualifier specifies that the
    sequence of the input files, PRICE1.DAT and PRICE2.DAT, is not to be
    checked.
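
Conversely, the following sketch uses Merge only to verify that a single file is already in order on a key, based on the statement above that /CHECK_SEQUENCE can check one or more files. The input file and key are reused from the preceding example for illustration, and the output file name PRICE1.CHK is hypothetical. /CHECK_SEQUENCE is the default and is written out here only for emphasis.

```dcl
$ ! Verify that PRICE1.DAT is in order on the 4-byte key at position 3.
$ ! An output file must still be specified (PRICE1.CHK is hypothetical).
$ MERGE/KEY=(SIZE:4,POSITION:3)/CHECK_SEQUENCE PRICE1.DAT PRICE1.CHK
```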
 
/COLLATING_SEQUENCE=sequence
 
  Selects one of three predefined collating orders for character key
  fields, or specifies the name of a National Character Set (NCS)
  collating sequence to be used in comparing character keys. (The
  high-performance Sort/Merge utility does not support the NCS collating
  sequences. Support for NCS collating sequences is deferred to a future
  OpenVMS Alpha release.) Sort can arrange characters in ASCII (default),
  EBCDIC, or Multinational sequences.
   Example
 
  
    
       
      
$ SORT/COLLATING_SEQUENCE=MULTINATIONAL -
_$ NAMES.DAT,NOM.DAT LIST.LIS
 
     This SORT command arranges the input files NAMES.DAT and NOM.DAT
    according to the Multinational collating sequence to create the output
    file LIST.LIS.
 
/[NO]DUPLICATES
 
  By default, Sort retains all multiple records with duplicate keys. The
  /NODUPLICATES qualifier eliminates all but one of multiple records with
  duplicate keys. The retained records may not appear in the same order
  as they appeared in the input file. If you want to specify which
  duplicate record to keep, invoke Sort at the program level and specify
  an equal-key routine.

  The /STABLE and /NODUPLICATES qualifiers are mutually exclusive.
   Example
 
  
    
       
      
$ SORT/KEY=(POSITION:3,SIZE:5,DECIMAL)/NODUPLICATES -
_$ ACCT1,ACCT2 ACCT.LIS
 
     This SORT command arranges the two input files according to the key
    supplied and eliminates all but one of multiple records with equal keys.
 
/KEY=(POSITION:n,SIZE:n[,field,...])
 
  Describes key fields, including the position, size, sorting order
  (ASCENDING or DESCENDING), priority (NUMBER:n), and data type (such as
  character, binary, h_floating). By default, Sort reorders a file by
  sorting entire records with character data in ascending order.  See
  Section 11.2.1 for detailed information about the /KEY qualifier.
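
As a brief illustration of combining these key attributes, the following sketch sorts on two key fields. The record layout (a character name in bytes 1 through 20 and a decimal amount in bytes 21 through 27) and the file names are hypothetical. It assumes that when NUMBER:n is omitted, the keys take priority in the order in which they are listed.

```dcl
$ ! Hypothetical layout: name in bytes 1-20, amount in bytes 21-27.
$ ! First key: name, ascending. Second key: amount, descending.
$ SORT/KEY=(POSITION:1,SIZE:20)/KEY=(POSITION:21,SIZE:7,DECIMAL,DESCENDING) -
_$ ACCOUNTS.DAT ACCOUNTS.LIS
```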
 
/PROCESS=type
 
  (Applies to the SORT command only.) Defines the internal sorting
  process. The /PROCESS qualifier allows you to choose one of four
  processes: record, tag, address, or index. (The high-performance
  Sort/Merge utility supports only the record process. Implementation of
  tag, address, and index processes is deferred to a future OpenVMS Alpha
  release.)  See Section 11.2.6 for detailed information about the
  /PROCESS qualifier.
   Example
 
  
    
       
      
$ SORT/KEY=(POS:40,SIZ:2,DESC)/PROCESS=TAG YRENDAVG.DAT -
_$ DESCYRAVG.LIS
 
     This Sort operation uses a tag sorting process to create the output
    file DESCYRAVG.LIS.
 
/SPECIFICATION=filespec
 
(The high-performance Sort/Merge utility does not support this
qualifier. Implementation of this feature is deferred to a future
OpenVMS Alpha release.)
 
  Identifies a Sort or Merge specification file to be used in a Sort or
  Merge operation. The default specification file type is .SRT.  See
  Section 11.7 and Section 11.9.3 for information about using
  specification files.
 
/[NO]STABLE
 
  By default, records with equal keys are not guaranteed to be placed in
  the output file in the order they appear in the input file. The /STABLE
  qualifier maintains the records in that order.  The /STABLE and
  /NODUPLICATES qualifiers are mutually exclusive.
   Example
 
  
    
       
      
$ SORT/KEY=(POS:1,SIZ:5,DECIMAL)/STABLE PRICESA.DAT, -
_$ PRICESB.DAT,PRICESC.DAT SUMMARY.LIS
 
     In this Sort operation, records with equal keys from PRICESA.DAT
    will be listed first, followed by those from PRICESB.DAT, followed by
    those from PRICESC.DAT.
 
/[NO]STATISTICS
 
  Displays a statistical summary to SYS$OUTPUT that can be used for
  optimization. To save these statistics in a file, use the following
  command:
 
  
    
       
      
$ DEFINE/USER SYS$ERROR output-file
 
  The statistical summary contains the following information:

  | Statistic | Description |
  | Records read | The number of records read by Sort or Merge. |
  | Records sorted | The number of records that have been processed using Sort. This number could be less than the number of records read if a specification file is used to select only certain records for the Sort or Merge operation. |
  | Records output | The number of records written to the output file. This number could be less than the number of records sorted if /NODUPLICATES was selected or if I/O errors occurred when the output records were being written. |
  | Working set extent | The number of pages in the process working set extent. This value is used as an upper limit on the size of the sort data structure. Adjusting this value is one way to improve the efficiency of a Sort operation. |
  | Virtual memory | The number of pages of virtual memory added to the Sort image to hold the data. |
  | Direct I/O + buffered I/O | This total is the number of I/O movements needed to read and write data. The lower this total value is, the more efficient the ordering operation. |
  | Page faults | Indicates how well the data fits into memory: the higher the number of page faults, the less efficient the ordering operation. |
  | Elapsed time | The total wall clock time used by the Sort or Merge operation in hours, minutes, seconds, and hundredths of seconds. |
  | Input record length | This value is obtained from the Record Management Services (OpenVMS RMS) unless the user supplies it. |
  | Internal length | The size in bytes of an internal format node. This includes any keys, data, a word to store the length, record file addresses (RFAs), and converted keys. |
  | Output record length | The length of the output record. The length is computed from the input record length, the sort process, and the record reformatting requested. |
  | Sort tree size | The number of records that fit in the Sort internal data structure. |
  | Number of initial runs | One indication of how well the data fits into memory. |
  | Maximum merge order | The maximum number of sorted strings that are merged at one time. |
  | Number of merge passes | The number of times the Sort utility merges strings until one sorted output string is produced. The number of initial runs and the number of merge passes indicate how well the data fits into memory. The higher these numbers, the further the working set size is from containing the data and the longer the sorting takes. |
  | Work file allocation | The number of blocks used for the work files. When more than one merge pass is needed, this size is approximately twice the size of the input file allocation. |
  | Elapsed CPU | The CPU time used by the ordering operation; it does not include time spent waiting for I/O operations to complete or time spent waiting while another process executes. |
 
     Example
 
 
  
    
       
      
$ SORT/STATISTICS PRICE1.DAT,PRICE2.DAT PRICE.LIS
 
  This SORT/STATISTICS command results in the following statistical
  display:
 
  
    
       
      
              OpenVMS Sort/Merge Statistics
Records read:         793   Input record length:     80
Records sorted:       793   Internal length:         80
Records output:       793   Output record length:    80
Working set extent:   100   Sort tree size:         412
Virtual memory:       433   Number of initial runs:   2
Direct I/O:            22   Maximum merge order:      2
Buffered I/O:           9   Number of merge passes:   1
Page faults:         3418   Work file allocation:   114
Elapsed time: 00:00:05.98   Elapsed CPU:    00:00:03.63
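
To keep a copy of such a display, the DEFINE/USER command given earlier in this description can be issued immediately before the Sort operation. The following is a minimal sketch that reuses the files from the example above; the log file name SORTSTATS.LOG is hypothetical. Because DEFINE/USER creates a user-mode logical name, the redirection is expected to apply only to the next image that runs, which here is Sort.

```dcl
$ ! Redirect output for the next image only (log file name is hypothetical)
$ DEFINE/USER SYS$ERROR SORTSTATS.LOG
$ SORT/STATISTICS PRICE1.DAT,PRICE2.DAT PRICE.LIS
$ ! Review the saved statistics
$ TYPE SORTSTATS.LOG
```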
 
/WORK_FILES[=n]
 
  (Applies to the SORT command only.) Specifies the number of Sort work
  files, from 1 through 10 inclusive (the high-performance Sort/Merge
  utility supports up to 255). Using more work files makes each work
  file smaller. If the available disks are too small or too full for
  work files, increasing the number of files can improve the efficiency
  of the Sort operation.

  Sort does not create work files until it needs them. If Sort needs
  work files, it creates two by default (SORTWORK0, SORTWORK1), which
  are placed in the SYS$SCRATCH directory.
   Example
 
  
    
       
      
$ ASSIGN DRA5: SORTWORK0
$ ASSIGN DB0: SORTWORK1
$ ASSIGN DB1: SORTWORK2
$ SORT/KEY=(POS:1,SIZ:80)/WORK_FILES=3 -
_$ STATS1,STATS2,STATS3,STATS4 SUMMARY.LIS
 
     Because the input files in this Sort operation are large files,
    specifying three work files improves the efficiency of the sort
    operation.  Note that you can also assign the work files to a
    specific directory on a device by including the directory name. For
    example, to assign SORTWORK0 to the [WORKSPACE] directory on DRA5,
    enter the following command:
 
  
    
       
      
$ ASSIGN DRA5:[WORKSPACE] SORTWORK0
 
 
11.9.1 Input File Qualifier
The following input qualifier should be included immediately after the
input file specification in the SORT or MERGE command line:
 
/FORMAT=(RECORD_SIZE:n,FILE_SIZE:n)
 
  Defines input file characteristics; allows you to specify or override
  record or file size. It must be specified immediately after the input
  file specification in the Sort or Merge command line.

  Sort uses input file size information to determine the amount of
  memory needed, as well as the size of the work files for the Sort
  operation. If the file size is unknown (for example, you are sorting
  files that do not reside on disk or standard ANSI magnetic tape), Sort
  assumes a fairly large file size.

  Specify the following qualifier values:

  RECORD_SIZE:n
      Specifies the input file's longest record length (LRL) in bytes. The
      maximum longest record length that can be specified depends on the
      file organization:

      | File organization | Maximum LRL (bytes) |
      | Sequential | 32,767 |
      | Relative | 16,383 |
      | Indexed-sequential | 16,362 |

      These values include control bytes for variable records with
      fixed-length control (VFC) format.

  FILE_SIZE:n
      Specifies input file size in blocks. The maximum file size accepted
      is 4,294,967,295 blocks.

  You can also use /FORMAT as an output file qualifier. See
  Section 11.9.2 for more information.
   Example
 
  
    
       
      
$ SORT/KEY=(POS:40,SIZ:2,DESC) -
_$ CRA0:YRENDAVG.DAT/FORMAT=(RECORD_SIZE:41,FILE_SIZE:3) -
_$ DESCYRAVG.LIS
 
     Because the input file YRENDAVG.DAT does not reside on a disk
    device or ANSI magnetic tape, file organization must be described by
    the /FORMAT qualifier.
 
11.9.2 Output File Qualifiers
The following output qualifiers can be used with the SORT and MERGE
commands. To use an output file qualifier, include the qualifier
immediately after the output file specification in the SORT or MERGE
command line.
 
/ALLOCATION=n
 
  Specifies the number of blocks, from 1 through 4,294,967,295, to be
  preallocated to the output file for optimization. Use this qualifier
  when you know that the output file allocation will differ substantially
  from the total input file allocation (for example, when reformatting
  data or omitting records).

  The /ALLOCATION qualifier is required if the /CONTIGUOUS qualifier is
  used.
   Example
 
  
    
       
      
$ SORT/KEY=(POS:1,SIZ:80) STATS.DAT -
_$ SUMMARY.LIS/ALLOCATION=1000/CONTIGUOUS
 
     This SORT command allocates 1000 contiguous blocks for the output
    file SUMMARY.LIS.
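
One way to judge whether the output allocation is likely to differ from the input allocation is to check the input file first. The following minimal sketch assumes the input file STATS.DAT from the example above; DIRECTORY/SIZE=ALL displays both the blocks used and the blocks allocated.

```dcl
$ ! Show the blocks used and allocated for the input file
$ DIRECTORY/SIZE=ALL STATS.DAT
```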
 
/BUCKET_SIZE=n
 
  Specifies OpenVMS RMS bucket size (the number of 512-byte blocks per
  bucket) to be used by relative and indexed sequential output disk files
  for optimization. A value of 1 through 32 is allowed.  If the output
  file organization is the same as for the input files, the default value
  is the same as the bucket size of the first input file. If output file
  organization is different, the default value is 1.
   Example
 
  
    
       
      
$ SORT/KEY=(POS:1,SIZ:80) STATS1.DAT,STATS2.DAT -
_$ SUMMARY.LIS/BUCKET_SIZE=16/RELATIVE
 
     This SORT command results in the output file SUMMARY.LIS that has a
    bucket size of 16 with relative organization.
 
/CONTIGUOUS
 
  Requests that the output file be stored in contiguous disk blocks to
  decrease access time. Must be used with the /ALLOCATION qualifier. By
  default, Sort/Merge does not allocate contiguous disk blocks for the
  output file.
   Example
 
  
    
       
      
$ SORT/KEY=(POS:1,SIZ:80) STATS.DAT -
_$ SUMMARY.LIS/ALLOCATION=1000/CONTIGUOUS
 
     This SORT command allocates 1,000 contiguous blocks for the output
    file SUMMARY.LIS.
 
/FORMAT=(type:n[,...])
 
  Specifies the output file record format (FIXED:n, VARIABLE:n, or
  CONTROLLED:n) if it differs from the input file format. You can also
  specify the size (SIZE:n) or the block size (BLOCK_SIZE:n) of the file
  records.

  If the Sort operation is a record or tag sort, the default output
  record format is the same as the first input file record format. If
  the Sort operation is an address or index sort, the default output
  record format is fixed record format. If the input files have
  different record formats, Sort provides an output record size that is
  large enough to contain the largest record in the input files.

  You can specify the following qualifier values:

  BLOCK_SIZE:n
      Specifies the output file's block size, in bytes, if you have
      directed the file to magnetic tape. If the input file is a tape
      file, the block size of the output file defaults to that of the
      input file. Otherwise, the output file block size defaults to the
      size used when the tape was mounted.

      Acceptable values for n range from 20 to 65,532. To ensure correct
      data interchange with other Digital systems, however, specify a
      block size of not more than 512 bytes. For compatibility with
      systems that are not made by Digital, the block size should not
      exceed 2,048 bytes.

  CONTROLLED:n
      Specifies variable with fixed-length control (VFC) records in the
      output file.

  FIXED:n
      Specifies fixed-length records in the output file.

  SIZE:n
      Specifies the size, in bytes, of the fixed portion of VFC
      (CONTROLLED) records, up to a maximum of 255 bytes. If you do not
      specify SIZE, the default is the size of the fixed portion of the
      first input file. If you specify this size as 0, OpenVMS RMS
      defaults the value to 2 bytes.

  VARIABLE:n
      Specifies variable-length records in the output file.

  For any qualifier value, you can optionally specify n as the maximum
  record size (in bytes) of the output records. The maximum record size
  allowed depends on the file organization:

  | File organization | Maximum record size (bytes) |
  | Sequential files | 32,767 |
  | Relative files | 16,383 |
  | Indexed-sequential files | 16,362 |

  These maximum record size values include control bytes for variable
  records with fixed-length control (VFC) format.
     Example
 
  
    
       
      
$ SORT/KEY=(POS:1,SIZ:80) STATS.DAT SUMMARY.LIS/FORMAT=FIXED:80
 
     The input file STATS.DAT consists of variable-length records that
    are 80 bytes in length. The /FORMAT qualifier specifies that the output
    file, SUMMARY.LIS, consists of fixed-length records.
 
/INDEXED_SEQUENTIAL
 
  Defines the file organization for the output file as indexed
  sequential. Note that the output file must already exist and must be
  empty. In addition, you must specify that the empty file is to be
  overlaid with the sorted records by using the /OVERLAY qualifier.
   Example
 
  
    
       
      
$ CREATE/FDL=NEW.FDL AVERAGE.DAT
$ SORT/KEY=(POS:1,SIZ:80) DATA.DAT,STATS.DAT -
_$ AVERAGE.DAT/INDEXED_SEQUENTIAL/OVERLAY
 
     The CREATE/FDL command creates the empty file AVERAGE.DAT. The SORT
    command specifies that the output file have an indexed-sequential
    organization and be written to the empty file AVERAGE.DAT.
 
/OVERLAY
 
  Specifies an existing empty file that the output file is to be overlaid
  on, or written to. The /OVERLAY qualifier is required when you use the
  /INDEXED_SEQUENTIAL qualifier.  If the input file organization is
  indexed-sequential, the output file must already exist and be empty. If
  the output file is not empty, /OVERLAY does not write over the file.
  Instead, it appends the result of the sort to the existing output file.

  You can use the CREATE/FDL command to create an empty data file.
  Any attributes that you specify when creating the empty file then
  become attributes of the Sort output file.
   Example
 
  
    
       
      
$ CREATE/FDL=NEW.FDL AVERAGE.DAT
$ SORT/KEY=(POS:1,SIZ:80) STATS.DAT AVERAGE.DAT/OVERLAY
 
     The FDL file NEW.FDL specifies special attributes for the file
    AVERAGE.DAT. When Sort writes output to that file, the resulting Sort
    output file has the attributes specified by the FDL file.
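
If no FDL file exists yet, one possible approach is to generate it from an existing file that already has the attributes you want. The following sketch assumes a hypothetical model file MODEL.DAT and uses the ANALYZE/RMS_FILE/FDL command to produce NEW.FDL before repeating the steps from the example above.

```dcl
$ ! Generate an FDL description from an existing file (MODEL.DAT is hypothetical)
$ ANALYZE/RMS_FILE/FDL/OUTPUT=NEW.FDL MODEL.DAT
$ ! Create the empty output file from that description, then sort into it
$ CREATE/FDL=NEW.FDL AVERAGE.DAT
$ SORT/KEY=(POS:1,SIZ:80) STATS.DAT AVERAGE.DAT/OVERLAY
```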
 
/RELATIVE
 
   Defines the file organization for the output file as relative.
    Example
 
  
    
       
      
$ SORT/KEY=(POS:1,SIZ:80) STATS.DAT SUMMARY.LIS/RELATIVE
 
     Because the input file STATS.DAT is not a relative file and the
    output file, SUMMARY.LIS, will be, /RELATIVE qualifies the output file
    specification.
 
/SEQUENTIAL
 
  Defines the file organization for the output file as sequential. This
  is the default for address and index sorting operations. The default
  for record and tag sorting operations is the organization of the first
  input file.
   Example
 
  
    
       
      
$ SORT/KEY=(POS:1,SIZ:80) STATS.DAT SUMMARY.LIS/SEQUENTIAL
 
     Because the input file STATS.DAT is not a sequential file and the
    output file SUMMARY.LIS will be, /SEQUENTIAL qualifies the output file
    specification.
 
  
  