KAP directives enable, disable, or modify a feature of KAP. Directives function as command-line switches used within the input file rather than on the command line. To invoke a directive, you must either toggle the directive on, or set a desired value for its level.
With the exception of the !*$* assertions directive, using KAP directives will not affect the correctness of a program. You must be careful when you enable KAP assertions not to provide KAP with false information that might lead KAP to perform incorrect transformations. See Chapter 6, KAP Assertions.
Most KAP directives have corresponding command-line switches. If conflicting settings are given on the command line and in a directive, KAP uses the value specified on the directive. If command-line control is desired, directives can be disabled (treated as comments) with the -nodirectives switch.
The !*$* inline and !*$* ipa directives are disabled by default. When they are enabled, they take precedence over the inlining and IPA switches.
See Chapter 4 for the command-line switches related to these directives.
KAP recognizes the Fortran 90 CDEC$ directives. Except for the cpar$ parallel do directive, KAP copies them to the transformed code file but otherwise ignores them. A cpar$ parallel do directive is copied to the transformed code file and the immediately following loop nest is left unchanged.
5.2 Usage and Syntax of Directives
Directives placed inside a program unit remain in effect from the point of their appearance in the code until the end of the program unit. At the end of the program unit, the directive value defaults to the value set by the command-line switch. You can temporarily override directives in a program unit by partitioning sections of code into directive blocks. Directive blocks allow you to do the following:
The following example shows how command-line switches, directives used outside directive blocks, and directives used inside directive blocks differ in their effective durations:
Each level of indentation marks the span over which the setting above it remains in effect:

    Command-line switch
        Program Unit A
            !*$* directive
                !*$* beginblock
                    !*$* directive
                        loop
                        array
                        loop
                !*$* endblock
                !*$* beginblock
                    !*$* directive
                        loop
                        array
                        loop
                !*$* endblock
            -> End
        Program Unit B
            loop A
            !*$* directive
                loop B
                !*$* beginblock
                    !*$* directive
                        loop
                        array
                        loop
                !*$* endblock
            -> End
    -> End of source file
Loop directives used inside directive blocks must immediately follow the !*$* beginblock directive. Loop directives used outside of directive blocks must immediately precede the loop they are to affect. In the following example, KAP issues an error message when the loop optimization directive !*$* OPTIMIZE(0) does not immediately precede the DO loop:
    !*$* OPTIMIZE( 0 )
    nbeg = M-((nn-1)*nstep)
    nend = max(1,nbeg-nstep+1)
    DO I=1,N
      colonne(1,i,nn)=c(1,i,nbeg)
      colonne(5,i,nn)=c(5,i,nbeg)
      colonne(8,i,nn)=c(8,i,nbeg)
    ENDDO

Error message:

    ### !*$* OPTIMIZE( 0 )
    ### in line 6 procedure MOVE2D_PRE of file directive.f90
    ### This directive is not adjacent to the loop it applies to.
    0 errors in file directive.f
    KAP -- Syntax Warnings Detected

To correct the error, !*$* OPTIMIZE( 0 ) should immediately precede DO I=1,N.
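A corrected version of the fragment above might look like the following sketch. Only the position of the directive changes. The wrapper program, the declarations, and the array shapes are assumptions added so that the example is self-contained; they do not come from the original routine.

```fortran
program optimize_placement_demo
  implicit none
  ! Array shapes are assumed for illustration only.
  real :: c(8,4,10), colonne(8,4,2)

  c = 1.0
  colonne = 0.0
  call move2d_pre(c, colonne, 4, 10, 2, 3)
  print *, colonne(1,1,2), colonne(5,1,2), colonne(8,1,2)

contains

  subroutine move2d_pre(c, colonne, n, m, nn, nstep)
    integer, intent(in) :: n, m, nn, nstep
    real, intent(in)    :: c(8, n, m)
    real, intent(inout) :: colonne(8, n, nn)
    integer :: i, nbeg, nend

    ! The setup statements now come before the directive.
    nbeg = m - ((nn-1)*nstep)
    nend = max(1, nbeg-nstep+1)   ! kept from the original fragment; unused here

    ! Nothing separates the directive from the loop it applies to.
!*$* OPTIMIZE( 0 )
    do i = 1, n
       colonne(1,i,nn) = c(1,i,nbeg)
       colonne(5,i,nn) = c(5,i,nbeg)
       colonne(8,i,nn) = c(8,i,nbeg)
    end do
  end subroutine move2d_pre

end program optimize_placement_demo
```

To a standard Fortran compiler the directive is an ordinary comment, so the sketch compiles with or without KAP.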
To apply a loop directive to array assignments, you must enclose the assignments in a directive block. In the following example, KAP issues an error message when it sees the parallel loop directive !*$* MINCONCURRENT (999999) applied to assignments to the array ddx outside a directive block:

    !*$* MINCONCURRENT ( 999999 )
    ddx(2:I-1,1:J) = array(3:I,1:J)-array(1:I-2,1:J)
    ddx(1,1:J) = 2*(array(2,1:J)-array( 1,1:J))
    ddx(I,1:J) = 2*(array(I,1:J)-array(I-1,1:J))

Error message:

    ### !*$* minconcurrent (999999)
    ### in line 246 procedure DDX of file channel90.f90
    ### This directive does not apply to any loop and has been ignored.
    0 errors in file channel90.f90
    KAP -- Syntax Warnings Detected

To correct the error, enclose the directive and the array assignments with !*$* beginblock and !*$* endblock.
You begin KAP directives with either !*$* or C*$*, depending on whether the source file is free or fixed format. In free-format source files, precede directives with !*$*. Use C*$* in fixed-format source files. When the kf90 driver sees a fixed-format directive, for example C*$* minconcurrent (999999), inside a free-format source file such as channel90.f90, KAP issues an error message, as follows:

    kf90 channel90.f90
    ### C*$* minconcurrent (999999)
    ### in line 246 procedure DDX of file channel90.f90
    ### Statement unrecognizable as any known statement type.

To correct the error, precede the minconcurrent directive with !*$*.
Table 5-1 lists KAP directives using free-format syntax that begins each directive with !*$*. Use C*$* in fixed-format source files.
Directive | Section | Range of Values |
---|---|---|
General Optimization | | |
!*$* arclimit(n) | 5.3.1 | (0-5000) |
!*$* beginblock | 5.3.2 | <directive block> |
!*$* endblock | | |
!*$* each_invariant_if_growth(n) | 5.3.3 | (0-5000) |
!*$* limit(n) | 5.3.4 | (=> 0) |
!*$* max_invariant_if_growth(n) | 5.3.5 | (0-50000) |
!*$* optimize(n) | 5.3.6 | (0-5) |
!*$* roundoff(n) | 5.3.7 | (0-3) |
!*$* scalar optimize(n) | 5.3.8 | (0-3) |
!*$* unroll(n1[,n2]) | 5.3.9 | (=> 0 [, => 0] ) |
Parallel Processing Directives to Guide Automatic Parallelization | | |
!*$* [no]concurrentize | 5.4.1 | on/off |
!*$* minconcurrent | 5.4.2 | (>0) |
Inlining and IPA | | |
!*$* [no]inline [scope] [(name)] | 5.5.1 | on/off |
!*$* [no]ipa [scope] [(name)] | 5.5.2 | on/off |
Assertions | | |
!*$* [no]assertions | 5.6.1 | on/off |
Memory Management | | |
!*$* padding (names) (var-list) | 5.7.1 | |
!*$* storage order (names) (var-list) | 5.7.2 | |
The following sections explain the general optimization directives. The name of the directive is followed by its range of values in parentheses.
5.3.1 !*$* arclimit (0--5000)
The arclimit value sets the maximum size of a data structure that KAP uses for data dependence analysis. A larger number means that KAP can keep more information on complex loop nests. The maximum value is 5000. The default value is 5000.
5.3.2 !*$* beginblock <directive block> !*$* endblock
Use the !*$* beginblock and !*$* endblock directives to enclose a segment of source code, called a directive block, where you can declare KAP directives. Place directives immediately following the !*$* beginblock directive. Directives in a directive block remain active until the end of the block and override directives used outside the block.
Compaq KAP Fortran/OpenMP preprocesses array-syntax statements by transforming them into Fortran 77 DO loop syntax. If you want to use KAP loop directives and assertions with array-syntax statements, you must enclose those statements in directive blocks.
For example, in the following directive block, KAP first transforms the three array assignments into Fortran DO loops. Next, KAP applies the !*$* minconcurrent (999999) directive to the loops.

    !*$* BEGINBLOCK
    !*$* MINCONCURRENT ( 999999 )
    ddx(2:I-1,1:J) = array(3:I,1:J)-array(1:I-2,1:J)
    ddx(1,1:J) = 2*(array(2,1:J)-array( 1,1:J))
    ddx(I,1:J) = 2*(array(I,1:J)-array(I-1,1:J))
    !*$* ENDBLOCK

Do not nest directive blocks.
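To picture what "transforming array assignments into DO loops" means for the ddx example, here is a hand-written sketch that places the array-syntax form next to an equivalent set of loop nests. The loops are written by hand for illustration and are not actual KAP output; the program name, the array sizes, the test data, and the index names ii and jj are assumptions.

```fortran
program array_to_loops_demo
  implicit none
  integer, parameter :: I = 6, J = 3      ! assumed sizes for illustration
  real :: array(I,J), ddx(I,J), ddx_loops(I,J)
  integer :: ii, jj

  ! Fill the input with arbitrary test data.
  do jj = 1, J
     do ii = 1, I
        array(ii,jj) = real(ii*ii + jj)
     end do
  end do

  ! Array-syntax form, as in the directive block above.
  ddx(2:I-1,1:J) = array(3:I,1:J) - array(1:I-2,1:J)
  ddx(1,1:J)     = 2*(array(2,1:J) - array(1,1:J))
  ddx(I,1:J)     = 2*(array(I,1:J) - array(I-1,1:J))

  ! Hand-written DO loop equivalent: one loop nest per assignment.
  do jj = 1, J
     do ii = 2, I-1
        ddx_loops(ii,jj) = array(ii+1,jj) - array(ii-1,jj)
     end do
  end do
  do jj = 1, J
     ddx_loops(1,jj) = 2*(array(2,jj) - array(1,jj))
  end do
  do jj = 1, J
     ddx_loops(I,jj) = 2*(array(I,jj) - array(I-1,jj))
  end do

  print *, 'identical: ', all(ddx == ddx_loops)
end program array_to_loops_demo
```

Once the assignments exist as loops, a loop directive placed at the top of the block has loops to apply to, which is why the directive block is required.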
5.3.3 !*$* each_invariant_if_growth (0--5000)
When a loop contains an IF statement whose condition does not change from one iteration to another, the same test must be repeated for every iteration. The code can often be made more efficient by floating the IF outside the loop and putting the THEN and ELSE sections into their own loops.
This gets more complicated when there is other code in the loop, because a copy of it must be included in both the THEN and ELSE loops. The !*$* each_invariant_if_growth directive allows you to limit the total additional lines of code generated through invariant-IF restructuring in each loop.
This can be controlled globally with the -each_invariant_if_growth command-line switch. The maximum amount of additional code generated in a program unit through invariant-IF floating can be limited with the -max_invariant_if_growth switch and directive (see Section 5.3.5).
This directive is in effect to the end of the routine, or until it is reset by a succeeding directive of the same type.
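The invariant-IF restructuring described in this section can be pictured with a hand-written before-and-after sketch. The "after" form is written by hand to show the idea and is not actual KAP output; all names (flag, a, b, a2, b2, n) are made up for the example.

```fortran
program invariant_if_demo
  implicit none
  integer, parameter :: n = 8
  real :: a(n), b(n), a2(n), b2(n)
  logical :: flag
  integer :: i

  flag = .true.
  a = 1.0
  a2 = 1.0
  b = 0.0
  b2 = 0.0

  ! Before: the test on flag never changes, yet it runs on every iteration.
  do i = 1, n
     b(i) = 2.0*a(i)            ! other code in the loop
     if (flag) then
        a(i) = a(i) + 1.0       ! THEN section
     else
        a(i) = a(i) - 1.0       ! ELSE section
     end if
  end do

  ! After: the IF is floated outside and each branch gets its own loop.
  ! The other code is copied into both loops, which is the code growth
  ! that the invariant_if_growth limits are meant to bound.
  if (flag) then
     do i = 1, n
        b2(i) = 2.0*a2(i)
        a2(i) = a2(i) + 1.0
     end do
  else
     do i = 1, n
        b2(i) = 2.0*a2(i)
        a2(i) = a2(i) - 1.0
     end do
  end if

  print *, 'same results: ', all(a == a2) .and. all(b == b2)
end program invariant_if_demo
```

The more code that surrounds the IF inside the loop, the more lines are duplicated, which is why a per-loop limit is useful.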
5.3.4 !*$* limit (=> 0)
The !*$* limit directive sets the maximum effort KAP will expend on optimizing a loop nest.
The value is a dial that controls how much time KAP spends optimizing loops; a larger number means that KAP tries to optimize more deeply nested loop structures. The default value is 10, and the maximum allowed value is 20000. The -limit command-line switch sets this value globally. Setting -limit=0 turns off loop optimization, allowing dusty-deck and other straight-line transformations only.
5.3.5 !*$* max_invariant_if_growth (0--50000)
When a loop contains an IF statement whose condition does not change from one iteration to another, the same test must be repeated for every iteration. The code can often be made more efficient by floating the IF outside the loop and putting the THEN and ELSE sections into their own loops.
This gets more complicated when there is other code in the loop, because a copy of it must be included in both the THEN and ELSE loops. The !*$* max_invariant_if_growth directive allows you to limit the total additional lines of code generated through invariant-IF restructuring in each program unit.
This can be controlled globally with the -max_invariant_if_growth command-line switch. The maximum amount of additional code generated in a single loop through invariant-IF floating can be limited with the -each_invariant_if_growth switch and directive.
This directive is in effect to the end of the routine, or until it is reset by a succeeding directive of the same type, for example:
    !*$*each_invariant_if_growth(<integer>)
    !*$*max_invariant_if_growth(<integer>)
    DO I = ...
      !*$*each_invariant_if_growth(<integer>)
      !*$*max_invariant_if_growth(<integer>)
      DO J = ...
        !*$*each_invariant_if_growth(<integer>)
        !*$*max_invariant_if_growth(<integer>)
        DO K = ...
          section-1
          IF ( ) THEN
            section-2
          ELSE
            section-3
          ENDIF
          section-4
        ENDDO
      ENDDO
    ENDDO

In floating the invariant-IF out of the loop nest, the constraints set by the innermost directives are honored first. If those constraints are satisfied, then the invariant-IF is floated from the inner loop. The middle pair of directives is tested and the invariant-IF will be floated from the middle loop provided that there is no violation of the restrictions established by these directives. The process of floating continues as long as the directive constraints are satisfied.
5.3.6 !*$* optimize (0--5)
The optimize directive sets the optimization level, ranging from 0 for minimum optimization to 5 for maximum optimization. You can set the optimization level globally by using the -optimize=<integer> command-line switch. The following shows the meaning of each of the different optimization levels:
Value | Meaning |
---|---|
0 | KAP does not perform loop optimizations. |
1 | KAP performs only simple analysis and loop optimizations. |
2 | DO loop interchanging techniques are applied. Lifetime analysis is performed to determine when last-value assignment of scalars is necessary. More powerful data dependence tests are used. |
3 | KAP distributes loops to optimize only a part of a loop. Special techniques are used to break data dependence cycles that otherwise prevent optimization. More loop interchanging is attempted, such as interchanging of triangular loops. Special-case data dependence tests are used. Special index sets, called wraparound variables, are recognized to uncover more opportunities for optimization. |
4 | Two versions of a loop are generated, if necessary, to break a data dependence arc. Loop interchanging around reductions is attempted. More exact data dependence tests are allowed. |
5 | Array expansion and loop fusion are enabled. |
A higher optimization level results in more optimization, along with increased compilation time. Many programs that are written to be easily optimized do not need advanced transformations; with these programs, a lower optimization level will suffice.
5.3.7 !*$* roundoff (0--3)
The roundoff directive allows you to specify the amount of difference in roundoff error that is acceptable. Certain reductions are sensitive to the algorithms used to compute them. In particular, if an arithmetic reduction is accumulated in a different order than in the scalar program, the roundoff error accumulates differently and the final result may differ from the original program's output. While the difference is usually insignificant, some restructuring transformations performed by KAP must be disabled to get precisely the same results as the scalar program.
KAP classifies its transformations by the amount of difference in roundoff error that can accumulate. You can decide what level of roundoff error differences is allowable. The roundoff directive value ranges from 0 to 3.
The meaning of each roundoff level is as follows:
Value | Meaning |
---|---|
0 | Allow no roundoff-changing transformations. |
1 | Enable expression simplification and code floating. Allow loop interchanging around serial arithmetic reductions. Allow loop rerolling, if -scalaropt > 1. |
2 | Enable reciprocal substitution. |
3 | Enable recognition of REAL induction variables. Enable memory management, if -scalaropt=3 . INTEGER division rotation is allowed. |
The -roundoff command-line switch acts like a global !*$* roundoff directive.
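The sensitivity of reductions to accumulation order can be seen in a small experiment. The sketch below sums the same single-precision values twice: once strictly left to right, and once with four partial sums that are combined at the end, which is the kind of reordering an unrolled or parallel reduction might introduce. It is hand-written for illustration and is not KAP output; the data and all names are assumptions.

```fortran
program roundoff_demo
  implicit none
  integer, parameter :: n = 100000        ! a multiple of 4, by construction
  real :: x(n), s_serial, s_reordered, p(4)
  integer :: i

  do i = 1, n
     x(i) = 1.0 / real(i)
  end do

  ! Original order: a single accumulator, strictly left to right.
  s_serial = 0.0
  do i = 1, n
     s_serial = s_serial + x(i)
  end do

  ! Reordered: four interleaved partial sums, combined at the end.
  p = 0.0
  do i = 1, n, 4
     p(1) = p(1) + x(i)
     p(2) = p(2) + x(i+1)
     p(3) = p(3) + x(i+2)
     p(4) = p(4) + x(i+3)
  end do
  s_reordered = (p(1) + p(2)) + (p(3) + p(4))

  print *, 'serial order: ', s_serial
  print *, 'reordered:    ', s_reordered
  print *, 'difference:   ', s_serial - s_reordered
end program roundoff_demo
```

Both sums are mathematically equal, so any nonzero difference printed is purely a roundoff effect. Whether such a difference matters is the judgment that the roundoff level encodes.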
5.3.8 !*$* scalar optimize (0--3)
The !*$* scalar optimize directive sets the level of dusty-deck and other serial transformations performed. Unlike the -scalaropt command-line switch, the !*$* scalar optimize directive sets the level of loop-based optimizations (for example, loop fusion) only, and not straight-code optimizations (for example, dead-code elimination).
The meaning of each scalar optimize level is as follows:
Value | Meaning |
---|---|
0 | No transformations are performed. |
1 | IF loops are changed into DO loops. Simple code floating out of loops is performed. Forward substitution of variables is performed. |
2 | The full set of loop-based serial transformations is enabled. These include induction variable recognition, loop rerolling, loop unrolling, loop fusion, and array expansion. |
3 | Memory management is enabled, if -roundoff=3 . |
5.3.9 !*$* unroll (=> 0 [, => 0])
The !*$* unroll directive tells KAP how to unroll (that is, replicate the body of) innermost loops. Outer loop unrolling is part of memory management.
KAP unrolls a loop according to a formula that counts the number of array references and arithmetic operations in the loop. Unrolling continues until that count reaches the <weight> parameter or the number of unrolled iterations reaches the <#it> parameter. The -unroll and -unroll2 command-line switches act like a global !*$* unroll directive.
The -scalaropt level must be set at 2 or higher for this directive to be enabled.
The <#it> parameter is the maximum number of iterations to unroll. A value of 0 tells KAP to use default values, and a value of 1 means no unrolling. The <weight> parameter is the maximum weight of an unrolled loop, estimated by counting operands and operators in the loop.
A scalar loop is unrolled until one of the limits is reached. See Chapter 4 and Chapter 8 for detailed examples.
The !*$* unroll directive is valid only for the loop before which it appears.
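As an illustration of what unrolling does, the sketch below shows a loop preceded by an unroll directive and, next to it, a hand-unrolled equivalent with four copies of the body and a cleanup loop. The argument 4 is an assumed value for the maximum number of unrolled iterations, and the hand-unrolled form is written for illustration; the code KAP actually generates may differ.

```fortran
program unroll_demo
  implicit none
  integer, parameter :: n = 10
  real :: a(n), a_unrolled(n), b(n), c(n)
  integer :: i, m

  b = 1.0
  c = 2.0

  ! Rolled loop; the directive immediately precedes the loop it applies to.
!*$* unroll(4)
  do i = 1, n
     a(i) = b(i) + c(i)
  end do

  ! Hand-unrolled equivalent: four copies of the body per trip ...
  m = n - mod(n, 4)
  do i = 1, m, 4
     a_unrolled(i)   = b(i)   + c(i)
     a_unrolled(i+1) = b(i+1) + c(i+1)
     a_unrolled(i+2) = b(i+2) + c(i+2)
     a_unrolled(i+3) = b(i+3) + c(i+3)
  end do
  ! ... plus a cleanup loop for the iterations left over.
  do i = m+1, n
     a_unrolled(i) = b(i) + c(i)
  end do

  print *, 'identical: ', all(a == a_unrolled)
end program unroll_demo
```

Each extra copy of the body adds operands and operators to the loop, which is what the weight limit bounds.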
5.4 Parallel Processing Directives for Automatic Parallelization
The following sections explain directives available in KAP that affect KAP's automatic detection of candidate loops for parallelization.
5.4.1 !*$* [no]concurrentize
The !*$* concurrentize directive enables parallel execution of loops. The !*$* noconcurrentize directive disables parallel execution until the next !*$* concurrentize directive or until the beginning of the next program unit.
The !*$* [no]concurrentize directive overrides any -[no]concurrentize command-line switch. For example, if you specify the !*$* noconcurrentize directive, KAP disables parallel execution regardless of any -concurrentize command-line switch.
Two-version loops requiring conditional parallel execution may run more slowly than their scalar originals due to the evaluation of the condition. In these cases, you may prefer using either the -noconcurrentize command-line switch, if the program contains predominantly short loops, or the !*$* noconcurrentize directive for specific loops.
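A minimal sketch of the per-loop approach follows, assuming a program with one short loop and one long loop. The program and all variable names are made up for illustration. The directives are comments to a standard compiler, so the example compiles with or without KAP.

```fortran
program noconcurrentize_demo
  implicit none
  integer, parameter :: n = 100000
  real :: a(n), b(n), edge(4)
  integer :: i, nshort

  b = 1.0
  nshort = 4

  ! Short loop: keep it serial so that it is not made conditionally parallel.
!*$* noconcurrentize
  do i = 1, nshort
     edge(i) = real(i)
  end do

  ! Re-enable automatic parallelization for the loops that follow.
!*$* concurrentize
  do i = 1, n
     a(i) = 2.0*b(i) + edge(1)
  end do

  print *, a(1), a(n)
end program noconcurrentize_demo
```

Without the second directive, parallel execution would stay disabled until the beginning of the next program unit.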
5.4.2 !*$* minconcurrent (0--999999)
The minconcurrent directive sets the parallel execution threshold for KAP. Each parallelizable DO loop with bounds unknown at compilation time becomes a two-version loop. At run time KAP decides whether to execute the loop in parallel or in serial mode. The higher the minconcurrent value, the more iterations and/or statements the loop nest must have in order to be run in parallel.
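The shape of a two-version loop can be sketched by hand as follows. This is an illustration under stated assumptions, not KAP output: the threshold constant and the simple trip-count test are hypothetical stand-ins for the work estimate that KAP derives from the minconcurrent value, and the OpenMP directive is only one plausible way to express the parallel version.

```fortran
program two_version_demo
  implicit none
  integer, parameter :: threshold = 1000   ! hypothetical stand-in for the run-time test
  integer :: n, i
  real, allocatable :: a(:), b(:)

  n = 5000                                 ! imagine this is known only at run time
  allocate(a(n), b(n))
  b = 1.0

  if (n > threshold) then
     ! Parallel version: chosen when there is enough work to pay for the
     ! parallel overhead. The OpenMP line is an ordinary comment unless
     ! OpenMP is enabled in the compiler.
!$omp parallel do
     do i = 1, n
        a(i) = 2.0*b(i)
     end do
!$omp end parallel do
  else
     ! Serial version: chosen for short trip counts.
     do i = 1, n
        a(i) = 2.0*b(i)
     end do
  end if

  print *, a(1), a(n)
end program two_version_demo
```

Raising the threshold sends more executions down the serial branch, which mirrors the effect of a higher minconcurrent value.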
5.5 Inlining and IPA
The following sections explain the inlining and IPA directives.
5.5.1 !*$* [no]inline [here|routine|global] [(name [,name...])]
See Section 5.5.2.
5.5.2 !*$* [no]ipa [here|routine|global] [(name [,name...])]
The !*$* inline and !*$* ipa directives allow you to manually select which call sites of which routines are to be inlined or analyzed, respectively. The NO forms select CALLs and function references that are not to be inlined/analyzed, regardless of any -inline or -ipa command-line switch.
These directives are ignored by default. They are enabled when you specify any inlining or IPA command-line switch, respectively. The -inline_manual and -ipa_manual command-line switches are provided to enable these directives without activating the automatic inlining and analysis algorithms.
The optional scope parameter sets how much of the program the directive applies to. HERE means the next statement only, ROUTINE means the rest of the program unit, and GLOBAL means the entire source file. If none of these is given, the directive applies only to the next statement.
The optional names are names of the subroutines and functions to which the directive applies. If no list is provided, the directive applies to all subroutine CALLs and function references within the scope of the directive.
See Chapter 7 for more information.
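The scope and name options might be combined as in the following sketch, which follows the syntax given in the section heading. The function name twice and the program are made up for illustration, and whether KAP would actually inline this particular call is not something the sketch can confirm. As noted above, the directives take effect only when an inlining switch such as -inline_manual is given.

```fortran
program inline_demo
  implicit none
  real, external :: twice
  real :: x, y, z

  x = 3.0

  ! Request inlining of calls to twice for the rest of this program unit.
!*$* inline routine (twice)
  y = twice(x)

  ! Exempt only the next statement from inlining.
!*$* noinline here (twice)
  z = twice(y)

  print *, y, z
end program inline_demo

! Hypothetical routine used as the inlining candidate.
real function twice(v)
  implicit none
  real, intent(in) :: v
  twice = 2.0*v
end function twice
```

The !*$* ipa directive accepts the same scope and name options, so the same pattern applies to interprocedural analysis.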
5.6 Assertions Directive
The following section explains the assertions directive.
5.6.1 !*$* [no]assertions
The !*$* assertions directive tells KAP to accept assertions. The !*$* no assertions directive tells KAP to ignore assertions. The !*$* no assertions directive disables assertions until the next !*$* assertions directive, or the end of the program unit.
Individual assertions are explained in Chapter 6.
5.7 Memory Management Directives
The following sections explain the memory management directives. These are output directives that KAP uses to pass information on data layout to the compiler or to KAP itself, if the program is processed iteratively. If a program is processed by KAP multiple times, KAP will use the information in the directives it inserted in previous runs in its cache usage optimizations.
If the <var-list> is too long for a single line, it can be continued by putting [c]*$*& starting in column 1 of the continuation line.
Few users will need to insert or modify these directives by hand.
5.7.1 !*$* padding (var-list)
The padding directive identifies the listed arrays and scalar variables as objects that KAP created for the purpose of data alignment. (See the -aggressive command-line switch, Section 4.7.1.) This directive is for KAP to use when a program is being reprocessed; it will be ignored by the compiler.
The following rules govern the !*$* padding directive:
In the following example, the !*$* padding directive identifies arrays that KAP created to keep the arrays P, PI, PF, K, and Q from causing cache collisions:
    REAL FUNCTION EBREMS (ENRES)
    !*$* padding ( DD4, DD3, DD2, DD1 )
    ...
    DOUBLE PRECISION DD1 (256), DD2 (251), DD3 (251), DD4 (251)
    ...
    COMMON /KINEM/PI, DD1, PF, DD2, P, DD3, K, DD4, Q

5.7.2 !*$* storage order (var-list)
The storage order directive specifies the relative order in which storage should be allocated for the listed routine-local variables and arrays. By appropriately positioning the arrays, cache collisions can be reduced. If the compiler does not interpret the storage order directive, a loss of performance results, but the program will nonetheless generate the correct results.
The rules governing the use of the !*$* storage order directive are the following:
To interpret a !*$* storage order directive, the compiler must place the named objects in memory in the order listed. This is the same order as they would be placed in a COMMON block. Thus, on a machine with 4 bytes per REAL variable:
    !*$* storage order (A1,A2,A3)
    REAL A1(100), A2(3), A3(200)

    A1 would be placed at some address (for example, address X)
    A2 would be placed at X+100*4
    A3 would be placed at X+100*4+3*4

Both static and stack-based storage schemes are allowed, as long as all of the objects in a single storage order directive are placed in the same scheme.
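The address arithmetic in the storage order example above can be checked with a few lines of code. The sketch only reproduces the calculation from the text, assuming 4 bytes per REAL; it does not query the real layout chosen by any compiler, and the program and variable names are made up.

```fortran
program storage_order_offsets
  implicit none
  integer, parameter :: bytes_per_real = 4   ! assumption stated in the text
  integer :: off_a1, off_a2, off_a3

  off_a1 = 0                                 ! A1 starts at address X
  off_a2 = off_a1 + 100*bytes_per_real       ! A2 follows A1(100)
  off_a3 = off_a2 + 3*bytes_per_real         ! A3 follows A2(3)

  print *, 'A1 at X +', off_a1
  print *, 'A2 at X +', off_a2
  print *, 'A3 at X +', off_a3
end program storage_order_offsets
```

With these sizes, A2 lands at X+400 and A3 at X+412, matching the layout the same variables would have in a COMMON block.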