Previous | Contents | Index |
This appendix provides a brief explanation of data dependence.
B.1 Data Dependence Definitions
Data dependence is the criterion that KAP uses to determine if a given loop can be optimized. KAP tries to ensure that the optimized program produces the same results as the scalar input. For some loops, certain optimizations can result in incorrect answers; in these cases KAP detects and informs you of the dependencies that can interfere with KAP transformations.
Data dependence analysis was originally motivated by vector and parallel computation. The basic tools and notations apply to advanced serial optimizations on uniprocessor computers.
KAP determines data dependence and bases decisions on such information automatically, informing you only of those dependencies preventing optimization.
KAP uses a data dependence graph that shows where data values are generated and where they are used within a DO loop or DO loop nest. The data dependence graph is processed (with standard graph traversal techniques) to find potential problem areas, which appear as cycles in the graph. Each data dependence cycle is carefully examined to see if it can be broken and the loop transformed, or if part or all of the loop must be executed in its original form.
For an in-depth study of data dependence, consult the following:
The three kinds of data dependence are described in the following way. The notation S1 , S2 , and SN is used to denote statement 1, statement 2, and statement N , respectively. In each example, S2 is dependent on S1 , due to variable X .
1. Data dependence from an assignment to a use of a variable is called flow dependence (or true dependence):
(S1): X = 3 (S2): Y = X |
2. Data dependence from use of a variable to a later reassignment on that variable is called antidependence:
(S1): Y = X (S2): X = 3 |
3. Data dependence from an assignment of a variable to a later reassignment of that variable is called output dependence:
(S1): X = 3 : (SN): X = 4 |
To determine data dependence, it is necessary to form the input and output sets for the given statements.
IN(S1) denotes the set of input items of S1 (items whose values may be read by S1 ). OUT(S1) denotes the set of output items (scalar variables or array elements) of statement S1 (items whose values may be changed by S1 ).
The IN and OUT sets for the assignment statement in the loop are shown in the following example:
DO 10 I = 1,10 S1: X(I) = A(I+1) * B 10 CONTINUE IN(S1) = {A(2), A(3), A(4), ..., A(11), B} OUT(S1) = {X(1), X(2), X(3), ..., X(10)} |
In practice, KAP often approximates the
IN
and
OUT
sets because the actual loop bounds are frequently unknown at compile
time.
B.4 Data Dependence Relations
For any two statements, S1 and S2 , one of the three types of data dependence relations may be true, or the statements may be data independent.
If some item X is in OUT(S1) and X is in IN(S2) and S2 is to use the value of X computed in S1, S2 is flow dependent on S1 , as in example 1 in Section B.2.
If some item X is in IN(S1) and X is in OUT(S2) , but S1 is to use the value of X before it is changed by S2 , S2 is antidependent on S1 , as in example 2 of Section B.2.
If some item X is in OUT(S1) and X is in OUT(S2) and the value computed by S2 is to be stored after the value computed by S1 is stored, S2 is output dependent on S1 , as in example 3 of Section B.2.
Antidependence and output dependence relations are sometimes
inadvertently caused by programmers' coding practices. These
dependencies can often be removed by more careful coding.
B.5 Data Dependence Direction Vectors
The direction vector is defined as a sequence of direction vector elements (one element for each DO loop enclosing both statements involved in the dependence arc). The following symbols are direction vector elements:
<, =, >, <=, =>, /, *
Consider the following loop:
DO Loops Line +--------- 10 DO 10 I = 1,N-1 | 11 A(I) = B(I) + C(I) | 12 C(I) = A(I-1) - 1 |_________ 13 10 CONTINUE |
If line 12 in iteration I" depends on line 11 in iteration I', the element of the direction vector for loop I is as follows:
Direction vector element |
When |
---|---|
< | I' must be < I" |
= | I' must be = I" |
> | I' must be > I" |
<= | I' must be < or = I" |
=> | I' must be > or = I" |
/ | I' must not = I" |
* | no relation between I' and I" can be proven |
In the previous example, the dependence from
A(I)
to
A(I-1)
has a direction vector of <, because the dependence flows from
iteration
I'
to iteration
I'+1
and
I' < I'+1
. The data independence from
C(I)
to
C(I)
has a direction vector of = because the dependence stays in the same
iteration of the loop (from iteration
I'
to iteration
I'
).
B.6 Loop-Carried Dependence
A dependence is said to be carried by a loop if the corresponding
direction vector element for that loop has a directional component
(<, >, /, or *). Loop-carried dependence is an important concept
for discovering when the iterations of the loop can be executed
concurrently. If there are no loop-carried dependencies, all iterations
of that loop can be executed in parallel without synchronization.
B.7 Data Dependence Examples
This loop can easily be parallelized, because there are no loop-carried dependencies:
DO 10 I = 1,N A(I) = B(I) + 2 C(I) = A(I) + D(I) 10 CONTINUE |
The next loop cannot be parallelized directly. An antidependence on A exists from the second assignment statement to the first. Addition of synchronization is needed to parallelize the loop. (That, in turn, may slow execution.)
DO 10 I = 1,N A(I) = B(I) + 2 C(I) = A(I+1) + D(I) 10 CONTINUE |
Previous | Next | Contents | Index |