Compaq KAP C/OpenMP
for Tru64 UNIX
User Guide


Previous Contents Index


Appendix A
Data-Dependence Analysis

This appendix describes data-dependence analysis.

A.1 Data-Dependence Definitions

This section provides a brief explanation of data dependence --- the criterion that KAP uses to determine if a given loop can be optimized. KAP determines dependences between variables and arrays in loop iterations and bases decisions on this information automatically, informing you in the listing file only of those dependences that prevent optimization.

The study of data-dependence analysis was originally motivated by parallel (vector and concurrent) computation. The techniques and notations are applicable to advanced serial transformations for uniprocessor computers.

KAP uses a data-dependence graph that shows where data values are generated and where they are used within a loop or loop nest. The data-dependence graph is processed with standard graph traversal techniques to find potential problem areas, which appear as cycles in the graph. Each data-dependence cycle is carefully examined to see if it can be broken and the loop executed in parallel, or if part or all of the loop must be executed serially.

For an in-depth study of data dependence, consult the following:

A.2 Varieties of Data Dependence

The following describes three kinds of data dependence. The notation S1, S2, and SN is used to denote statement 1, statement 2, and statement N, respectively. In each example, S2 is dependent on S1 , due to variable x .

  1. Data dependence from an assignment to a use of a variable is called flow dependence or true dependence.


    (S1): x = 3; 
    (S2): y = x; 
    

  2. Data dependence from use of a variable to a later reassignment of that variable is called antidependence.


    (S1): y = x; 
    (S2): x = 3; 
    

  3. Data dependence from an assignment of a variable to a later reassignment of that variable is called output dependence.


    (S1): x = 3; 
          : 
    (SN): x = 4; 
    

A.3 Input and Output Sets

To determine data dependence, it is necessary to form the input and output sets for the given statements.

IN(S1) denotes the set of input items of S1 (items whose values may be read by S1 ). OUT(S1) denotes the set of output items (scalar variables or array elements) of statement S1 (items whose values may be changed by S1 ).

The following example shows the IN and OUT sets for the assignment statement in the loop:


for (i=1; i<=10; i++) 
  S1:  x[i] = a[i+1] * b; 
 
  IN[S1] = {a[2], a[3], a[4], ..., a[11], b} 
 
  OUT[S1] = {x[1], x[2], x[3], ..., x[10]} 

In practice, KAP often approximates the IN and OUT sets because the actual loop bounds are frequently unknown at compile time.

A.4 Data-Dependence Relations

For any two statements, S1 and S2 , one of the three types of data-dependence relations may be true, or the statements may be data independent.

If some item X is in OUT(S1) and X is in IN(S2) and S2 is to use the value of X computed in S1 , then S2 is flow dependent on S1 , as in example 1 in Section A.2.

If some item X is in IN(S1) and X is in OUT(S2) , but S1 is to use the value of X before it is changed by S2 , then S2 is antidependent on S1 , as in example 2 in Section A.2.

If some item X is in OUT(S1) and X is in OUT(S2) and the value computed by S2 is to be stored after the value computed by S1 is stored, S2 is output dependent on S1 , as in example 3 in Section A.2.

Antidependence and output dependence relations are sometimes inadvertently caused by programmers' coding practices. These dependences can often be removed by more careful coding.

A.5 Data-Dependence Direction Vectors

The direction vector is defined as a sequence of direction vector elements (one element for each loop enclosing both statements involved in the dependence arc ). The following symbols are direction vector elements:

<, =, >, <=, =>, /, *

Consider the following loop:


Loops        Line 
+---------   10      for ( i=1; i<n ; i++) { 
|            11       a[i] = b[i] + c[i]; 
|            12       c[i] = a[i-1] - 1; 
|_________   13       } 

If line 12 in iteration I" depends on line 11 in iteration I' , the element of the direction vector for loop I is as follows:
Direction
vector
element
When
< I' must be < I"
= I' must be = I"
> I' must be > I"
<= I' must be < or = I"
=> I' must be > or = I"
/ I' must not = I"
* no relation between I' and I" can be proven

In the previous example, the dependence for variable a has a direction vector of < , because the dependence flows from iteration I' to iteration I'+1 and I' < I'+1 . For example, the dependence on a[1] flows from iteration 1 to iteration 2, and 1 < 2. The data dependence for the variable c has a direction vector of = because the dependence stays in the same iteration of the loop (from iteration I' to iteration I' ).

A.6 Loop-Carried Dependence

A dependence is said to be carried by a loop if the corresponding direction vector element for that loop has a directional component (<, >, /, or *). Loop-carried dependence is an important concept for discovering when the iterations of the loop can be executed concurrently. If there are no loop-carried dependences, all iterations of that loop can be executed in parallel without synchronization.

A.7 Data-Dependence Example


for (i=1; i<=n; i++)  { 
    a[i] = b[i] + 2; 
    c[i] = a[i+1] + d[i]; 
 } 

This loop cannot be vectorized or concurrentized directly. An antidependence on a exists from the second assignment statement to the first. If this loop were directly concurrentized, some executions of the first statement could precede those of the second, and the antidependence would be violated. The values a[2] through a[n] would be incorrect.

If the two statements were interchanged, the loop could be vectorized directly, or distributed with each new loop concurrentized.


Index Contents