Use Parallel Computing Forum (PCF) directives only with loops that are safe to parallelize. When Compaq KAP sees loops prefaced with PCF directives, it does not perform data dependence analysis and does not prevent you from using a parallel directive incorrectly.
Observe the following rules:
The PARALLEL REGION and END PARALLEL REGION directives delineate where parallelism exists in the program. The following example shows the PARALLEL REGION directive syntax:
!*kap* PARALLEL REGION
!*kap*& [ IF(logical expression) ]
!*kap*& [ SHARED(shared_name,...) ]
!*kap*& [ LOCAL(local_name,...) ]
!*kap* END PARALLEL REGION
In the syntax example, local_name and shared_name are references to a variable or an array. If the IF clause logical expression evaluates to .FALSE., all of the code between PARALLEL REGION and END PARALLEL REGION executes on a single processor. If the logical expression evaluates to .TRUE., the code between the PARALLEL REGION and the corresponding END PARALLEL REGION may execute on multiple processors.
The SHARED and LOCAL lists on the PARALLEL REGION directive state the
explicit forms of data sharing among the processors that execute the
code inside the parallel region. When distinct processors reference the
same variable or array from the SHARED list, the processors reference
the same storage location. When distinct processors reference the same
variable or array from the LOCAL list, the processors reference
distinct storage locations.
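As a minimal sketch of the three clauses used together, the following subroutine scales an array inside a parallel region. The subroutine name scalev and the threshold of 1000 in the IF clause are hypothetical choices for illustration, not values taken from KAP. Because the directive lines are ordinary comments to a compiler that does not process them, the code also compiles and runs serially.

```fortran
      subroutine scalev(a, n, s)
!     Hypothetical example: multiply each element of a(1:n) by s.
      integer n, i
      real a(n), s
!     Assumed tuning choice: the region may run on multiple
!     processors only when the loop has at least 1000 iterations.
!*kap* PARALLEL REGION
!*kap*& IF(n .GE. 1000)
!*kap*& SHARED(a, n, s)
!*kap*& LOCAL(i)
!*kap* PARALLEL DO
      do i = 1, n
         a(i) = a(i) * s
      end do
!*kap* END PARALLEL REGION
      end
```

Here a, n, and s refer to the same storage on every processor, while each processor works with its own copy of the loop index i.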
D.2 PARALLEL DO Directive
The PARALLEL DO directive tells KAP that the next statement begins an iterative DO loop that can be executed using multiple processors. Each processor applied to the DO loop can execute one or more iterations.
The following syntax example shows the PARALLEL DO directive inside a PARALLEL REGION:
!*kap* PARALLEL REGION
!*kap*& [ IF(logical expression) ]
!*kap*& [ SHARED(shared_name,...) ]
!*kap*& [ LOCAL(local_name,...) ]
!*kap* PARALLEL DO
!*kap*& [ STATIC ]
!*kap*& [ LAST LOCAL(local_name,...) ]
!*kap*& [ BLOCKED [ (integer constant expression) ] ]
!*kap* END PARALLEL REGION
In the syntax example, LAST LOCAL(local_name) declares local_name as a local variable for the duration of the PARALLEL DO loop. During the last PARALLEL DO loop iteration, the final value of local_name is copied into the identically named variable declared by SHARED(shared_name) in the enclosing PARALLEL REGION. For example, the final value of the variable named in LAST LOCAL(x) is copied into the variable named in SHARED(x), as follows:
!*kap* PARALLEL REGION
!*kap*& SHARED(x)
!*kap* PARALLEL DO
!*kap*& LAST LOCAL(x)
      DO 10 . . .
   10 CONTINUE
!*kap* END PARALLEL REGION
Because a LAST LOCAL() variable inside the PARALLEL DO implies a SHARED() variable of the same name inside the PARALLEL REGION, it is legal to name the variable in both clauses.
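The copy-out behavior described above can be sketched with a complete function. The name lastsq and the loop body are hypothetical. The function relies only on the stated rule that the value of x from the last iteration reaches the shared x.

```fortran
      function lastsq(n)
!     Hypothetical example: returns the value x holds after the
!     final loop iteration, that is, n squared (0 when n < 1).
      integer lastsq, n, i, x
      x = 0
!*kap* PARALLEL REGION
!*kap*& SHARED(x, n) LOCAL(i)
!*kap* PARALLEL DO
!*kap*& LAST LOCAL(x)
      do i = 1, n
!        Each processor writes its own local copy of x.
         x = i * i
      end do
!*kap* END PARALLEL REGION
!     The shared x now holds the value from iteration i = n.
      lastsq = x
      end
```

Without LAST LOCAL, listing x only in SHARED would let several processors write the same storage location in an unpredictable order.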
If BLOCKED(integer constant expression) is specified in a PARALLEL DO, loop iterations are assigned to run-time "workers" in blocks of that size. A "worker" is a logical processor, that is, a processor, a process, or a thread.
If BLOCKED is omitted in a PARALLEL DO directive, the default is even scheduling, where loop iterations are evenly divided among run-time "workers." If BLOCKED is specified without a number, the default block size is 1.
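A short sketch of a loop scheduled in blocks follows. The subroutine name addvec and the block size of 4 are hypothetical. With BLOCKED(4), iterations would be handed out four at a time, for instance 1 through 4, then 5 through 8. Which worker receives which block is not stated here and is not assumed by the code.

```fortran
      subroutine addvec(a, b, c, n)
!     Hypothetical example: a(i) = b(i) + c(i) for i = 1..n.
      integer n, i
      real a(n), b(n), c(n)
!*kap* PARALLEL REGION
!*kap*& SHARED(a, b, c, n) LOCAL(i)
!*kap* PARALLEL DO
!*kap*& BLOCKED(4)
      do i = 1, n
!        Iterations are independent, so any block-to-worker
!        assignment produces the same result.
         a(i) = b(i) + c(i)
      end do
!*kap* END PARALLEL REGION
      end
```

Removing the BLOCKED line selects the default even scheduling, and writing BLOCKED with no number selects a block size of 1.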
D.3 DO Loop Example with PCF Directives
The following example shows the use of the PARALLEL REGION and the PARALLEL DO directives in a simple loop:
!*kap* PARALLEL REGION
!*kap*& SHARED(A,B,C) LOCAL(I)
!*kap* PARALLEL DO
      do 10 i=1,n
        a(i) = b(i) * c(i)
   10 continue
!*kap* END PARALLEL REGION
The following program example shows the use of the PARALLEL REGION and the PARALLEL DO directives:
      PROGRAM ATIMESB
      PARAMETER M=512, N=512, P=512
      REAL time1,time2
      REAL*8 A,B,C
      DIMENSION A(1:M,1:N), B(1:N,1:P), C(1:M,1:P)
C Initialize the matrices
!*kap* PARALLEL REGION SHARED (A,B) LOCAL (J,I)
!*kap* PARALLEL DO
      DO 10 J=1,N
      DO 10 I=1,M
        A(I,J) = 1.5
   10 CONTINUE
!*kap* PARALLEL DO
      DO 20 J=1,P
      DO 20 I=1,N
        B(I,J) = 3.0
   20 CONTINUE
!*kap* END PARALLEL REGION
C Compute C = A * B
      CALL CSETTIME()
      time1 = CTIMEC()
      CALL MATMUL(A, M, B, N, C, P)
      time2 = CTIMEC()
C Elapsed time is the later reading minus the earlier one
      write(*,*)'elapsed time in seconds is:',(time2-time1)
      END
      SUBROUTINE MATMUL(A, LDA, B, LDB, C, LL)
C Arguments are REAL*8 to match the arrays passed by the caller
      REAL*8 A(LDA,LDB), B(LDB,LL), C(LDA,LL)
      INTEGER LDA,LDB,LL
!*kap* PARALLEL REGION SHARED (A,LDA,B,LDB,C,LL) LOCAL (J,K,I)
!*kap* PARALLEL DO
      DO 20 J=1,LL
      DO 20 I=1,LDA
        C(I,J) = 0.0
      DO 20 K=1,LDB
        C(I,J) = C(I,J) + ( A(I,K) * B(K,J) )
   20 CONTINUE
!*kap* END PARALLEL REGION
      RETURN
      END
The CRITICAL SECTION and END CRITICAL SECTION directives define the
scope of a critical section. Exactly one logical processor at a time is
allowed inside a CRITICAL SECTION. This construct must be coded
lexically inside a PARALLEL REGION and END PARALLEL REGION.
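As a sketch of how a critical section can protect a shared update, the following function accumulates a sum into a shared variable. The name total is hypothetical, and placing the critical section inside the body of a PARALLEL DO is an assumption. The text above requires only that it sit lexically inside the parallel region.

```fortran
      function total(a, n)
!     Hypothetical example: returns the sum of a(1:n).
      integer n, i
      real total, a(n), s
      s = 0.0
!*kap* PARALLEL REGION
!*kap*& SHARED(a, n, s) LOCAL(i)
!*kap* PARALLEL DO
      do i = 1, n
!        Only one logical processor at a time updates s.
!*kap* CRITICAL SECTION
         s = s + a(i)
!*kap* END CRITICAL SECTION
      end do
!*kap* END PARALLEL REGION
      total = s
      end
```

Guarding every iteration serializes the loop, so this form illustrates the construct rather than an efficient reduction.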
D.6 ONE PROCESSOR SECTION Directive
The ONE PROCESSOR SECTION and END ONE PROCESSOR SECTION directives
define the scope of a section of code where exactly one processor is
allowed to execute the code. This directive must be coded lexically
inside a PARALLEL REGION and END PARALLEL REGION.
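The following sketch places a ONE PROCESSOR SECTION ahead of a parallel loop. The subroutine name fill and the counter ncalls are hypothetical. The loop deliberately does not depend on the counter, so the example makes no assumption about synchronization at the end of the section.

```fortran
      subroutine fill(a, n, ncalls)
!     Hypothetical example: zero a(1:n) and count calls in ncalls.
      integer n, ncalls, i
      real a(n)
!*kap* PARALLEL REGION
!*kap*& SHARED(a, n, ncalls) LOCAL(i)
!     Exactly one processor increments the shared counter.
!*kap* ONE PROCESSOR SECTION
      ncalls = ncalls + 1
!*kap* END ONE PROCESSOR SECTION
!*kap* PARALLEL DO
      do i = 1, n
         a(i) = 0.0
      end do
!*kap* END PARALLEL REGION
      end
```

The counter increases by one per call regardless of how many processors execute the region.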
D.7 Comparison of KAP PCF and Cray Autotasking Directives
If you formerly used Cray autotasking to perform parallel decomposition, you can substitute KAP PCF directives, as shown in Table D-1.
KAP Parallel Computing Forum | Cray Autotasking |
---|---|
Specifying Regions of Parallel Execution | |
!*kap* PARALLEL REGION | CMIC$ PARALLEL |
!*kap* END PARALLEL REGION | CMIC$ END PARALLEL |
Specifying Parallel Loops | |
!*kap* PARALLEL DO | CMIC$ DO PARALLEL |
End defined by loop scope | CMIC$ END DO |
Specifying Synchronized Code Sections | |
!*kap* CRITICAL SECTION | CMIC$ GUARD |
!*kap* END CRITICAL SECTION | CMIC$ END GUARD |
!*kap* ONE PROCESSOR SECTION | |
!*kap* END ONE PROCESSOR SECTION | |
Specifying Code Sections for Parallel Execution | |
Equivalent coded with PARALLEL DO | CMIC$ END CASE |
Controlling Subroutines Called Within Parallel Regions | |
!*$* ASSERT CONCURRENT CALL | CMIC$ CONTINUE |
Unstructured Exits from Parallel Region | |
Not available currently | CMIC$ SOFT EXIT |
Equivalent coded with PARALLEL REGION with one loop optimization performed by KAP | CMIC$ DO ALL (end defined by loop) |