Compaq KAP transforms Fortran 90 source programs so that, when compiled and linked, they execute as multithreaded processes. These threads can run simultaneously, that is, in parallel, on symmetric multiprocessor (SMP) systems. The result is a program whose start-to-finish time is less than that of a Fortran 90 program that does not execute as a multithreaded process. More specifically, at run time the instructions from DO loops in a transformed Fortran 90 program execute in parallel mode. Parallelization is the process that transforms DO loops into instructions in an executable file that execute as multithreaded processes.
DO loops occur both explicitly and implicitly. The familiar sequence
DO 10 I = . . .
10 CONTINUE
forms an explicit DO loop that Compaq KAP views as a candidate for parallelization. A statement such as A = B * C, where A, B, and C are arrays according to Fortran 90 syntax, is also viewed by Compaq KAP as a candidate for parallelization. Compaq KAP transforms this array multiplication statement, which Fortran 90 evaluates element by element rather than as a matrix product, into explicit DO loops. In general, any statement that Compaq KAP can transform into DO loops is an implicit DO loop.
Compaq KAP considers all DO loops in a program as candidates for parallelization. Each loop is or is not parallelized according to:
This chapter describes the three basic methods of controlling parallel processing (automatic, directed, and combination). It explains, for each method, how to:
Compaq KAP provides three methods for programmers to control parallel processing: automatic, directed, and combination.
3.2.1 Automatic Method
Use this method for programs that do not contain either OpenMP (!$OMP) or Parallel Computing Forum (!*KAP*) directives.
Compaq KAP automatically looks at these programs' DO loops and statements with Fortran 90 arrays. If these loops are good candidates for parallelization, then Compaq KAP transforms them so that they will be executed by multiple threads. This is the recommended method for initial experiences with parallelization, because the other methods require detailed knowledge of parallel programming concepts and implementation statements. Also, Compaq KAP sets the compiler and linker switches correctly.
Section 3.4 shows how to direct Compaq KAP to perform automatic parallelization of your program. For example, the following command line applies KAP automatic detection, selection, and transformation of loops to the Fortran source program my_prog.f:
kf90 -fkapargs='-concurrent' my_prog.f
Compaq KAP produces a transformed source program, which the compiler and linker then process to create the executable file a.out. When the -psyntax=openmp switch is set (the default), the transformed source file contains OpenMP directives for the loops that Compaq KAP has decided to parallelize. These directives are passed on to the Compaq Fortran compiler for processing.
Parallel Computing Forum (!*KAP*) directives in your Fortran source
files will cause KAP to generate warning messages about the !*KAP*
directives. The !*KAP* directives will be transformed into OpenMP
(!$OMP) directives provided that the !*KAP* directives have correct
syntax.
3.2.2 Directed Method
Use this method for programs that contain parallel directives and for which you want only the loops surrounded by parallel directives to be parallelized. These directives explicitly control where and when Compaq KAP performs parallelization inside your program. These directives can be standard OpenMP ones; they begin with !$OMP.
The other source of directives is the Compaq KAP implementation of the X3H5 standard that was produced by the Parallel Computing Forum (PCF). These directives begin with !*KAP*.
Section 3.5, Directed Parallelization Using the kf90 Driver and OpenMP Directives, shows how to use KAP to perform directed parallelization of your program. For example, the following command line applies KAP-directed detection and transformation of loops to the Fortran source program my_prog.f, which contains OpenMP directives:
kf90 -fkapargs='-noconc' my_prog.f -omp -pthread -call_shared
The results include a transformed source program and its processing by the compiler and linker to create executable file a.out.
3.2.3 Combination Method
Use this method for programs where you control, with directives, the parallelization of selected individual loops in the source program file and want KAP to perform automatic detection, transformation, and parallelization of the remaining loops. Section 3.6 shows how to direct KAP to perform combined parallelization of your program. A possible command line for Fortran source program my_prog.f is:
kf90 -fkapargs='-concurrent' my_prog.f
The results include a transformed source program and its processing by the compiler and linker to create executable file a.out. Because -psyntax=openmp is the default switch, KAP automatically parallelizes loops by inserting OpenMP directives. OpenMP directives inserted automatically by KAP and manually by the programmer are then processed by the compiler. The compiler switch -omp tells the compiler to recognize the OpenMP directives.
To view the compiler and linker switches that a kf90 command sets, include the -v switch. In the case of the automatic method, a command could be the following:
kf90 -fkapargs='-concurrent' my_prog.f -v
The following environment variables control the run-time behavior of a program parallelized with OpenMP directives (shown in csh syntax):
setenv OMP_SCHEDULE (static,dynamic,guided,runtime)
setenv OMP_DYNAMIC (true,false)      default is false.
setenv OMP_NESTED (true,false)       default is false.
setenv OMP_NUM_THREADS (number)      default value is the number of processors on the current system.
For more information on environment variables read by the Compaq Fortran compiler, see the Compaq Fortran User Manual for Tru64 UNIX and Linux Alpha Systems.
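The setenv lines above use C shell syntax. As a minimal sketch for Bourne-type shells (sh, ksh, bash), the same variables can be exported as shown below. The specific values (4 threads, a dynamic schedule with a chunk size of 4) are illustrative assumptions, not recommendations from this manual, and the type,chunk form of OMP_SCHEDULE follows the OpenMP specification rather than anything stated here.

```shell
# Illustrative values only; adjust for your system and workload.
OMP_NUM_THREADS=4          # assumed thread count
OMP_SCHEDULE="dynamic,4"   # schedule type and chunk size (OpenMP type,chunk form)
OMP_DYNAMIC=false          # do not let the run-time system adjust the thread count
OMP_NESTED=false           # keep nested parallel regions serialized
export OMP_NUM_THREADS OMP_SCHEDULE OMP_DYNAMIC OMP_NESTED

# Show the OMP_ settings that a program started from this shell will inherit.
env | grep '^OMP_'
```

Set the variables in the same shell session that starts a.out, because a child process inherits only exported variables.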
KAP provides the following parallel command-line switches, directives, and assertions for use with automatic parallel processing:
-chunk
-concurrent
-directives
-minconcurrent
-parallelio
-pdefault
-psyntax
-scheduling
!*$* [no]concurrentize
!*$* minconcurrent
!*$* assert concurrent call
!*$* assert do (concurrent)
!*$* assert do (concurrent call)
!*$* assert do (serial)
!*$* assert do prefer (concurrent)
!*$* assert do prefer (serial)
Control directives
Storage directives
Synchronization directives
Scheduling directives
Processor section directives
Parallel region directives
Parallel DO loop directives
Critical section directives
As a programmer, you should always remember that you implement a parallel processing method (automatic, directed, or combination) by making choices from the previous command-line options, directives, and assertions. Your choices affect the following actions:
For example, suppose you choose combination detection and parallelization for source program test_1.f90. This program contains some or none of the parallel processing directives, parallel processing assertions, and OpenMP directives. Consider the following command:
kf90 -fkapargs='-concurrent -minconcurrent=1000 \
-psyntax=openmp -directives=akpv' test_1.f90
This command tells Compaq KAP to:
Compaq KAP parallel processing options, such as -concurrent, are enclosed in single quotation marks and are values of the -fkapargs option. The kf90 driver responds to the options enclosed in these single quotation marks by passing them as arguments to the kapf90 preprocessor (which actually transforms the source program file).
Several switches used in this example (-psyntax, -directives, and -minconcurrent) are set to their default values, and therefore you don't ordinarily have to set them explicitly to these values.
The default values of the parallel processing options also control Compaq KAP loop detections, loop transformations, calling of the compiler and linker, and run-time scheduling. They are:
-directives=akpv
-minconcurrent=1000
-noparallelio
-pdefault=safe
-psyntax=openmp (parallel directives of type OpenMP are generated)
-scheduling=e
-chunk=1
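Because these are the defaults, a command that spells them all out should behave like one that gives -concurrent alone. The sketch below, in Bourne-shell syntax, keeps the option string in a shell variable so that a single value can be changed in one place. The variable names MINCONC and KAPARGS and the -minconcurrent value of 2000 are hypothetical, and the command is printed for inspection rather than run.

```shell
# Hypothetical override of the default -minconcurrent value of 1000.
MINCONC=2000

# The defaults listed above, with -concurrent added to request automatic parallelization.
KAPARGS="-concurrent -directives=akpv -minconcurrent=$MINCONC -noparallelio -pdefault=safe -psyntax=openmp -scheduling=e -chunk=1"

# Print the resulting kf90 command line instead of executing it.
echo "kf90 -fkapargs='$KAPARGS' test_1.f90"
```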
Read the explanations of each of the three methods of parallelization in light of how your choices of options, directives, and assertions affect Compaq KAP detection of loops, changes to loops, compiler and linker behavior, and run-time behavior of the executable file a.out.
3.4 Automatic Parallelization Using the kf90 Driver
Parallelization by means of the automatic method is most useful for large programs under the following circumstances:
For programs with loops that have not already been explicitly parallelized using OpenMP directives, Compaq recommends that you use automatic parallelization.
For example, the following command line applies KAP automatic detection, selection, and transformation of loops to the Fortran source program my_prog.f:
kf90 -fkapargs='-concurrent' my_prog.f
The results include a transformed source program (my_prog.cmp.f) containing OpenMP directives around loops that KAP has decided to parallelize, and processing of the transformed source by the compiler and linker to create the executable file a.out.
3.4.1 Changing Source Programs
Compaq KAP cannot automatically parallelize loops with data dependencies between loop iterations or loops with calls to external routines. You can help Compaq KAP automatic detection of these loops by placing parallel processing assertions and parallel processing directives (each beginning with !*$*) in the source program. These assertions and directives are:
!*$* assert concurrent call
!*$* assert do (concurrent)
!*$* assert do (concurrent call)
!*$* assert do (serial)
!*$* assert do prefer (concurrent)
!*$* assert do prefer (serial)
!*$* [no]concurrentize
!*$* minconcurrent
Command-line switches you can give to Compaq KAP that affect its transformation of DO loops under the automatic method are:
To construct a program for parallel execution under the automatic method, you normally need to give only the -concurrent switch to the kf90 command, which invokes the kf90 driver software, as follows:
kf90 -fkapargs='-concurrent' my_prog.f
The -concurrent (often abbreviated to -conc) switch tells KAP to automatically restructure the DO loops for parallel processing. The -conc switch also sets the compiler and linker switches correctly. DO loops are parallelized by insertion of OpenMP directives because -psyntax=openmp is the default. The Fortran 90 compiler will in turn process these directives and implement the parallelization.
Finally, you may want to create a completely non-parallelized program so you can compare its execution time with the times of programs that are parallelized in various ways (such as the automatic method and the directed method). The following command does this:
kf90 -fkapargs='-noconc -directives=ak' -noomp myprog.f90
The -noconc switch prevents automatic parallelization of DO loops, and the absence of p from the string following -directives= prevents Compaq KAP from responding to any parallel directive statements. The -noomp switch prevents the Fortran compiler from responding to any OpenMP parallel directive statements in the transformed source file it receives.
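The comparison described above can be scripted. The following is a minimal sketch in Bourne-shell syntax, assuming that kf90 is on the PATH, that myprog.f90 is in the current directory, and that the program needs no input. The names serial.out and parallel.out are hypothetical. The script renames a.out after each build so that both executables can be timed.

```shell
src=myprog.f90
built=no

if command -v kf90 >/dev/null 2>&1 && [ -f "$src" ]; then
    # Serial baseline: no automatic parallelization, parallel directives ignored.
    if kf90 -fkapargs='-noconc -directives=ak' -noomp "$src"; then
        mv a.out serial.out
        # Automatically parallelized build of the same source.
        if kf90 -fkapargs='-concurrent' "$src"; then
            mv a.out parallel.out
            built=yes
        fi
    fi
fi

if [ "$built" = yes ]; then
    # time reports the elapsed time of each run.
    time ./serial.out
    time ./parallel.out
else
    echo "kf90 or $src not available; nothing was built or timed"
fi
```

Compare the elapsed (real) times of the two runs; the total CPU time of a multithreaded run can exceed that of the serial run even when the program finishes sooner.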
3.5 Directed Parallelization Using the kf90 Driver and OpenMP Directives
Under the directed method, Compaq KAP does not do any automatic parallel detection. As always, any OpenMP directives in the original source program are passed to the Fortran 90 compiler for processing.
Parallelization by means of inserting OpenMP directives is most useful for programs under the following circumstances:
The directed method applies only to DO loops with parallel directives. Consider a program with the following loops:
!$OMP PARALLEL
DO WHILE (I ...
  DO WHILE (J ...
    ...
    ...
  END DO
END DO
!$OMP END PARALLEL
Compaq KAP passes the OpenMP directives of the DO WHILE I loop to the compiler for processing. Compaq KAP does not parallelize the DO WHILE J loop. So, "directed" means any loops not surrounded with parallel directive statements are not parallelized. If instead Compaq KAP were to attempt to transform both DO WHILE loops, then it would be running under the combination method.
An example of how to use KAP to process a program for which no automatic parallelization is desired is given below:
kf90 -fkapargs='-noconc' my_prog.f -omp -pthread -call_shared
The results include a transformed source program and its processing by the compiler and linker to create executable file a.out. Because of the -noconc switch, Compaq KAP does not automatically set compiler and linker switches related to parallel processing. Therefore, the user must explicitly set the -omp and -pthread compiler switches and the -call_shared linker switch.
3.5.1 Changing Source Programs
Insert OpenMP directives (beginning with !$OMP) only with loops that are safe to parallelize. When Compaq KAP sees a loop prefaced with OpenMP directives, it does not perform data dependence analysis on that loop and does not prevent you from using a parallel directive incorrectly. The OpenMP directives are described in the Compaq Fortran User Manual for Tru64 UNIX and Linux Alpha Systems. Table 3-1 correlates OpenMP directives to Cray's CMIC parallel directives:
OpenMP | Cray |
---|---|
Specifying Regions of Parallel Execution | |
!$OMP PARALLEL | CMIC$ PARALLEL |
!$OMP END PARALLEL | CMIC$ END PARALLEL |
Specifying Parallel Loops | |
!$OMP DO | CMIC$ END DO |
Specifying Synchronized Sections of Code | |
!$OMP CRITICAL | CMIC$ GUARD |
!$OMP END CRITICAL | CMIC$ END GUARD |
!$OMP SINGLE | (equivalent coded with guard flag) |
!$OMP END SINGLE | |
Specifying Sections of Code for Parallel Execution | |
!$OMP SECTIONS | CMIC$ CASE |
!$OMP SECTION (before each section) | |
!$OMP END SECTIONS | CMIC$ END CASE |
Controlling Subroutines Called within Parallel Regions | |
C*$* ASSERT CONCURRENT CALL | CMIC$ CONTINUE |
Specifying Commons Local to each Thread | |
!$OMP THREAD PRIVATE | CMIC$ TASKCOMMON |
No Compaq KAP switches affect the processing by the compiler of OpenMP
directives inserted by the user.
3.5.3 Directing the Compilation and Linking Process
To parallelize a program containing OpenMP directives, you normally need to give only the kf90 command with the -noconc KAP switch, the -omp and -pthread Fortran 90 compiler switches, and the -call_shared linker switch.
An example follows:
kf90 -fkapargs='-noconc' myprog.f90 -omp -pthread -call_shared
Because of the -noconc switch, Compaq KAP does not automatically set the compiler and linker switches needed for parallelization, so the command supplies them explicitly.
3.6 Combined Automatic and Directed Parallelization Using the kf90 Driver
Parallelization by the combined method is most useful for large programs in which you want to explicitly control the parallelization of some DO loops by inserting parallel directives while letting Compaq KAP automatically parallelize the remaining loops. The combined method is a merge of the automatic and directed methods.
The combined method applies to DO loops both with and without parallel directives. Consider a program with the following loops:
!$OMP PARALLEL
DO WHILE (I ...
  DO WHILE (J ...
    ...
    ...
  END DO
END DO
!$OMP END PARALLEL
The DO WHILE I loop is surrounded by parallel directives that have been inserted by the programmer. These directives will be passed on unmodified to the compiler.
The DO WHILE J loop is not surrounded by parallel directives. Compaq KAP first examines the J loop according to data dependency tests and the value of the -minconcurrent switch. If the J loop meets these requirements, Compaq KAP will parallelize the loop by inserting OpenMP directives which will then be passed on to the compiler. The appropriate command line to use to process this program is:
kf90 -fkapargs='-concurrent' my_prog.f
3.6.1 Changing Source Programs
You insert OpenMP directives around those DO loops that you want to explicitly parallelize.
In addition, you can insert guiding assertions around loops that you want to help Compaq KAP to parallelize automatically. Compaq KAP cannot automatically parallelize loops with data dependencies between loop iterations or loops with calls to external routines. You can help Compaq KAP detection of these loops by placing parallel processing assertions and parallel processing directives (each beginning with !*$*) in the source program. These assertions and directives are:
!*$* assert concurrent call
!*$* assert do (concurrent)
!*$* assert do (concurrent call)
!*$* assert do (serial)
!*$* assert do prefer (concurrent)
!*$* assert do prefer (serial)
!*$* [no]concurrentize
!*$* minconcurrent
Command-line switches you can give to Compaq KAP that affect its transformation of DO loops are:
To construct a program for parallel execution via the combined method, you normally need to give only the -concurrent switch to the kf90 command as follows:
kf90 -fkapargs='-concurrent' my_prog.f
The -concurrent switch tells KAP to automatically parallelize appropriate DO loops. The -concurrent switch also sets the compiler and linker switches needed for parallelization. Because -psyntax=openmp is set by default, KAP inserts OpenMP directives around loops that it automatically detects are good candidates for parallelization. The actual parallelization is done by the compiler which processes the OpenMP directives inserted automatically by KAP and the OpenMP directives inserted by the programmer.
Finally, you may want to create a completely non-parallelized program so you can compare its execution time with the times of programs that are parallelized in various ways (such as the automatic method and the directed method). The following command does this:
kf90 -fkapargs='-noconc -directives=ak' -noomp myprog.f90
The -noconc switch prevents automatic parallelization of DO loops, and the absence of p from the string following -directives= prevents Compaq KAP from responding to any parallel directive statements. The -noomp switch prevents the Fortran 90 compiler from responding to any parallel directive statements in the transformed source file it receives.
3.7 Compiling a Program for Parallel Execution Using kapf90
Normally, you use the kf90 command with the -conc switch to create an optimized and parallelized executable file. Compaq recommends this command because it sets the compiler and linker switches correctly. To view these switches, include the -v switch with the kf90 command.
To compile a program for parallel execution using the kapf90 command on Tru64 UNIX, issue the following commands:
kapf90 -conc -cmp=myprog_mp.f90 myprog.f90
f90 myprog_mp.f90 -fast -tune host -automatic -omp -pthread
The kapf90 command preprocesses myprog.f90 to produce a new source file, myprog_mp.f90, which contains OpenMP directives inserted by Compaq KAP for loops Compaq KAP has selected for automatic parallelization. The file myprog_mp.f90 is then processed by the compiler and linker to produce a parallelized executable, a.out. Further explanation of the switches used follows:
To run a program parallelized with OpenMP directives (-psyntax=openmp), you can set the following environment variables to control its run-time behavior:
OMP_SCHEDULE (static,dynamic,guided,runtime)
OMP_DYNAMIC (true,false)      default is false.
OMP_NESTED (true,false)       default is false.
OMP_NUM_THREADS (number)      default value is the number of processors on the current system.
For more information on environment variables read by the Fortran 90
compiler, see the Compaq Fortran User Manual for Tru64 UNIX and
Linux Alpha Systems.
3.9 Parallel Programming Tips
kf90 -fkapargs='-conc' -v myprog.f
-fuse --- see Section 4.7.9
-fuselevel=1 --- see Section 4.7.10
-ipa --- see Section 4.6.1
-ipa_from_files=<file>,<file> --- see Section 4.6.7
-ipa_optimize=2 --- see Section 4.6.11