Compaq KAP Fortran/OpenMP
for Tru64 UNIX
User Guide



Contents (summary)
Preface
Chapter 1 What Is KAP?
Chapter 2 How to Run KAP
Chapter 3 KAP Parallel Processing
Chapter 4 KAP Command-Line Switches
Chapter 5 KAP Directives
Chapter 6 KAP Assertions
Chapter 7 Inlining and IPA
Chapter 8 Transformations
Chapter 9 KAP Listing File
Appendix A Compaq Fortran Extensions Supported by KAP Fortran/OpenMP
Appendix B Data Dependence Analysis
Appendix C OpenMP Examples
Appendix D PCF Directives
Appendix E KAP and Incorrect Programs
Appendix F Listing File Messages
  Index
  Tables


Contents


Preface
Chapter 1
1 What Is KAP?
Chapter 2
2 How to Run KAP
     2.1     General KAP Information
     2.2     Installing Compaq KAP
     2.3     Compiling a Program Using the kf90 Driver
         2.3.1         Passing Default KAP Switch Settings to kf90
     2.4     Passing KAP Switches to kf90
         2.4.1         Passing Compaq Fortran Compiler Switches to kf90
         2.4.2         Additional Information About Using the kf90 Driver
     2.5     Compiling a Program Containing C Preprocessor Directives Using kf90
     2.6     Optimized Programs
     2.7     KAP Command-Line Switches Determined by Compiler Switches
     2.8     Compiling a Program Using kapf90
     2.9     Compiling a Program Containing C Preprocessor Directives Using kapf90
     2.10     Using KAP Syntax
     2.11     Using File Naming Conventions
     2.12     Guidelines for Optimizing With KAP
         2.12.1         Optimizing Small Programs with KAP
         2.12.2         Optimizing Large Programs with KAP
         2.12.3         General Optimization Tips
     2.13     Improving and Customizing KAP Performance
     2.14     Using Additional Performance Improvement Techniques
     2.15     Correcting KAP Problems
Chapter 3
3 KAP Parallel Processing
     3.1     Overview
     3.2     Parallel Processing Methods
         3.2.1         Automatic Detection Method
         3.2.2         Directed Method
         3.2.3         Combination Method
         3.2.4         Environment Variables to Set When -psyntax=openmp Is Selected
     3.3     Interaction of Parallel Processing Controls
     3.4     Automatic Parallelization Using the kf90 Driver
         3.4.1         Changing Source Programs
         3.4.2         Giving Command-Line Switches
         3.4.3         Directing the Compilation and Linking Process
     3.5     Directed Parallelization Using the kf90 Driver and OpenMP Directives
         3.5.1         Changing Source Programs
         3.5.2         Giving Command-Line Switches
         3.5.3         Directing the Compilation and Linking Process
     3.6     Combined Automatic and Directed Parallelization Using the kf90 Driver
         3.6.1         Changing Source Programs
         3.6.2         Giving Command-Line Switches
         3.6.3         Directing the Compilation and Linking Process
     3.7     Compiling a Program for Parallel Execution Using kapf90
     3.8     Running a Parallelized Program
     3.9     Parallel Programming Tips
Chapter 4
4 KAP Command-Line Switches
     4.1     Overview
     4.2     Switches for the kf90 Driver
         4.2.1         -fext, (-fext=f)
         4.2.2         -f90, (-f90=/usr/bin/f90)
         4.2.3         -f90kap, (-f90kap=/usr/bin/kapf90)
         4.2.4         -fkapargs
         4.2.5         -S, (off)
         4.2.6         -tmpdir, (-tmpdir=/tmp/)
         4.2.7         -tune, (-tune=<current system architecture>)
         4.2.8         -verbose, -v, (-nov)
     4.3     General Optimization Switches for kapf90
         4.3.1         -interchange, -nointerchange, (-interchange)
         4.3.2         -namepartitioning, -namepart, (-nonamepart)
         4.3.3         -optimize, -o, (-optimize=5)
         4.3.4         -recursion, -rec, -norec, (-norecursion)
         4.3.5         -roundoff, -r, (-r=3)
         4.3.6         -scalaropt, -so, (-scalaropt=3)
         4.3.7         -skip, -sk, -nsk, (-noskip)
         4.3.8         -tune, (-tune=<current system architecture>)
     4.4     Parallel Processing Switches for kapf90
         4.4.1         -chunk, (-chunk=1)
         4.4.2         -concurrent, -conc, -noconc, (-noconcurrent)
         4.4.3         -minconcurrent, -mc, (-mc=1000)
         4.4.4         -parallelio, -nopio, -pio, (-noparallelio)
         4.4.5         -pdefault, (-pdefault=safe)
         4.4.6         -psyntax, (-psyntax=openmp)
         4.4.7         -scheduling, -sched, (-sched=e)
     4.5     Fortran Dialect Switches for kapf90
         4.5.1         -align_common, (-align_common=8)
         4.5.2         -align_struct, (-align_struct=4)
         4.5.3         -assume, -a, (-assume=cel), -noassume, -na
         4.5.4         -datasave, -ds, -nodatasave, -nds, (-datasave)
         4.5.5         -dlines, -dl, -ndl, (-nodlines)
         4.5.6         -escape, (-noescape)
         4.5.7         -freeformat, -ff, -nff, (-nofreeformat)
         4.5.8         -integer, -int, (-int=4)
         4.5.9         -intlog, (-intlog)
         4.5.10         -kind, (-kind=4)
         4.5.11         -logical, -log, (-log=4)
         4.5.12         -onetrip, -1, (-n1), -noonetrip
         4.5.13         -real, -rl, (-rl=4)
         4.5.14         -save, -sv, (-sv=manual_adjust)
         4.5.15         -scan, (-scan=72)
         4.5.16         -syntax, -sy, (off)
         4.5.17         -type, -ty, -nty, (-notype)
     4.6     Inlining and Interprocedural Analysis Switches for kapf90
         4.6.1         -inline, -inl, -noinline, -ninl, (off), -ipa, -noipa, -nipa, (off)
         4.6.2         -inline_and_copy, -inlc, (off)
         4.6.3         -inline_create, -incr, (off), -ipa_create, -ipacr, (off)
         4.6.4         -inline_depth, -ind, (-ind=2), -ipa_depth, -ipad, (-ipad=2)
         4.6.5         -inline_from_files, -inff, (current source file)
         4.6.6         -inline_from_libraries, -infl, (off)
         4.6.7         -ipa_from_files, -ipaff, (current source file)
         4.6.8         -ipa_from_libraries, -ipafl, (off)
         4.6.9         -inline_looplevel, -inll, (-inll=2), -ipa_looplevel, -ipall, (-ipall=2)
         4.6.10         -inline_manual, -inm, (off), -ipa_manual, -ipam, (off)
         4.6.11         -inline_optimize, (-inline_optimize=0), -ipa_optimize, (-ipa_optimize=0)
     4.7     Advanced Optimization Switches for kapf90
         4.7.1         -aggressive, -ag, -nag, (-noaggressive)
         4.7.2         -arclimit, -arclm, (-arclimit=5000)
         4.7.3         -cacheline, -chl, (-chl=64,64)
         4.7.4         -cache_prefetch_line_count, -cplc, (-cplc=0)
         4.7.5         -cachesize, -chs, (-chs=32,0)
         4.7.6         -dpregisters, -dpr, (-dpr=32)
         4.7.7         -each_invariant_if_growth, -eiifg, (-eiifg=20)
         4.7.8         -fpregisters, -fpr, (-fpr=32)
         4.7.9         -fuse, -nfuse, (-nofuse)
         4.7.10         -fuselevel, (-fuselevel=0)
         4.7.11         -generateh, -genh
         4.7.12         -hdir, -hd, (-hdir=<current directory>)
         4.7.13         -heaplimit, -heap, (-heaplimit=100)
         4.7.14         -hoist_loop_invariants, -hli, (-hli=1)
         4.7.15         -interleave, -intl, (-interleave)
         4.7.16         -library_calls, -lc, (off)
         4.7.17         -limit, -lm, (-lm=10)
         4.7.18         -machine, -ma, -nomachine, -noma, (-ma=s)
         4.7.19         -max_invariant_if_growth, -miifg, (-miifg=500)
         4.7.20         -routine, -rt, (off)
         4.7.21         -setassociativity, -sasc, (-sasc=1,1)
         4.7.22         -srlcd, -nsrlcd, (-nosrlcd)
         4.7.23         -tablesize, -ts, (-ts=24000000)
         4.7.24         -unroll, -ur, (-ur=4), -unroll2, -ur2, (-ur2=160), -unroll3, -ur3, (-ur3=1)
         4.7.25         -useh
     4.8     Directive Recognition Switches for kapf90
         4.8.1         -directives, -dr, -nodirectives, -ndr, (-directives=akpv)
         4.8.2         -ignoreoptions, -ig, -nig, (-noignoreoptions)
     4.9     Input-Output Switches for kapf90
         4.9.1         -cmp, (<file>.cmp.f90), (<file>.cmp.f), -nocmp, -ncmp
         4.9.2         -include, -inc, (off)
         4.9.3         -list, -l, -nl, (-list=<file>.out)
     4.10     Listing Switches for kapf90
         4.10.1         -cmpoptions, -cp, -ncp, (-nocmpoptions)
         4.10.2         -lines, -ln, (-ln=55)
         4.10.3         -listingwidth, -lw, (-lw=132)
         4.10.4         -listoptions, -lo, (-lo=klo)
         4.10.5         -suppress, -su, (off)
     4.11     !*$*options
Chapter 5
5 KAP Directives
     5.1     Overview
     5.2     Usage and Syntax of Directives
     5.3     General Optimization Directives
         5.3.1         !*$* arclimit (0-5000)
         5.3.2         !*$* beginblock <directive block> !*$* endblock
         5.3.3         !*$* each_invariant_if_growth (0-5000)
         5.3.4         !*$* limit (>= 0)
         5.3.5         !*$* max_invariant_if_growth (0-50000)
         5.3.6         !*$* optimize (0-5)
         5.3.7         !*$* roundoff (0-3)
         5.3.8         !*$* scalar optimize (0-3)
         5.3.9         !*$* unroll(<#it>[,<integer>])
     5.4     Parallel Processing Directives for Automatic Parallelization
         5.4.1         !*$* [no]concurrentize
         5.4.2         !*$* minconcurrent (0-999999)
     5.5     Inlining and IPA
         5.5.1         !*$* [no]inline [here|routine|global] [(name [,name...])]
         5.5.2         !*$* [no]ipa [here|routine|global] [(name [,name...])]
     5.6     Assertions Directive
         5.6.1         !*$* [no]assertions
     5.7     Memory Management Directives
         5.7.1         !*$* padding (var-list)
         5.7.2         !*$* storage order (var-list)
Chapter 6
6 KAP Assertions
     6.1     Overview
     6.2     Descriptions of KAP Assertions
         6.2.1         !*$* assert [no]argument aliasing
         6.2.2         !*$* assert [no]bounds violations
         6.2.3         !*$* assert [no]equivalence hazard
         6.2.4         !*$* assert [no]last value needed
         6.2.5         !*$* assert permutation
         6.2.6         !*$* assert no recurrence
         6.2.7         !*$* assert relation ( <name> .XX. <variable/constant>)
         6.2.8         !*$* assert no sync
         6.2.9         !*$* assert [no] temporaries for constant arguments
     6.3     Parallel Processing Assertions that Guide Automatic Parallelization
         6.3.1         !*$* assert concurrent call
         6.3.2         !*$* assert do (concurrent)
         6.3.3         !*$* assert do (concurrent call)
         6.3.4         !*$* assert do (serial)
         6.3.5         !*$* assert do prefer (concurrent)
         6.3.6         !*$* assert do prefer (serial)
Chapter 7
7 Inlining and IPA
     7.1     Inlining and IPA Command-Line Switches
         7.1.1         inline_from/ipa_from Switches
         7.1.2         Library Creation
         7.1.3         Naming Specific Routines
         7.1.4         DO Loop Level
         7.1.5         Recursive Inlining
         7.1.6         Manual Control
     7.2     Inlining and IPA Directives
     7.3     Listing File Support
         7.3.1         -listoptions=c
     7.4     Inlining/IPA Examples
         7.4.1         Inlining Example: Same Source File
         7.4.2         Inlining Example with a Library
         7.4.3         IPA Example
         7.4.4         Recursive Inlining Examples
         7.4.5         Manual Inlining Example
         7.4.6         Notes on Inlining and IPA
     7.5     Conditions Inhibiting Inlining/IPA
Chapter 8
8 Transformations
     8.1     Memory Management
         8.1.1         Command-Line Switches
         8.1.2         Memory Management Tactics
     8.2     Serial Optimizations
         8.2.1         Dead-Code Elimination
         8.2.2         Induction Variable Recognition
         8.2.3         Global Forward Substitution
         8.2.4         Loop Peeling
         8.2.5         Lifetime Analysis
         8.2.6         Invariant-IF Restructuring
         8.2.7         Reciprocal Substitution
     8.3     Scalar (Dusty-Deck) IF Transformations
         8.3.1         IF to Block IF
         8.3.2         IF to DO Loop
         8.3.3         Semantic IF Merging
         8.3.4         Zero-Trip IF Removal
     8.4     Loop Unrolling
     8.5     Loop Rerolling
Chapter 9
9 KAP Listing File
     9.1     Listing Switches
         9.1.1         Original Program Listing (O)
         9.1.2         Calling Tree (C)
         9.1.3         KAP Switches (K)
         9.1.4         Loop Table (L)
         9.1.5         Name (N)
         9.1.6         Compilation Performance Statistics (P)
         9.1.7         Summary Table (S)
         9.1.8         Transformed Program Listing (T)
     9.2     Listing Information
         9.2.1         Line Numbers
         9.2.2         DO Loop Markings
         9.2.3         INCLUDE File Markings
         9.2.4         Footnotes
         9.2.5         Syntax Error/Warning Messages
         9.2.6         Questions Generated by KAP
         9.2.7         Action Summary
     9.3     Loop Table Messages
     9.4     KAP Listing Messages
Appendix A
Appendix A Compaq Fortran Extensions Supported by KAP Fortran/OpenMP
Appendix B
Appendix B Data Dependence Analysis
     B.1     Data Dependence Definitions
     B.2     Varieties of Data Dependence
     B.3     Input and Output Sets
     B.4     Data Dependence Relations
     B.5     Data Dependence Direction Vectors
     B.6     Loop-Carried Dependence
     B.7     Data Dependence Examples
Appendix C
Appendix C OpenMP Examples
     C.1     DO: A Simple Difference Operator
     C.2     DO: Two Difference Operators
     C.3     DO: Reduce Fork/Join Overhead
     C.4     SECTIONS: Two Difference Operators
     C.5     SINGLE: Updating a Shared Scalar
     C.6     SECTIONS: Updating a Shared Scalar
     C.7     DO: Updating a Shared Scalar
     C.8     PARALLEL DO: A Simple Difference Operator
     C.9     PARALLEL SECTIONS: Two Difference Operators
     C.10     Simple Reduction
     C.11     TASKCOMMON: Private Common
     C.12     THREADPRIVATE: Private Common and Master Thread
     C.13     INSTANCE PARALLEL: As a Private Common
     C.14     INSTANCE PARALLEL: As a Shared and then a Private Common
     C.15     Avoiding External Routines: Reduction
     C.16     Avoiding External Routines: Temporary Storage
     C.17     FIRSTPRIVATE: Copying in Initialization Values
     C.18     THREADPRIVATE: Copying in Initialization Values
     C.19     INSTANCE PARALLEL: Copying in Initialization Values
Appendix D
Appendix D PCF Directives
     D.1     PARALLEL REGION Directive
     D.2     PARALLEL DO Directive
     D.3     DO Loop Example with PCF Directives
     D.4     Program Example with PCF Directives
     D.5     CRITICAL SECTION Directive
     D.6     ONE PROCESSOR SECTION Directive
     D.7     Comparison of KAP PCF and Cray Autotasking Directives
Appendix E
Appendix E KAP and Incorrect Programs
Appendix F
Appendix F Listing File Messages
     F.1     Classes of Messages
     F.2     Messages
         F.2.1         Data Dependence (DD)
         F.2.2         Error (E)
         F.2.3         Extension (EX)
         F.2.4         Inlining/IPA (INL)
         F.2.5         Informational (INF)
         F.2.6         Inserted (I)
         F.2.7         Loop Reordering (LR)
         F.2.8         Warning (MIS)
         F.2.9         Option Error (OW)
         F.2.10         Not Optimized (NO)
         F.2.11         Output Translation (OT)
         F.2.12         Output Trans Fails (OTF)
         F.2.13         Program Too Large (NO)
         F.2.14         Question (Q)
         F.2.15         Scalar Optimization (SO)
         F.2.16         Standardized (STD)
         F.2.17         Translator Error (TE)
         F.2.18         Vector Enhanced (VE)
         F.2.19         Warning (W)
Index
Tables
2-1 kf90 Assumed Source Format Based on Switch Settings and File Extensions
2-2 User Actions for Specific Goals
3-1 OpenMP Directives Correlated to Cray CMIC Parallel Directives
4-1 Command-Line Switches for the kf90 Driver
4-2 Command-Line Switches for the kapf90 Translator
5-1 KAP Directives
6-1 KAP Assertions
A-1 Compaq Fortran Extensions Supported by KAP
D-1 KAP PCF and Cray Autotasking Directives

