COMPAQ

Software Product Description
___________________________________________________________________

PRODUCT NAME: Compaq Fortran Version 5.5 for Tru64 UNIX Alpha Systems   SPD 37.54.15

January 2002

DESCRIPTION

This is the Software Product Description (SPD) for Compaq[TM] Fortran Version 5.5 for Tru64[TM] UNIX[TM] Alpha Systems. Compaq Fortran contains both the Compaq Fortran 95/90 Version 5.5 software and the Compaq Fortran 77 Version 5.5 software as well as the Compaq Extended Math Library (CXML).

In the following description, Compaq Fortran refers to Compaq Fortran 95/90 unless a specific reference to the 95/90 or 77 product is needed to distinguish between the two software products.

Compaq Fortran is an implementation of the Fortran programming language that supports the FORTRAN 66, FORTRAN 77, Fortran 90, and Fortran 95 standards.

Compaq Fortran 95/90 and Compaq Fortran 77 fully support the following standards:

o ANSI X3.9-1966 (FORTRAN 66)
o ANSI X3.9-1978 (FORTRAN 77)
o ISO 1539-1980(E) (FORTRAN 77)
o MIL-STD-1753
o FIPS-69-1 (Compaq Fortran meets the requirements of this standard by conforming to the ANSI Standard and by including a flagger. The flagger optionally produces diagnostic messages for compile-time elements that do not conform to the Full-Level ANSI Fortran Standard.)

Compaq Fortran 95/90 supports all of the standards that Compaq Fortran 77 supports plus the following new standards:

o ANSI X3.198-1992 (Fortran 90)
o ISO/IEC 1539-1:1997(E) (Fortran 95)

COMPAQ FORTRAN

Compaq Fortran 95/90 fully supports the multivendor OpenMP[TM] Fortran Version 1.1 Specification, including support for directed parallel processing using OpenMP directives on shared memory multiprocessor systems.
Compaq Fortran 95/90 supports most statement, library, and directive extensions from the HPF-2 specification (High Performance FORTRAN Language Specification, Version 2.0, January 31, 1997).

Compaq Fortran supports extensions to the ISO and ANSI standards, including a number of extensions defined by Compaq Fortran for the various Compaq Fortran platforms (operating system/architecture pairs). In addition to Compaq Tru64 UNIX Alpha systems, Compaq Fortran platforms include:

o Compaq Visual Fortran for Intel[TM] systems under Windows[TM] NT[TM], Windows 2000, Windows Me, Windows 98, and Windows 95
o Compaq Fortran for Linux[TM] Alpha systems
o Compaq Fortran and Compaq Fortran 77 for OpenVMS[TM] Alpha systems
o Compaq Fortran 77 for OpenVMS VAX[TM] systems

Major additions to the FORTRAN 77 standard introduced by the Fortran 90 standard include:

o Array operations
o Improved facilities for numeric computation
o Parameterized intrinsic data types
o User-defined data types
o Facilities for modular data and procedure definitions
o Pointers
o The concept of language evolution
o Support for DATE_AND_TIME intrinsic for obtaining dates using a four-digit year format

Compaq Fortran contains full support for the Fortran 95 standard, including the following features:

o FORALL statement and construct
o Automatic deallocation of ALLOCATABLE arrays
o DIM argument to MAXLOC and MINLOC
o PURE user-defined subprograms
o ELEMENTAL user-defined subprograms (a restricted form of a pure procedure)
o Pointer initialization (initial value)
o The NULL intrinsic to nullify a pointer
o Derived-type structure initialization
o CPU_TIME intrinsic subroutine
o KIND argument to CEILING and FLOOR intrinsics
o Nested WHERE constructs, masked ELSEWHERE statement, and named WHERE constructs
o Comments allowed in namelist input
o Generic identifier in END INTERFACE statements
o Minimal width field editing using a
  numeric edit descriptor with 0 width
o Detection of Obsolescent and/or Deleted features listed in the Fortran 95 standard. Compaq Fortran flags these obsolescent and deleted features, but fully supports them.

Compaq Fortran includes the following features and enhancements:

o Full support for 64-bit address space, including 64-bit static space
o Support for linking against static and shared libraries
o Support for creating shareable code to be put into a shared library
o Support for stack-based storage
o Support for dynamic memory allocation
o Support for reading and writing binary data files in nonnative formats, including IEEE[TM] (little-endian and big-endian), VAX, IBM[TM] System/360, and CRAY[TM] integer and floating point formats
o User control over IEEE floating point exception handling, reporting, and resulting values
o Control for memory boundary alignment of items in COMMON and fields in structures and warnings for unaligned data
o Directives to control listing page titles and subtitles, object file identification field, COMMON and record field alignment, and some attributes of COMMON blocks
o Ability to CALL an external function subprogram
o 7200 Character Statement Length
o Free form unlimited line length
o Mixing Subroutines/Functions in Generic Interfaces
o Composite data declarations using STRUCTURE, END STRUCTURE, and RECORD statements, and access to record components through field references
o Explicit specification of storage allocation units for data types such as:
     INTEGER*4
     LOGICAL*4
     REAL*4
     REAL*8
     COMPLEX*8
o Support for 64-bit signed integers using INTEGER*8 and LOGICAL*8
o Support for 128-bit floating-point real numbers (reals) using REAL*16 and COMPLEX*32
o A set of data types:
   - BYTE
   - LOGICAL*1, LOGICAL*2, LOGICAL*4, LOGICAL*8
   - INTEGER*1, INTEGER*2, INTEGER*4, INTEGER*8
   - REAL*4, REAL*8,
     REAL*16
   - COMPLEX*8, COMPLEX*16, DOUBLE COMPLEX, COMPLEX*32
   - POINTER (CRAY style)
o Data statement style initialization in type declaration statements
o AUTOMATIC and STATIC statements
o Bit constants to initialize LOGICAL, REAL, and INTEGER values and participate in arithmetic and logical expressions
o Built-in functions %LOC, %REF, and %VAL
o VOLATILE statement
o Bit manipulation functions
o Binary, hexadecimal, and octal constants and Z and O format edit descriptors applicable to all data types
o I/O unit numbers that can be any nonnegative INTEGER*4 value
o Variable amounts of data can be read from and written to "STREAM" files, which contain no record delimiters
o ENCODE and DECODE statements
o ACCEPT, TYPE, and REWRITE input/output statements
o DEFINE FILE, UNLOCK, and DELETE statements
o USEROPEN subroutine invocation at file OPEN time
o Support for I/O record larger than 2.1 Gigabytes in variable-length unformatted files
o Support for reading nondelimited character strings as input for character NAMELIST items
o Debug statements in source
o Generation of a source listing file with optional machine code representation of the executable source
o Generation of a symbolic assembly language representation of the executable source that can be assembled
o Variable format expressions in a FORMAT statement
o Optional run-time bounds checking of array subscripts and character substrings
o 31-character identifiers that can include dollar sign ($) and underscore (_)
o Support for executing in-line assembler code using the ASM intrinsics
o Support for the supercomputer intrinsics POPCNT, POPPAR, LEADZ, TRAILZ, and MULT_HIGH
o Language elements that support the various extended range and extended precision floating point architectural features:
   - 32-bit IEEE S_floating data type, with an 8-bit exponent and 24-bit mantissa, which provides a range of 1.17549435E-38 (normalized) to
     3.40282347E38 (the IEEE denormalized limit is 1.40129846E-45) and a precision of typically 7 decimal digits
   - 64-bit IEEE T_floating data type, with an 11-bit exponent and 53-bit mantissa, which provides a range of 2.2250738585072013D-308 (normalized) to 1.7976931348623158D308 (the IEEE denormalized limit is 4.94065645841246544D-324) and a precision of typically 15 decimal digits
   - 128-bit IEEE extended Alpha X_floating data type, with a 15-bit exponent and a 113-bit mantissa, which provides a range of approximately 6.48Q-4966 to 1.18Q4932 and a precision of typically 33 decimal digits
o Command line control for:
   - The size of default INTEGER, REAL, and DOUBLE PRECISION data items
   - The levels and types of optimization to be applied to the program
   - The directories to search for INCLUDE files
   - Inclusion or suppression of various compile-time warnings
   - Inclusion or suppression of run-time checking for various I/O and computational errors
   - Control over whether compilation terminates after a specific number of errors has been found
   - Choosing whether executing code will be thread-reentrant
o Internal procedures can be passed as actual arguments to procedures
o Kind types for all of the hardware-supported data types:
   - For 1-, 2-, 4-, and 8-byte LOGICAL data:
       LOGICAL (KIND=1)
       LOGICAL (KIND=2)
       LOGICAL (KIND=4)
       LOGICAL (KIND=8)
   - For 1-, 2-, 4-, and 8-byte INTEGER data:
       INTEGER (KIND=1)
       INTEGER (KIND=2)
       INTEGER (KIND=4)
       INTEGER (KIND=8)
   - For 4-, 8-, and 16-byte REAL data:
       REAL (KIND=4)
       REAL (KIND=8)
       REAL (KIND=16)
   - For single precision, double precision, and quad-precision COMPLEX data:
       COMPLEX (KIND=4)
       COMPLEX (KIND=8)
       COMPLEX (KIND=16)
o The following features found in Compaq Visual Fortran:
   - # Constants: constants using other than base 10
   - C Strings: NULL terminated strings
   - Conditional Compilation And
     Metacommand Expressions ($define, $undefine, $if, $elseif, $else, $endif)
   - $FREEFORM, $NOFREEFORM, $FIXEDFORM: source file format
   - $INTEGER, $REAL: selects size
   - $FIXEDFORMLINESIZE: line length for fixed form source
   - $STRICT, $NOSTRICT: F90 conformance
   - $PACK: structure packing
   - $ATTRIBUTES ALIAS: external name for a subprogram or common block
   - $ATTRIBUTES C, STDCALL: calling and naming conventions
   - $ATTRIBUTES VALUE, REFERENCE: calling conventions
   - \ Descriptor: prevents writing an end-of-record mark
   - Ew.dDe and Gw.dDe Edit Descriptors: similar to Ew.dEe and Gw.dEe
   - $DECLARE and $NODECLARE (same as IMPLICIT NONE)
   - $ATTRIBUTES EXTERN: variable allocated in another source file
   - $ATTRIBUTES VARYING: variable number of arguments
   - $ATTRIBUTES ALLOCATABLE: allocatable array
   - Mixing Subroutines/Functions in Generic Interfaces
   - $MESSAGE: output message during compilation
   - $LINE (same as C's #line)
   - INT1 converts to one byte integer by truncating
   - INT2 converts to two byte integer by truncating
   - INT4 converts to four byte integer by truncating
   - COTAN returns cotangent
   - DCOTAN returns double precision cotangent
   - IMAG returns the imaginary part of complex number
   - IBCHNG reverses value of bit
   - ISHA shifts arithmetically left or right
   - ISHC performs a circular shift
   - ISHL shifts logically left or right
o Support for directed decomposition for parallel processing on shared memory multiprocessor systems using source code directives from either OpenMP (!$OMP) or Compaq Fortran (!$PAR):
   - PARALLEL and END PARALLEL directives to define parallel regions
   - DO and END DO directives to define parallel work constructs
   - PARALLEL and SECTIONS directives to define parallel work constructs
   - PRIVATE and SHARED attributes to describe data local or global to the threads of execution
   - CRITICAL section directive to define a guarded region where one thread executes at a time
   - TASK COMMON or THREADPRIVATE
     directives to allow each thread to have a local copy of a COMMON block
   - Environment variables to control resource utilization at run-time
   - Library routines to query and adjust the run-time parallel environment
   - Nested OpenMP parallel regions
   - NUM_THREADS clause
o A number of High Performance Fortran (HPF) features, including:
   - The data parallel FORALL statement and construct
   - Execution model procedure prefixes:
       EXTRINSIC(HPF)
       EXTRINSIC(HPF_LOCAL)
       EXTRINSIC(HPF_SERIAL)
   - PURE procedure prefix to specify a lack of procedure side effects
   - The following HPF data alignment and distribution directives:
       ALIGN
       DISTRIBUTE
       INHERIT
       PROCESSORS
       TEMPLATE
       SHADOW
       ON (in conjunction with INDEPENDENT loops)
   - Many HPF-2 approved extensions, including:
      * HPF_LOCAL routines, and all HPF_LOCAL_LIBRARY routines except LOCAL_BLKCNT, LOCAL_LINDEX, and LOCAL_UINDEX but none of the approved extensions to HPF_LOCAL_LIBRARY routines
      * HPF_SERIAL routines
      * ON directive within INDEPENDENT loops
      * RESIDENT directive used in conjunction with INDEPENDENT loops
      * Mapping of derived type components
      * Pointers to mapped objects
      * Shadow width declarations
   - HPF intrinsic procedures and library routines:
      * NUMBER_OF_PROCESSORS and PROCESSORS_SHAPE
      * Reduction functions
      * Combining scatter functions
      * Parallel prefix and suffix functions
      * Sorting functions
      * System inquiry intrinsics
      * Computational intrinsics
      * Mapping inquiry subroutines
      * Bit manipulation functions
      * Array reduction functions
      * Array combining scatter functions
      * Array parallel prefix and suffix functions
      * Array sorting functions GRADE_UP, GRADE_DOWN
   - HPF INDEPENDENT directive
   - HPF SEQUENCE and NOSEQUENCE directives

Compaq Fortran 77 contains the following extensions to the FORTRAN 77 standard:

o Support for recursive subprograms
o IMPLICIT NONE statements
o INCLUDE
  statement
o NAMELIST-directed I/O
o DO WHILE and ENDDO statements
o Use of exclamation point (!) for end of line comments
o Generation of Cross Reference Listings
o Support for NTT Technical Requirement TR550001, Multivendor Integration Architecture (MIA) Version 1.1, Division 2, Part 3-2, Programming Language FORTRAN
o Support for automatic arrays
o Support for the SELECT CASE - CASE - CASE DEFAULT - END SELECT statements
o Support for the EXIT and CYCLE statements and for construct names on DO - END DO statements
o Reporting of unused and uninitialized variables
o Support for DATE_AND_TIME intrinsic for obtaining dates using a four-digit year format

Compaq Fortran provides a multiphase optimizer that is capable of performing optimizations across entire programs. Specific optimizations performed by both Compaq Fortran 95/90 and Compaq Fortran 77 include:

o Constant folding
o Optimizations of arithmetic IF, logical IF, and block IF-THEN-ELSE
o Global common subexpression elimination
o Removal of invariant expressions from loops
o Global allocation of general registers across program units
o In-line expansion of statement functions and routines
o Optimization of array addressing in loops
o Value propagation
o Deletion of redundant and unreachable code
o Loop unrolling
o Thorough dependence analysis
o Software pipelining to rearrange instructions between different unrolled loop iterations
o Optimized interface to intrinsic functions
o Loop transformation optimizations that apply to array references within loops, including:
   - Loop blocking
   - Loop distribution
   - Loop fusion
   - Loop interchange
   - Loop scalar replacement
   - Outer loop unrolling
o Speculative code scheduling, where a conditionally executed instruction is moved to a position before a test instruction and executed unconditionally.
  This reduces instruction latency stalls. (Performance may be reduced somewhat, because the run-time system must dismiss exceptions caused by speculative instructions.)

Specific optimizations performed by Compaq Fortran 95/90 include:

o Array temporary elimination
o A number of HPF-specific optimizations, including:
   - Message vectorization
   - Nearest-neighbor optimizations for improved communication performance of constructs typically seen in PDE solvers
   - Parallelism of reductions
   - Run-time preprocessing of loops for improved performance of irregular data access code
   - Many other communication-based optimizations

Both Compaq Fortran 95/90 and Compaq Fortran 77 are shareable, re-entrant compilers that operate under the Compaq Tru64 UNIX operating system. They globally optimize source programs while taking advantage of the native instruction set and the Compaq Tru64 UNIX virtual memory system.

COMPAQ EXTENDED MATH LIBRARY (CXML)

Compaq Extended Math Library (CXML) is a set of mathematical subprograms that are optimized for Compaq architectures. Included subprograms cover the areas of:

o Basic Linear Algebra
o Linear System and Eigenproblem Solvers
o Sparse Linear System Solvers
o Sorting
o Random Number Generation
o Signal Processing

The Basic Linear Algebra library includes the industry-standard Basic Linear Algebra Subprograms (BLAS) Level 1, Level 2, and Level 3. Also included are subprograms for BLAS Level 1 Extensions, Sparse BLAS Level 1, and Array Math Functions (VLIB).

The Linear System and Eigenproblem Solver library provides the complete LAPACK v2 package developed by a consortium of university and government laboratories. LAPACK is an industry-standard subprogram package offering an extensive set of linear system and eigenproblem solvers. LAPACK uses blocked algorithms that are better suited to most modern architectures, particularly ones with memory hierarchies.
LAPACK will supersede LINPACK and EISPACK for most users.

The Sparse Linear System library provides both direct and iterative sparse linear system solvers. The direct solver package supports both symmetric and nonsymmetric sparse matrices stored using the skyline storage scheme. The iterative solver package contains a basic set of storage schemes, preconditioners, and iterative solvers. The design of this package is modular and matrix-free, allowing future expansion and easy modification by users.

The Signal Processing library provides a basic set of signal processing functions. Included are one-, two-, and three-dimensional Fast Fourier Transforms (FFT), group FFTs, Cosine/Sine Transforms (FCT/FST), Convolution, Correlation, and Digital Filters.

Many CXML subprograms are optimized for the supported hardware platforms. Optimization techniques include traditional optimizations such as loop unrolling and loop reordering. CXML subprograms also provide efficient management of the hierarchical memory system, using techniques such as the following:

o Reuse of data within registers to minimize memory accesses
o Efficient cache management
o Use of blocked algorithms that minimize translation buffer misses and unnecessary paging

Since CXML routines can be called from all languages that support the Compaq Tru64 UNIX calling standard, the library provides optimized computation for applications written in these languages. Where appropriate, most subprograms are available in both real and complex versions, as well as in both single and double precision.

CXML for Compaq Tru64 Alpha supports IEEE floating-point format.

Parallel Library Support for Symmetric Multiprocessing

CXML also supports symmetric multiprocessing (SMP) for improved performance. Key BLAS Level 2 and 3 routines have been modified to execute in parallel if run on SMP hardware.
Also modified for this purpose are:

o LAPACK GETRF and POTRF routines
o Sparse iterative solvers
o Skyline solvers
o FFT routines

These parallel routines along with the other serial routines are supplied in an alternative library. The user can choose to link with either the parallel or the serial library, depending on whether SMP support is required, since each library contains the complete set of routines.

Basic Linear Algebra Subprograms

Linear algebra operations are fundamental to many mathematical applications, and several libraries of linear algebra subprograms exist throughout the computer industry. The CXML BLAS library contains the most commonly used linear algebra subprograms. The CXML linear algebra library contains five groups of subprograms at three levels:

o Basic Linear Algebra Subprograms (BLAS) Level 1
o BLAS Level 1 Extensions
o BLAS Level 1 Sparse Extensions
o BLAS Level 2
o BLAS Level 3

BLAS Level 1 (Scalar/Vector and Vector/Vector Operations)

BLAS Level 1 provides a set of elementary vector functions, operating on one or two vectors. These are typically very small routines, and they make less efficient use of the computing resources of modern computer architectures than the Level 2 and 3 operations.
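As a rough illustration of what an elementary operation on one or two vectors looks like, the following Python sketch implements two such operations: scalar times a vector plus a vector (conventionally known as AXPY) and the index of the element with maximum absolute value. The function names and the list-based interface are assumptions made for this example; they are not the CXML calling sequences, and the index returned here is 0-based, whereas a Fortran interface would normally be 1-based.

```python
# Illustrative pure-Python versions of two Level 1 style vector operations.
# These are NOT the CXML routines; the names are hypothetical.

def axpy(alpha, x, y):
    """Return alpha * x + y, computed element by element."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def index_of_max_abs(x):
    """Return the 0-based index of the first element with the largest absolute value."""
    if not x:
        raise ValueError("x must not be empty")
    best = 0
    for i in range(1, len(x)):
        if abs(x[i]) > abs(x[best]):
            best = i
    return best
```

Each routine does only a few arithmetic operations per element it reads, which is consistent with the remark above that Level 1 operations use modern hardware less efficiently than the Level 2 and 3 operations.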
CXML provides the 15 standard BLAS Level 1 operations:

o The index of the element of a vector having maximum absolute value
o The sum of the absolute values of the elements of a vector
o Inner product of two real vectors
o Scalar plus the extended precision inner product of two real vectors
o Conjugated inner product of two complex vectors
o Unconjugated inner product of two complex vectors
o Square root of the sum of squares (norm) of the elements of a vector
o Scalar times a vector plus a vector
o Copy one vector to another
o Apply a Givens rotation
o Apply a modified Givens plane rotation
o Generate elements for a Givens plane rotation
o Generate elements for a modified Givens plane rotation
o Product of a vector times a scalar
o Swap the elements of two vectors

BLAS Level 1 Extensions (Vector/Vector Operations)

When developing mathematical algorithms using the BLAS Level 1, scientists and engineers found that several additional constructs were used on a regular basis. These constructs are well known throughout the computer industry as BLAS Level 1 Extensions.
CXML contains 13 BLAS Level 1 Extension operations:

o Index of element having the minimum absolute value
o Index of element having the maximum value
o Index of element having the minimum value
o Largest value of the elements of a vector
o Smallest value of the elements of a vector
o Largest absolute value of the elements of a vector
o Smallest absolute value of the elements of a vector
o Sum of the values of the elements of a vector
o Set all elements of a vector equal to a scalar
o Constant times a vector set to another vector (y = a x)
o Euclidean norm with no intermediate scaling
o Sum of the squares of the elements of a vector
o Constant times a vector plus a vector set to another vector (z = a x + y)

BLAS Level 1 Sparse Extensions (Vector/Vector Operations)

This group of operations is similar to the BLAS Level 1 routines, but is designed to work on sparse vectors (vectors in which most of the elements are zero). Six of the routines are from industry standard Sparse BLAS 1, and the remaining three are enhancements. The nine sparse BLAS Level 1 operations are:

o Scalar times a sparse vector plus a vector
o Sum of a sparse vector and a full vector
o Inner product of a sparse vector and a full vector
o Gather a sparse vector from a full vector
o Gather a sparse vector from the scaled elements of a full vector
o Gather a sparse vector from a full vector and zero corresponding elements of full vector
o Apply Givens rotation to a sparse vector and a full vector
o Scatter a sparse vector into a full vector
o Scale and scatter a sparse vector into a full vector

BLAS Level 2 (Matrix/Vector Operations)

The BLAS Level 2 codes make more effective use of the data in the registers, reducing the number of register loads and stores required. In addition, loop unrolling techniques are used to minimize cache misses and page faults.
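The idea of reusing data already held in a register can be sketched with a matrix/vector product whose row loop is unrolled by two, so that each element of x is read once and used for two rows. This is only a structural illustration in Python, assuming a matrix stored as a list of rows; the function name is hypothetical, the code is not how CXML is implemented, and an interpreted language will not show the performance effect itself.

```python
# Structural sketch of loop unrolling in a matrix/vector product y = A x.
# Hypothetical helper for illustration; not a CXML routine.

def matvec_unrolled(a, x):
    """Compute y = A x, processing two rows of A per pass over x."""
    m = len(a)
    n = len(x)
    if any(len(row) != n for row in a):
        raise ValueError("every row of a must have len(x) entries")
    y = [0.0] * m
    i = 0
    while i + 1 < m:
        row0, row1 = a[i], a[i + 1]
        acc0 = 0.0
        acc1 = 0.0
        for j in range(n):
            xj = x[j]              # one load of x[j] ...
            acc0 += row0[j] * xj   # ... reused for row i
            acc1 += row1[j] * xj   # ... and for row i + 1
        y[i] = acc0
        y[i + 1] = acc1
        i += 2
    if i < m:
        # Leftover row when the number of rows is odd.
        y[i] = sum(a[i][j] * x[j] for j in range(n))
    return y
```

In a compiled implementation the value xj would typically stay in a register across both multiply-add operations, which is the kind of saving in loads and stores the paragraph above refers to.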
The BLAS Level 2 subprograms use the following types of operations:

o Matrix/vector products
o Rank-1 and rank-2 matrix updates
o Solutions of triangular systems of equations

Six types of matrices are supported by these BLAS Level 2 routines:

o General
o General band
o Symmetric/Hermitian
o Symmetric/Hermitian band
o Triangular
o Triangular band

BLAS Level 3 (Matrix/Matrix Operations)

The BLAS Level 3 routines operate at a level that makes the most efficient use of machine resources. CXML optimizes these routines by partitioning matrices into blocks and computing matrix/matrix operations on each block. This approach avoids excessive memory accesses by providing full reuse of data while each block is in the cache or the registers. BLAS Level 3 routines provide this kind of blocking for three basic types of operations:

o Matrix/matrix products
o Rank-k and rank-2k updates of a symmetric matrix
o Solving triangular systems of equations with multiple right-hand sides

Three types of matrices are supported by these BLAS Level 3 routines:

o General
o Symmetric/Hermitian
o Triangular

A set of additional matrix-matrix routines is provided:

o Add two matrices
o Subtract one matrix from another
o Transpose a matrix, in-place or out-of-place

Array Math Functions

The Array Math Functions provide a set of basic math functions that operate on arrays of numbers rather than on scalars. On vector and superscalar architectures, such functions have a performance advantage over a loop of scalar operations.
The library includes the following array functions for double precision numbers:

o Sine of array
o Cosine of array
o Cosine and sine of array
o Exponent of array
o Logarithm of array
o Square root of array
o Reciprocal of array

LAPACK Library Contents

LAPACK is a library of linear algebra subprograms intended to solve a wide range of problems in linear algebra. LAPACK can be used to solve dense systems of linear equations, linear least squares problems, eigenvalue problems, and singular value problems. It is also useful in doing other computations such as matrix factorizations and estimations of condition numbers.

The CXML LAPACK library provides the complete LAPACK v2 package. CXML's version of LAPACK is provided as a packaged library, compiled, tested, and ready to use. Combined with the optimized BLAS Level 3 routines, the CXML LAPACK will provide optimal performance on all supported platforms. LAPACK should be used in place of LINPACK and EISPACK, because it is more efficient, accurate, and robust.

LAPACK supports both real and complex, single and double precision data.
It operates on the following types of matrices:

o Bidiagonal
o General band
o General unsymmetric
o General tridiagonal
o Hermitian
o Hermitian, packed storage
o Upper Hessenberg, generalized problem
o Upper Hessenberg
o Orthogonal
o Orthogonal, packed storage
o Symmetric/Hermitian positive definite band
o Symmetric/Hermitian positive definite
o Symmetric/Hermitian positive definite, packed storage
o Symmetric/Hermitian positive definite tridiagonal
o Symmetric band
o Symmetric, packed storage
o Symmetric tridiagonal
o Symmetric
o Triangular band
o Triangular, generalized problem
o Triangular, packed storage
o Triangular
o Trapezoidal
o Unitary
o Unitary, packed storage

LAPACK provides the following operations:

o Triangular factorization
o Unblocked triangular factorization
o Solve a system of linear equations (based on triangular factorization)
o Compute the inverse (based on triangular factorization)
o Compute a split Cholesky factorization of a symmetric/Hermitian positive definite band matrix
o Unblocked computation of inverse
o Estimate condition number
o Refine initial solution returned by solver
o Perform QR factorization without pivoting
o Unblocked QR factorization
o Solve linear least squares problem (based on QR factorization)
o Solve the linear equality constrained least squares (LSE) problem
o Solve the Gauss-Markov linear model problem
o Perform LQ factorization without pivoting
o Unblocked LQ factorization
o Solve underdetermined linear system (based on LQ factorization)
o Generate a real orthogonal or complex unitary matrix as a product of Householder matrices
o Unblocked generation of real orthogonal or unitary matrix
o Multiply a matrix by a real orthogonal or complex unitary matrix by applying a product of Householder matrices
o Unblocked version of multiplication of a matrix by a real orthogonal or
  complex unitary matrix by applying a product of Householder matrices
o Reduce a square matrix to upper Hessenberg form
o Unblocked version of square matrix reduction
o Reduce a symmetric matrix to real symmetric tridiagonal form
o Reduce a band matrix to bidiagonal form
o Unblocked version of symmetric matrix reduction
o Reduce a rectangular matrix to bidiagonal form
o Reduce a band symmetric/Hermitian matrix to tridiagonal form
o Reduce a symmetric/Hermitian-definite banded generalized eigenproblem to standard form
o Compute various norms of a complex Hermitian tridiagonal matrix
o Compute eigenvalues and optional Schur factorization or eigenvectors using QR algorithm
o Compute selected eigenvectors by inverse iteration
o Compute eigenvectors from Schur factorization
o Compute eigenvectors using the Pal-Walker-Kahan variant of the QL or QR algorithm
o For a pair of N-by-N real nonsymmetric matrices, compute the generalized eigenvalues, the real Schur form, and the left and/or right Schur vectors
o For a pair of N-by-N real nonsymmetric matrices, compute the generalized eigenvalues, and the left and/or right generalized eigenvectors
o Solve the generalized nonsymmetric eigenproblem Ax = lambda Bx
o Solve the generalized definite banded eigenproblem Ax = lambda Bx
o Solve the generalized symmetric/Hermitian-definite banded eigenproblem
o Solve the symmetric eigenproblem using divide-and-conquer algorithm
o Compute singular values and, optionally, singular vectors using the QR algorithm
o Compute the generalized (quotient) singular value decomposition
o Compute the generalized singular value decomposition (GSVD) on the M-by-N matrix A and P-by-N matrix B
o Solve a generalized linear regression model problem

Sparse System Solver Subprograms

The CXML Sparse System Solver library contains a set of subprograms that can be used to solve sparse linear systems of equations.
Two packages providing direct and iterative methods are supported.

Direct Method Sparse Solver Package:

The direct solver package includes skyline (profile) solvers for symmetric and nonsymmetric matrices. Separate factorization and solver routines are provided to allow repeated use of the solver for multiple right hand sides, without repeating the factorization. To make the subprograms easier to use, both simple and expert driver routines are provided. Functions provided include:

o LDU factorization
o Solve
o Norm evaluation
o Condition number estimation
o Iterative refinement
o Simple and expert drivers

These storage schemes are supported for symmetric and nonsymmetric matrices:

o Profile-in storage
o Structurally symmetric, profile-in storage (for nonsymmetric only)
o Diagonal-out storage

Iterative Method Sparse Solver Package:

For the iterative method, the library provides a modular set of storage schemes, preconditioners, and solvers. These solvers and preconditioners are easily accessed through an integrated driver routine. Six iterative sparse solvers for real, double precision data are supplied:

o Preconditioned conjugate gradient method
o Preconditioned least squares conjugate gradient method
o Preconditioned biconjugate method
o Preconditioned conjugate gradient squared method
o Preconditioned generalized minimum residual method
o Preconditioned transpose free QMR method

Routines for three storage schemes are provided, or the user can develop routines to employ a custom storage scheme. The supplied storage schemes include:

o Symmetric diagonal
o Unsymmetric diagonal
o General storage by rows

Three preconditioners are supplied, which can be selectively applied to the data. Users can also supply custom preconditioners.
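To make the combination of solver and preconditioner concrete, the following Python sketch applies the conjugate gradient method with a simple diagonal (Jacobi) preconditioner to a small symmetric positive definite system. It assumes a dense list-of-rows matrix rather than any of the sparse storage schemes named above, and the function name, arguments, and stopping rule are choices made for this example, not the interface of the CXML driver routine.

```python
# Minimal sketch of diagonally preconditioned conjugate gradient.
# Assumes A is symmetric positive definite and stored densely as a list of rows;
# this is a textbook formulation, not the CXML implementation.

def pcg_diagonal(a, b, tol=1e-10, max_iter=1000):
    """Approximately solve A x = b, using the inverse diagonal of A as preconditioner."""
    n = len(b)

    def matvec(v):
        return [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / a[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:               # stop on small residual norm
            break
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```

Only the two lines that compute z depend on the preconditioner, which gives some feel for why a modular design can let users substitute their own.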
The preconditioners supplied include:

o Diagonal
o Polynomial (Neumann)
o Incomplete LU with zero diagonals added

Sorting Subprograms

Two sort subprograms using the Quicksort algorithm and two general purpose radix sort subprograms are provided, as follows:

o Sort elements of a vector using the Quicksort algorithm
o Sort an indexed vector of data using the Quicksort algorithm
o Sort data using a radix sort algorithm
o Sort an indexed vector of data using a radix sort algorithm

All of the above sorts operate on data stored in memory.

Random Number Subprograms

CXML provides four random number generator subprograms:

o Produce a vector of uniform [0,1], long-period random numbers using the L'Ecuyer multiplicative method. Two auxiliary input routines are provided to allow this subprogram to be called from within a parallel section of a program.
o Produce a vector of N(0,1), normally-distributed random numbers. Two auxiliary input routines are provided to allow this subprogram to be called from within a parallel section of a program.
o Produce single precision random numbers using a linear multiplicative algorithm
o Produce single precision random numbers using a Lehmer multiplicative generator

Signal Processing Subprograms

The CXML Signal Processing library contains a set of subprograms in four basic areas of signal processing:

o Fast Fourier Transforms (FFT)
o Fast Cosine and Fast Sine Transforms (FCT and FST)
o Convolution and correlation
o Digital filters

Fast Fourier Transforms and Cosine and Sine Transforms

CXML provides one-dimensional, two-dimensional, three-dimensional, and group FFT routines and one-dimensional FCT/FST routines. Each routine is supplied in two forms:

o The first form computes the transform in one unit operation. This is convenient for programs requiring speed on only one or a few operations.
o The second form is provided for programs requiring speed on repeated operations. With this form, each routine is subdivided into three routines. One routine builds the rotation factors, a second routine applies them to perform the transform, and a third routine deallocates any virtual memory allocated in the first routine. Thus, for repeated operations, the rotation factors need to be built only once.

Convolution and Correlation

CXML provides routines for computing one-dimensional discrete convolutions and correlations. These routines can process both periodic and nonperiodic data.

Digital Filters

CXML provides support for one-dimensional, nonrecursive digital filtering. Based on Kaiser's Sinh-Bessel algorithm, these routines allow programming of bandpass, bandstop, low-pass, and high-pass filters.

Cray SciLib Portability Support

SCIPORT is a Compaq Computer Corporation implementation of v7 of the Cray Research scientific numerical library, SciLib. SCIPORT provides 64-bit single-precision and 64-bit integer interfaces to underlying CXML routines for Cray users porting programs to Alpha systems running Compaq Tru64 UNIX. SCIPORT also provides equivalent versions of almost all Cray Math Library and CF77 (Cray Fortran 77) Math intrinsic routines.

In order to be completely source code compatible with SciLib, the SCIPORT library calling sequence supports 64-bit integers passed by reference. However, internally, SCIPORT uses 32-bit integers. Consequently, some run-time uses of SciLib may not be supported by SCIPORT.
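
The integer-size limit described above can be illustrated with a short sketch. This is not SCIPORT or CXML code; the names INT32_MIN, INT32_MAX, fits_in_int32, and narrow_to_int32 are hypothetical and only model what happens when an argument declared as a 64-bit integer must be held in a 32-bit integer internally.

```python
# Hypothetical illustration only; these names are not part of SCIPORT or CXML.
# Models a 64-bit integer argument that must fit a 32-bit internal integer.

INT32_MIN = -2**31       # smallest signed 32-bit integer
INT32_MAX = 2**31 - 1    # largest signed 32-bit integer


def fits_in_int32(value):
    """Return True if value can be represented as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def narrow_to_int32(value):
    """Return value unchanged if it fits in 32 bits, otherwise raise an error."""
    if not fits_in_int32(value):
        raise OverflowError(
            "value %d does not fit in a signed 32-bit integer" % value
        )
    return value
```

Under this model, an array length of 2**31 - 1 is accepted, while a length of 2**31 is rejected even though it is a valid 64-bit integer. A caller porting code that relies on larger integer values would need to check for this case.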
SCIPORT provides the following:

o 64-bit versions of all Cray SciLib single-precision BLAS Level 1, Level 2, and Level 3 routines
o All Cray SciLib LAPACK routines
o All Cray SciLib Special Linear System Solver routines
o All Cray SciLib Signal Processing routines
o All Cray SciLib Sorting and Searching routines

These routines are completely interchangeable with their Cray SciLib counterparts up to the runtime limit on integer size, and with the exception of the ORDERS routine, require no program changes to function correctly. Owing to endian differences of machine architecture, special considerations must be given when the ORDERS routine is used to sort multibyte character strings.

RUN-TIME LIBRARY REDISTRIBUTION

The Compaq Fortran kit may include updated Run-Time Library shareable images. Compaq grants the user a nonexclusive royalty-free worldwide right to reproduce and distribute the executable version of the Run-Time Library (the "RTLs"), provided that the user does all of the following:

o Distributes the RTLs only in conjunction with and as a part of the user's software application product that is designed to operate in the Compaq Tru64 UNIX environment.
o Does not use the name, logo, or trademarks of Compaq to market the user's software application product.
o Includes the copyright notice of Compaq Fortran on the user's product disk label and/or on the title page of the documentation for the software application product.
o Agrees to indemnify, hold harmless, and defend Compaq from and against any claims or lawsuits, including attorney's fees, that arise or result from the use or distribution of the software application product.

Except as expressly provided herein, Compaq grants no implied or express license under any of its patents, copyrights, trade secrets, trademarks, or any license or other proprietary interests and rights.
For Compaq Tru64 UNIX Alpha systems, the RTL images are designated as:

o libfor.a, libfor.so
o libUfor.a, libUfor.so
o libFutil.a, libFutil.so
o libshpf.a, libshpf.so (Compaq Fortran 95/90 only)
o for_msg.cat
o libcxml_ev4.a, libcxml_ev4.so
o libcxml_ev5.a, libcxml_ev5.so
o libcxml_ev6.a, libcxml_ev6.so
o libcxmlp_ev4.a, libcxmlp_ev4.so (Compaq Fortran 95/90 only)
o libcxmlp_ev5.a, libcxmlp_ev5.so (Compaq Fortran 95/90 only)
o libcxmlp_ev6.a, libcxmlp_ev6.so (Compaq Fortran 95/90 only)

HARDWARE REQUIREMENTS

Processors Supported:

Any Alpha system that is capable of running Compaq Tru64 UNIX Version 4.0 or newer.

Disk Space Requirements:

Disk space required for Compaq Fortran installation:

Root file system:    /      0 MB
Other file systems:  /usr  71 MB
                     /tmp   1 MB
                     /var   0 MB

Disk space required for Compaq Fortran use (permanent):

Root file system:    /      0 MB
Other file systems:  /usr  67 MB
                     /var   0 MB

To install the Compaq Extended Math Library (CXML), an additional 75 MB (or more) is needed in the /usr file system.

These sizes are approximate; actual sizes may vary depending on the user's system environment, configuration, and software options.

Memory Requirements:

For systems used to compile a program for parallel execution with the -wsf flag, a minimum of 64 MB of physical memory is recommended.

SOFTWARE REQUIREMENTS

o Compaq Tru64 UNIX Operating System Version 4.0, 5.0A, or 5.1 (SPD 41.61.xx)
o Compaq Tru64 UNIX Developers' Toolkit Version 4.0, 5.0A, or 5.1 (SPD 44.36.xx) for Compaq Tru64 UNIX Version 4.0, 5.0A, or 5.1 respectively.

SOFTWARE LICENSING INFORMATION

This software is furnished only under license. For more information about licensing terms and policies of Compaq, contact your local Compaq office.

LICENSE MANAGEMENT FACILITY SUPPORT

Compaq Fortran supports the License Management Facility of Compaq.
License units for Compaq Fortran are allocated on an Unlimited System Use plus Concurrent Use basis.

Each Concurrent Use license allows any one individual at a time to use the layered product.

For more information on the License Management Facility, refer to the Compaq Tru64 UNIX Operating System Software Product Description (SPD 41.61.xx) or to the Compaq Tru64 UNIX Operating System documentation set.

OPTIONAL SOFTWARE

Compaq Parallel Software Environment (SPD 51.09.06) is required to link and execute High Performance Fortran programs compiled for parallel execution, using the -wsf compiler option, on multiple Alpha systems. The Compaq Parallel Software Environment is available only for Compaq Alpha systems running Compaq Tru64 UNIX.

Compaq KAP[TM] Fortran/OpenMP V4.2 for Compaq Tru64 UNIX (for Compaq Fortran 95/90 compiler; see SPD 45.73.xx)

DIGITAL[TM] KAP for DIGITAL Fortran V3.2 for Compaq Tru64 UNIX (for Compaq Fortran 77 compiler)

Using directed parallel processing (such as the OpenMP directives) requires Compaq Tru64 UNIX Version 4.0D or higher.

GROWTH CONSIDERATIONS

The minimum hardware/software requirements for any future version of this product may be different from the requirements for the current version.

DISTRIBUTION MEDIA

Media and documentation for Compaq Fortran are available on the Compaq Software Product Library CD-ROM set for Compaq Tru64 UNIX Products (QA-054AA-H8) or a CD-ROM containing only the Compaq Fortran for Compaq Tru64 UNIX Alpha Systems product (QA-MV2AA-H8).

Documentation in printed format can be ordered separately (see the Compaq Fortran "read first" cover letter or the online release notes).

SOFTWARE WARRANTY

This software is provided by Compaq with a 90-day conformance warranty in accordance with the Compaq warranty terms applicable to the license purchase.
The above information is valid at time of release. Please contact your local Compaq office for the most up-to-date information.

ORDERING INFORMATION

Software Licenses:

Unlimited System Use:  QL-MV2A*-AA
Personal Use:          QL-MV2AM-2B
Concurrent Use:        QL-MV2AM-3B
Concurrent 5 Pack:     QL-MV2AM-3C
Concurrent 10 Pack:    QL-MV2AM-3D

Software Documentation:

Compaq Fortran 95/90 Documentation:  QA-MV2AA-GZ
Compaq Fortran 77 Documentation:     QA-MV2AB-GZ

Software Product Services: QT-MV2*-**

* Denotes variant fields. For additional information on available licenses, services, and media, refer to the appropriate price book.

The above information is valid at time of release. Please contact your local Compaq office for the most up-to-date information.

SOFTWARE PRODUCT SERVICES

A variety of service options are available from Compaq. For more information, contact your local Compaq office.

The above information is valid at time of release. Please contact your local Compaq office for the most up-to-date information.

TRADEMARK INFORMATION

Copyright © 2002 Compaq Information Technologies Group, L.P.

Compaq, the COMPAQ logo, DIGITAL, OpenVMS, Tru64, VAX, and VAX FORTRAN are trademarks of Compaq Information Technologies Group, L.P. in the United States and/or other countries.

Microsoft, NT, and Windows are trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries.

Intel and KAP are trademarks or registered trademarks of Intel Corporation in the United States and/or other countries.

CRAY is a registered trademark of Cray Research, Inc.
in the United States and/or other countries; IBM is a registered trademark of International Business Machines Corporation in the United States and/or other countries; IEEE is a registered trademark of the Institute of Electrical and Electronics Engineers, Inc. in the United States and/or other countries; Linux is a registered trademark of Linus Torvalds in the United States and/or other countries; OpenMP is a trademark of the OpenMP Architecture Review Board in the United States and/or other countries; UNIX is a trademark of The Open Group in the United States and/or other countries.

All other product names mentioned herein may be trademarks or registered trademarks of their respective companies.

Confidential computer software. Valid license from Compaq required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

Compaq shall not be liable for technical or editorial errors or omissions contained herein. The information in this document is provided "as is" without warranty of any kind and is subject to change without notice. The warranties for Compaq products are set forth in the express limited warranty statements accompanying such products. Nothing herein should be construed as constituting an additional warranty.