DIGITAL                              Software Product Description
___________________________________________________________________

PRODUCT NAME: Digital Extended Math Library          SPD 41.84.04
              Version 2.9 for OpenVMS Alpha

April 1996

DESCRIPTION

Digital Extended Math Library (DXML) is a set of mathematical
subroutines that are optimized for Digital architectures. Four
libraries are included, covering the areas of Basic Linear Algebra,
Linear System and Eigenproblem Solvers, Sparse Linear System Solvers,
and Signal Processing.

The Basic Linear Algebra Library includes the industry-standard Basic
Linear Algebra Subprograms (BLAS) Level 1, Level 2, and Level 3. Also
included are subprograms for BLAS Level 1 Extensions, Sparse BLAS
Level 1, and Array Math Functions.

The Linear System and Eigenproblem Solver Library provides the
complete LAPACK package developed by a consortium of university and
government laboratories. LAPACK is a new, industry-standard package
offering an extensive set of linear system and eigenproblem solvers.
LAPACK uses blocked algorithms that are better suited to most modern
architectures, particularly ones with memory hierarchies. LAPACK will
supersede LINPACK and EISPACK for most users.

The Sparse Linear System Library provides both direct and iterative
sparse linear system solvers. The direct solver package provides
solvers for both symmetric and nonsymmetric sparse matrices, stored
using the skyline storage scheme. The iterative solver package
provides a basic set of storage methods, preconditioners, and
iterative solvers. The design of this package is modular and
matrix-free, allowing future expansion and easy modification by users.

The Signal Processing Library provides a basic set of signal
processing functions. Included are one-, two-, and three-dimensional
Fast Fourier Transforms (FFT), group FFT, Fast Cosine/Sine Transforms
(FCT/FST), Convolution, Correlation, and Digital Filters.

Many DXML subprograms are optimized for the supported hardware
platforms. Techniques include traditional optimizations such as loop
unrolling and loop reordering. DXML subroutines also provide efficient
management of the hierarchical memory system, using techniques such as
the following:

o  Reuse of data within registers to minimize memory accesses

o  Efficient cache management

o  Use of blocked algorithms that minimize translation buffer misses
   and unnecessary paging

Since DXML routines can be called from all languages that support the
OpenVMS calling standard, the library provides optimized computation
for applications written in these languages.

Where appropriate, all routines are available in both real and complex
versions, as well as in both single and double precision. Both VAX and
IEEE floating point formats are supported.

DXML Run-Time Only Option

The DXML Development Option allows the user to build and run
applications on the system on which the DXML Development Option
library and license are installed. To allow an application to run on
another system without requiring the purchase of another Development
Option, Digital provides the DXML Run-Time Only Option. Each
additional target system must have a DXML Run-Time Only Option (shared
library and license) installed in order to run applications built with
the Development Option. The DXML Run-Time Only Option does not permit
new applications to be developed. Only one Option type at a time may
be present on a system.

BLAS Library Contents

Linear algebra operations are fundamental to many mathematical
applications, and several libraries of linear algebra subroutines
exist throughout the computer industry. The DXML BLAS library contains
the most commonly used linear algebra subroutines.

The DXML linear algebra library contains five groups of subroutines at
three levels:

o  Basic Linear Algebra Subroutines (BLAS) Level 1

o  BLAS Level 1 Extensions

o  BLAS Level 1 Sparse Extensions

o  BLAS Level 2

o  BLAS Level 3

In addition, a set of Array Math functions is provided.

BLAS Level 1 (Scalar/Vector and Vector/Vector Operations)

BLAS Level 1 provides a set of elementary vector functions, operating
on one or two vectors. These are typically very small subroutines and
depend on assembly language or compiler inlining to make them
efficient.

DXML contains the 15 standard BLAS Level 1 operations:

o  The index of the element of a vector having maximum absolute value

o  The sum of the absolute values of the elements of a vector

o  Inner product of two real vectors

o  The sum of a scalar and the extended precision inner product of two
   real vectors

o  Conjugated inner product of two complex vectors

o  Unconjugated inner product of two complex vectors

o  Square root of the sum of squares (norm) of the elements of a
   vector

o  The sum of a scalar times a vector and another vector
   (y = a x + y)

o  Copy one vector to another

o  Apply a Givens rotation

o  Apply a modified Givens plane rotation

o  Generate elements for a Givens plane rotation

o  Generate elements for a modified Givens plane rotation

o  Product of a vector times a scalar

o  Swap the elements of two vectors

BLAS Level 1 Extensions (Vector/Vector Operations)

When developing mathematical algorithms using the BLAS Level 1,
scientists and engineers found that several additional constructs were
used on a regular basis. These constructs have also become well known
throughout the computer industry and are known as BLAS Level 1
extensions.

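As a rough illustration of the kind of construct meant here, the
following Python sketch implements three such operations on plain
lists. The function names are hypothetical and are not DXML entry
points; the sketch also uses 0-based indices, whereas a
Fortran-callable library would normally report 1-based positions.

```python
def index_of_min_abs(x):
    """Index (0-based) of the element with the smallest absolute value."""
    return min(range(len(x)), key=lambda i: abs(x[i]))


def scaled_sum(a, x, y):
    """Return z = a*x + y without overwriting either input vector."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return [a * xi + yi for xi, yi in zip(x, y)]


def sum_of_squares(x):
    """Sum of the squares of the elements of a vector."""
    return sum(v * v for v in x)


x = [3.0, -0.5, 2.0]
y = [1.0, 1.0, 1.0]
print(index_of_min_abs(x))     # 1
print(scaled_sum(2.0, x, y))   # [7.0, 0.0, 5.0]
print(sum_of_squares(x))       # 13.25
```
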
DXML contains 13 BLAS Level 1 extension operations:

o  Index of element having the minimum absolute value

o  Index of element having the maximum value

o  Index of element having the minimum value

o  Largest value of the elements of a vector

o  Smallest value of the elements of a vector

o  Largest absolute value of the elements of a vector

o  Smallest absolute value of the elements of a vector

o  Sum of the values of the elements of a vector

o  Set all elements of a vector equal to a scalar

o  Set a vector equal to the product of a constant and another vector
   (y = a x)

o  Euclidean norm with no intermediate scaling

o  Sum of the squares of the elements of a vector

o  Set a vector equal to the sum of a constant times a vector plus a
   vector (z = a x + y)

BLAS Level 1 Sparse Extensions (Vector/Vector Operations)

This group of operations is similar to the BLAS Level 1 routines, but
is designed to work on sparse vectors (vectors in which most of the
elements are zero). Six of the routines are equivalent to the dense
Level 1 routines, but are modified to work on vectors stored in a
compressed form. Three additional routines are added for convenience.

The nine sparse BLAS Level 1 operations are:

o  Scalar times a sparse vector plus a vector

o  Sum of a sparse vector and a full vector

o  Inner product of a sparse vector and a full vector

o  Gather a sparse vector from a full vector

o  Gather a sparse vector from the scaled elements of a full vector

o  Gather a sparse vector from a full vector and zero the
   corresponding elements of a full vector

o  Apply Givens rotation to a sparse vector and a full vector

o  Scatter a sparse vector into a full vector

o  Scale and scatter a sparse vector into a full vector

BLAS Level 2 (Matrix/Vector Operations)

The BLAS Level 2 codes make more effective use of the data in the
registers, reducing the number of register loads and stores required.
In addition, loop unrolling techniques are used to minimize cache
misses and page faults.

The BLAS Level 2 subroutines use the following types of operations:

o  Matrix/vector products

o  Rank-1 and rank-2 matrix updates

o  Solutions of triangular systems of equations

Six types of matrices are supported by the BLAS Level 2 routines:

o  General

o  General band

o  Symmetric/Hermitian

o  Symmetric/Hermitian band

o  Triangular

o  Triangular band

BLAS Level 3 (Matrix/Matrix Operations)

The BLAS Level 3 routines operate at a level that makes the most
efficient use of machine resources. DXML optimizes these routines by
partitioning matrices into blocks and computing matrix/matrix
operations on each block. This approach avoids excessive memory access
by providing full reuse of data while each block is in the cache or
the registers.

BLAS Level 3 routines provide this data blocking for three basic types
of operations:

o  Matrix/matrix products

o  Rank-k and rank-2k updates of a symmetric matrix

o  Solving triangular systems of equations with multiple right-hand
   sides

Three types of matrices are supported by these BLAS Level 3 routines:

o  General

o  Symmetric/Hermitian

o  Triangular

A set of additional matrix/matrix routines is provided for
convenience:

o  Add two matrices

o  Subtract one matrix from another

o  Transpose a matrix, in-place or out-of-place

Array Math Functions

The Array Math Functions provide a set of basic math functions that
operate on arrays of numbers rather than on scalars. On vector and
superscalar architectures, such functions have a performance advantage
over a loop of scalar operations.

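The calling pattern can be sketched in Python as follows. This only
illustrates the interface idea (one call processes a whole array
instead of the caller looping over scalars); it says nothing about
performance, and the function names are hypothetical rather than DXML
routine names.

```python
import math


def array_apply(func, x):
    """Apply a scalar math function to every element of an array."""
    return [func(v) for v in x]


def array_sin(x):
    """Sine of each element, computed in a single call."""
    return array_apply(math.sin, x)


def array_reciprocal(x):
    """Reciprocal of each element, computed in a single call."""
    return [1.0 / v for v in x]


angles = [0.0, math.pi / 2.0]

# Caller-side loop of scalar operations.
loop_result = []
for a in angles:
    loop_result.append(math.sin(a))

# Equivalent single array call.
array_result = array_sin(angles)

print(array_result == loop_result)
print(array_reciprocal([2.0, 4.0]))
```

An optimized library is free to restructure the work inside such a
call, which is not possible when the caller invokes the scalar
function once per element.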
DXML includes the following array math functions for double precision
numbers:

o  Sine of array

o  Cosine of array

o  Cosine and sine of array

o  Exponential of array

o  Logarithm of array

o  Square root of array

o  Reciprocal of array

LAPACK Library Contents

LAPACK is a library of linear algebra subroutines intended to solve a
wide range of problems in linear algebra. LAPACK can be used to solve
dense systems of linear equations, linear least squares problems,
eigenvalue problems, and singular value problems. It is also useful in
doing other computations such as matrix factorizations and estimations
of condition numbers.

The DXML LAPACK Library provides the complete LAPACK package as a
shared image: compiled, tested, and ready to use. Combined with the
optimized BLAS Level 3 routines, DXML LAPACK provides optimal
performance on all supported platforms.

LAPACK should be used in place of LINPACK and EISPACK because it is
more efficient, accurate, and robust than those older packages.

LAPACK supports both real and complex, single and double precision
data.
It operates on the following types of matrices:

o  Bidiagonal

o  General band

o  General unsymmetric

o  General tridiagonal

o  Hermitian

o  Hermitian, packed storage

o  Upper Hessenberg, generalized problem

o  Upper Hessenberg

o  Orthogonal

o  Orthogonal, packed storage

o  Symmetric/Hermitian positive definite band

o  Symmetric/Hermitian positive definite

o  Symmetric/Hermitian positive definite, packed storage

o  Symmetric/Hermitian positive definite tridiagonal

o  Symmetric band

o  Symmetric, packed storage

o  Symmetric tridiagonal

o  Symmetric

o  Triangular band

o  Triangular, generalized problem

o  Triangular, packed storage

o  Triangular

o  Trapezoidal

o  Unitary

o  Unitary, packed storage

LAPACK provides the following operations:

o  Triangular factorization

o  Unblocked triangular factorization

o  Solve a system of linear equations (based on triangular
   factorization)

o  Compute the inverse (based on triangular factorization)

o  Unblocked computation of inverse

o  Estimate condition number

o  Refine initial solution returned by solver

o  Perform QR factorization without pivoting

o  Unblocked QR factorization

o  Solve linear least squares problem (based on QR factorization)

o  Solve the linear equality constrained least squares (LSE) problem

o  Perform LQ factorization without pivoting

o  Unblocked LQ factorization

o  Solve underdetermined linear system (based on LQ factorization)

o  Generate a real orthogonal or complex unitary matrix as a product
   of Householder matrices

o  Unblocked generation of real orthogonal or unitary matrix

o  Multiply a matrix by a real orthogonal or complex unitary matrix by
   applying a product of Householder matrices

o  Unblocked version of multiplication of a matrix by a real
   orthogonal or complex unitary matrix by applying a product of
   Householder matrices

o  Reduce a square matrix to upper Hessenberg form

o  Unblocked version of square matrix reduction

o  Reduce a symmetric matrix to real symmetric tridiagonal form

o  Unblocked version of symmetric matrix reduction

o  Reduce a rectangular matrix to bidiagonal form

o  Compute various norms of a complex Hermitian tridiagonal matrix

o  Compute eigenvalues and optional Schur factorization or
   eigenvectors using QR algorithm

o  Compute selected eigenvectors by inverse iteration

o  Compute eigenvectors from Schur factorization

o  Compute eigenvectors using the Pal-Walker-Kahan variant of the QL
   or QR algorithm

o  For a pair of N-by-N real nonsymmetric matrices, compute the
   generalized eigenvalues, the real Schur form, and the left and/or
   right Schur vectors

o  For a pair of N-by-N real nonsymmetric matrices, compute the
   generalized eigenvalues, and the left and/or right generalized
   eigenvectors

o  Compute singular values and, optionally, singular vectors using the
   QR algorithm

o  Compute the generalized singular value decomposition (GSVD) on the
   M-by-N matrix A and P-by-N matrix B

o  Solve a generalized linear regression model problem

Sparse System Solver Library Contents

The DXML Sparse System Solver Library contains a set of subroutines
that may be used to solve sparse linear systems of equations. Two
packages providing direct and iterative methods are supported.

Direct Method Sparse Solver Package

The direct solver package provides skyline (profile) solvers for
symmetric and nonsymmetric matrices. Separate factorization and solver
routines are provided to allow repeated use of the solver for multiple
right-hand sides, without repeating the factorization. To make the
library easier to use, both simple and expert driver routines are
provided.

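The benefit of separating factorization from solution can be sketched
in Python. This is a minimal illustration under stated assumptions: it
uses a small dense matrix and LU factorization with partial pivoting,
not the skyline storage or the LDU routines of DXML, and the names
lu_factor and lu_solve are hypothetical.

```python
def lu_factor(a):
    """Factor a dense square matrix once; return (lu, perm).

    lu holds the unit lower triangle L (below the diagonal) and U
    (on and above the diagonal); perm records the row interchanges.
    """
    n = len(a)
    lu = [row[:] for row in a]          # work on a copy
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: pick the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]        # multiplier, stored in place of L
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b using an existing factorization (no refactoring)."""
    n = len(lu)
    y = [0.0] * n
    for i in range(n):                  # forward substitution, L y = P b
        y[i] = b[perm[i]] - sum(lu[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):        # back substitution, U x = y
        s = sum(lu[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / lu[i][i]
    return x


a = [[4.0, 3.0, 0.0],
     [3.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]

lu, perm = lu_factor(a)                 # the expensive step, done once

# The cheaper solve step is repeated for each right-hand side.
x1 = lu_solve(lu, perm, [24.0, 30.0, -24.0])   # approximately [3, 4, -5]
x2 = lu_solve(lu, perm, [7.0, 6.0, 3.0])       # approximately [1, 1, 1]
print(x1)
print(x2)
```

For an n-by-n dense matrix the factorization costs on the order of
n**3 operations while each solve costs on the order of n**2, which is
why reusing the factors pays off when there are many right-hand sides.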
Functions provided include:

o  LDU factorization

o  Solver

o  Norm evaluation

o  Condition number estimation

o  Iterative refinement

o  Simple and expert drivers

These storage schemes are supported for symmetric and nonsymmetric
matrices:

o  Profile-in storage

o  Structurally symmetric, profile-in storage (for nonsymmetric only)

o  Diagonal-out storage

Iterative Method Sparse Solver Package

The iterative method package provides a modular set of storage
schemes, preconditioners, and solvers. These solvers and
preconditioners are easily accessed through an integrated driver
routine.

Six iterative sparse solvers for real, double precision data are
supplied:

o  Preconditioned conjugate gradient method

o  Preconditioned least squares conjugate gradient method

o  Preconditioned bi-conjugate gradient method

o  Preconditioned conjugate gradient squared method

o  Preconditioned generalized minimum residual method

o  Preconditioned transpose-free QMR method

Routines for three storage schemes are provided, or the user may
develop routines to employ a custom storage scheme. The supplied
storage schemes include:

o  Symmetric diagonal

o  Unsymmetric diagonal

o  General storage by rows

Three preconditioners are supplied, which can be selectively applied
to the data. Users may also supply custom preconditioners. The
preconditioners supplied include:

o  Diagonal

o  Polynomial (Neumann)

o  Incomplete LU with zero diagonals added

Signal Processing Library Contents

The DXML Signal Processing Library routines cover four basic areas of
signal processing:

o  Fast Fourier Transforms (FFT)

o  Fast Cosine and Fast Sine Transforms (FCT and FST)

o  Convolution and Correlation

o  Digital Filters

Fast Fourier Transforms and Fast Cosine/Sine Transforms

DXML provides one-dimensional, two-dimensional, three-dimensional, and
group FFT routines and one-dimensional FCT/FST routines.
Each routine is supplied in two forms:

o  The first form computes the transform in one unit operation. This
   is convenient for programs requiring speed on only one or a few
   operations.

o  The second form is provided for programs requiring speed on
   repeated operations. With this form, each routine is subdivided
   into three routines. One routine builds the rotation factors, a
   second routine applies them to the data to perform the transform,
   and a third routine deallocates any virtual memory allocated in the
   first routine. Thus, for repeated operations, the rotation factors
   need to be built only once.

Convolution and Correlation

DXML provides routines for computing one-dimensional discrete
convolutions and correlations. These routines can process both
periodic and nonperiodic data.

Digital Filters

DXML provides support for one-dimensional, nonrecursive digital
filtering. Based on Kaiser's Sinh-Bessel algorithm, these routines
allow programming of bandpass, bandstop, low-pass, and high-pass
filters.

HARDWARE REQUIREMENTS

DXML will operate on any AlphaStation or AlphaServer capable of
running OpenVMS Alpha.

DXML version 2.9 provides two versions of the libraries, one for each
of the Alpha EV4 and EV5 processor implementations. Both versions of
the libraries will function correctly on either EV4 or EV5 processors,
but may exhibit some performance loss when not run on the designated
implementation.

Disk Space Requirements (Block Cluster Size = 1)

Development Option

   Disk space required for installation:     50,000 blocks (25 MB)
   Disk space required for use (permanent):  45,000 blocks (22.5 MB)

Run-Time Option

   Disk space required for installation:     50,000 blocks (25 MB)
   Disk space required for use (permanent):  42,000 blocks (21 MB)

These counts refer to the disk space required on the system disk.
The sizes are approximate; actual sizes may vary depending on the
user's system environment, configuration, and software options.

SOFTWARE REQUIREMENTS

OpenVMS Alpha Operating System V6.0-V7.0

VMS Tailoring

The following OpenVMS classes are required for full functionality of
this layered product:

o  OpenVMS Required Saveset

o  Programming Support

o  Utilities

For more information on OpenVMS classes and tailoring, refer to the
OpenVMS Alpha Operating System Software Product Description
(SPD 41.87.xx).

GROWTH CONSIDERATIONS

The minimum hardware/software requirements for any future version of
this product may be different from the requirements for the current
version.

DISTRIBUTION MEDIA

CD-ROM

This product is also available as part of the OpenVMS Alpha
Consolidated Software Distribution on CD-ROM (QA-03XAA-H8).

The software documentation for this product is available as part of
the OpenVMS Online Documentation Library on CD-ROM or as a separate
hardcopy documentation kit.

ORDERING INFORMATION

Development Option

   Software Licenses:          QL-MUVA*-**
   Software Media:             QA-MUVAA-H8
   Software Documentation:     QA-MUVAA-GZ
   Software Product Services:  QT-MUVA*-**

Run-Time Option

   Software Licenses:          QL-MUWA*-**
   Software Media:             QA-MUWAA-H8
   Software Documentation:     QA-MUWAA-GZ
   Software Product Services:  QT-MUWA*-**

*  Denotes variant fields. For additional information on available
   licenses, services, and media, refer to the appropriate price book.

SOFTWARE LICENSING

This software is only furnished under a license. For more information
about Digital's licensing terms and policies, contact your local
Digital office.

License Management Facility Support

This layered product supports the OpenVMS License Management Facility.

License units for this product are allocated on an Unlimited Use
Basis.
For more information on the License Management Facility, refer to the
OpenVMS Alpha Operating System Software Product Description
(SPD 41.87.xx) or to the OpenVMS Alpha Operating System documentation
set.

SOFTWARE PRODUCT SERVICES

A variety of service options are available. For more information,
please contact your local Digital office.

SOFTWARE WARRANTY

Warranty for this software product is provided by Digital with the
purchase of a license for the product as defined in the Software
Warranty Addendum.

The above information is valid at time of release. Please contact your
local Digital office for the most up-to-date information.

[TM] The DIGITAL Logo, AlphaGeneration, DEC, DECwindows, Digital, and
     OpenVMS are trademarks of Digital Equipment Corporation.

©1996 Digital Equipment Corporation. All rights reserved.