This chapter presents additional information about the Compaq KAP C
command-line switches and inline pragmas used to inline functions or to
perform interprocedural analysis (IPA).
6.1 Overview
Inlining is the process of replacing a function reference with the text of the function. Inlining eliminates the overhead of the function call, and can assist other optimizations by making relationships between function arguments, returned values, and the surrounding code easier to find.
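As a minimal hand-written sketch of the idea (the function names scale, sum_with_call, and sum_inlined are hypothetical, and this is not KAP output), the second loop below shows roughly what the first looks like once the function reference has been replaced with the text of the function:

```c
/* Hypothetical example: a small function and a loop that calls it. */
static double scale(double x, double factor)
{
    return x * factor;
}

/* Original form: one function call per iteration. */
double sum_with_call(const double *v, int n, double factor)
{
    double total = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        total += scale(v[i], factor);
    }
    return total;
}

/* Hand-inlined form: the call is replaced by the body of scale(), so the
   relationship between v[i] and factor is visible inside the loop. */
double sum_inlined(const double *v, int n, double factor)
{
    double total = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        total += v[i] * factor;
    }
    return total;
}
```

Both functions compute the same result; the inlined form simply has no call overhead and exposes the multiplication to later loop optimizations.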
IPA is the process of inspecting called functions for information on relationships between arguments, returned values, and global data. IPA can provide many of the benefits of inlining, but without replacing the function reference.
The rest of this chapter covers the inlining and IPA command-line
switches and pragmas, related command-line switches, examples of their
use, and information about program constructs that inhibit inlining.
Inlining and IPA are almost symmetrical from the command-line
standpoint: there are parallel sets of commands and pragmas for them.
The exception is -inline_depth. In many places in this chapter, the
term "inlining" applies to both inlining and IPA.
6.2 Inlining and IPA Command-Line Switches
There are two phases to inlining: defining the universe of inlinable functions and selecting which functions in that universe to inline or analyze. The -inline_from... and -ipa_from... switches define the universe of inlinable functions. The -inline/-ipa , -inline_depth , and -..._looplevel switches select which of the available functions are to be inlined/analyzed. The -inline_create and -ipa_create switches set up collections of functions for inclusion in later KAP runs.
All of the inlining and IPA command-line switches are listed in the
following sections. The short forms of their names are in brackets.
6.2.1 -Inline_from and -ipa_from Switches
The -inline_from and -ipa_from switches are:
-inline_from_files=<list> [-inff]
-inline_from_libraries=<list> [-infl]
-ipa_from_files=<list> [-ipaff]
-ipa_from_libraries=<list> [-ipafl]
Where <list> is one or more of the following: source file name, library file name, directory, separated by commas. The default is the current source file.
You can distinguish types of files by their extensions. The option -inline_from_files=xj.c,yy.c,../mrtn/ would look for functions in the C source files xj.c and yy.c , and in C source files in the directory ../mrtn . All source files that contain C preprocessor directives must be preprocessed by cc before being inlined.
The -..._libraries versions of these switches take as their arguments lists of function libraries and directories containing such libraries.
KAP recognizes the type of file from its extension, or lack of one, as follows:
.c --- C source
.klib --- Library from -inline_create/-ipa_create (see Section 6.2.2)
other --- Directory
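The extension rule above can be modeled in a few lines of C. This is only an illustrative sketch: the names entry_kind and classify_entry are hypothetical, the code does not reflect how KAP is implemented, and it does not check whether the file or directory exists.

```c
#include <string.h>

/* Hypothetical classification of one <list> entry by its extension. */
enum entry_kind { ENTRY_C_SOURCE, ENTRY_LIBRARY, ENTRY_DIRECTORY };

/* Return nonzero if name ends with suffix. */
static int has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name);
    size_t s = strlen(suffix);

    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/* Apply the rule: .c is C source, .klib is a library,
   anything else is treated as a directory. */
enum entry_kind classify_entry(const char *name)
{
    if (has_suffix(name, ".klib")) {
        return ENTRY_LIBRARY;
    }
    if (has_suffix(name, ".c")) {
        return ENTRY_C_SOURCE;
    }
    return ENTRY_DIRECTORY;
}
```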
Two special abbreviations are defined:
"-" --- The current source file (as listed on the command line, or specified in a -input=(<file>) command-line switch).
"." --- The current working directory.
Specifying a nonexistent file or directory is a command-line error.
If multiple -inline_from... [-ipa_from...] switches are given, their lists are concatenated to get a bigger universe.
Function name references are resolved by a search in the order that
files appear in -inline_from... ( -ipa_from... ) switches on the command
line. Libraries are searched in their original lexical order. Multiple
-inline_from... ( -ipa_from... ) lists are searched in the order that
they appear on the command line.
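The search order can be pictured with the following sketch. It assumes that the first match found wins, and it uses a hypothetical in-memory model (struct universe_file and resolve_function are illustration names only, not part of KAP):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one file or library in the inlining universe. */
struct universe_file {
    const char *path;          /* file or library name */
    const char *const *funcs;  /* function names, in lexical order */
    size_t nfuncs;
};

/* Return the path of the first file, in command-line order, that defines
   name, or NULL if no file in the universe defines it. */
const char *resolve_function(const struct universe_file *files,
                             size_t nfiles, const char *name)
{
    size_t i, j;

    for (i = 0; i < nfiles; i++) {
        for (j = 0; j < files[i].nfuncs; j++) {
            if (strcmp(files[i].funcs[j], name) == 0) {
                return files[i].path;
            }
        }
    }
    return NULL;
}
```

Under this model, if two files both define a function, the file listed earlier on the command line supplies the definition.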
6.2.2 Library Creation
Use the following switches to create a preprocessed library:
-inline_create=<library name> [-incr]
-ipa_create=<library name> [-ipacr]
To specify an existing library file to inline from, use -inline_from_libraries= or -ipa_from_libraries= instead.
The default source for functions to put into the library is the current source file. If -inline_from... or -ipa_from... is specified, the functions in the listed files are the ones put into the library. This provides a method to combine or expand libraries: just include the old library or libraries in an -inline_from_libraries or -ipa_from_libraries switch, along with an -inline_from_files or -ipa_from_files switch giving source files containing any new functions.
Functions are included in libraries in the order in which they appear in the input files. This is to make sure that if multiple functions with the same name are in the same source file, the one chosen for inlining will be the one you expect from the algorithm under -inline_from... .
A library created with -inline_create will work for inlining or IPA, because it is just partially reduced source code. However, a library created with -ipa_create may not appear in an -inline_from= list. It is flagged with a warning message.
If no library name is given, the name used is file.klib , where file is the input file name with any trailing .c stripped off.
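The default naming rule can be sketched as follows. The function name default_klib_name and its buffer-based interface are hypothetical and are shown only to make the rule concrete:

```c
#include <stdio.h>
#include <string.h>

/* Build the default library name for input into out (capacity outsz).
   Returns 0 on success, -1 if the result does not fit in out. */
int default_klib_name(const char *input, char *out, size_t outsz)
{
    size_t len = strlen(input);
    int written;

    if (len >= 2 && strcmp(input + len - 2, ".c") == 0) {
        len -= 2;   /* strip a trailing ".c" */
    }
    written = snprintf(out, outsz, "%.*s.klib", (int)len, input);
    return (written < 0 || (size_t)written >= outsz) ? -1 : 0;
}
```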
When creating a library, only one -inline_create ( -ipa_create ) switch may be given. That is, only one library may be created per KAP run. If the library file existed prior to running KAP, it is overwritten.
When -inline_create ( -ipa_create ) is specified on the command line, no transformed code file will be generated.
See the description of the -inline_from_libraries and -ipa_from_libraries switches for information about using libraries created with these switches.
If the -inline ( -ipa ) switch is not given, or is given without a list
of function names, the default will be to include all the functions
from the inlining universe in the library, if possible. If
-inline=<name list> or -ipa=<name list> is specified, only the named
functions will be included in the library.
6.2.3 Naming Specific Functions
The following switches specify names of particular functions to inline. The default is all functions in the function universe specified by any -inline_from... ( -ipa_from... ) switches, subject to the -inline_looplevel ( -ipa_looplevel ) and -inline_depth settings.
-inline[=name[,name...]] [-inl=]
-ipa[=name[,name...]] [-ipa=]
Inlining and IPA are off by default. That is, if you do not specify inlining (IPA) switches, then no inlining (IPA) will take place.
If you omit -inline ( -ipa ) from the command line, you can still enable automatic selection of functions to inline (analyze) with one of the -..._from_... switches. You can perform manual selection of functions to inline (analyze) with the -inline_manual ( -ipa_manual ) switches and the inline (IPA) pragmas.
If you specify -inline ( -ipa ) on the command line without a list of function names, then all functions in the inlining (IPA) universe are eligible, subject to the -inline_looplevel ( -ipa_looplevel ) value.
If you specify -inline ( -ipa ) on the command line with a list of function names, then only the functions that are included in the list are eligible, subject to the -inline_looplevel ( -ipa_looplevel ) and -inline_depth values.
The following switches have no versions, but they must have arguments, as follows:
-noinline=name[,name...] [-ninl=]
-noipa=name[,name...] [-nipa=]
These switches enable the automatic inlining (IPA) algorithms in the same way that -inline ( -ipa ) does when given without arguments, but the functions listed are the ones that are NOT to be inlined (analyzed). That is, all the functions except the named ones are eligible.
You cannot specify both -inline and -noinline ( -ipa and -noipa ) on the same command line.
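The name-based selection rules above can be summarized in a small sketch. It models only the name lists; it assumes the function is already in the inlining universe and ignores the looplevel and depth criteria. The names select_mode and is_name_eligible are hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical selection modes corresponding to the switches above. */
enum select_mode {
    SELECT_ALL,     /* -inline with no name list */
    SELECT_ONLY,    /* -inline=name[,name...]    */
    SELECT_EXCEPT   /* -noinline=name[,name...]  */
};

/* Return nonzero if name appears in the list of names. */
static int in_list(const char *const *names, size_t count, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(names[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Return nonzero if name passes the name-based selection. */
int is_name_eligible(enum select_mode mode, const char *const *names,
                     size_t count, const char *name)
{
    switch (mode) {
    case SELECT_ALL:
        return 1;
    case SELECT_ONLY:
        return in_list(names, count, name);
    case SELECT_EXCEPT:
        return !in_list(names, count, name);
    }
    return 0;
}
```

The same logic applies to -ipa and -noipa.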
If all call sites of a function are to be inlined, the following variant of the -inline switch may be of interest:
-inline_and_copy[=name[,name...]] [-inlc=]
The -inline_and_copy command-line switch functions like the -inline switch, except that if all references to a function are inlined, the inlined function is copied to the transformed code file unchanged.
When a function has been inlined everywhere it is used, not optimizing it saves compilation time and deleting its text saves memory. These switches are intended for use when the functions being inlined are in the same file as the function reference, and have no special effect when the functions being inlined are being taken from a library or another source file.
These switches assume that all references to a function to be inlined precede it in the source file, and that the file being processed will not be combined or linked with files containing references to the inlined functions. With -inline_and_copy , later references in the same file will either be inlined or execute the unoptimized function; references in other files will execute the unoptimized function.
The switch -inline_depth=<n> [-ind] sets the maximum level of nested function calls that KAP will attempt to inline. Recursive inlining here means inlining calls to functions that themselves contain calls to functions, and so forth.
The parameter values and their meanings are as follows:
There is no corresponding -ipa_depth switch. IPA always looks at the
called function, and only at the called function.
6.2.5 For-Loop Level
The following switches set a minimum for-loop nest level for function call expansion. The -inline_looplevel and -ipa_looplevel switches enable you to limit inlining and IPA to just functions that are referenced in nested loops, where the reduced function call overhead or enhanced optimization will be multiplied:
-inline_looplevel=<n> [-inll]
-ipa_looplevel=<n> [-ipall]
The argument is defined from the most deeply nested leaf of the call tree. A small value restricts inlining (IPA) to the best candidate functions, for example:
    main {
      ..
      a();  ------> a() {...}
      ..
      for (..) {
        for (..) {
          b();  ---------> b() {...}
          for (..) {
            for (..) {
              c();  -------> c() {...}
            }
          }
        }
      }
    }
The call to b is inside a doubly nested loop, and would be more profitable to expand than the call to a . The call to c is quadruply nested, so inlining c would yield the biggest gain of the three.
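To make the multiplication concrete, the following sketch counts how often each call site in a tree of this shape executes when every loop runs n times. The counters and the function run_call_tree are hypothetical additions for illustration:

```c
/* Hypothetical call counters for the three call sites. */
static long calls_a, calls_b, calls_c;

static void a(void) { calls_a++; }
static void b(void) { calls_b++; }
static void c(void) { calls_c++; }

/* Same shape as the call tree above, with every loop running n times. */
void run_call_tree(int n)
{
    int i, j, k, l;

    calls_a = calls_b = calls_c = 0;
    a();                                /* not in a loop: 1 call */
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            b();                        /* doubly nested: n*n calls */
            for (k = 0; k < n; k++) {
                for (l = 0; l < n; l++) {
                    c();                /* quadruply nested: n*n*n*n calls */
                }
            }
        }
    }
}
```

With n equal to 3, a is called once, b 9 times, and c 81 times, so removing the call overhead of c pays off far more often than removing that of a.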
The argument is defined from the most deeply nested function reference:
The following switches cause KAP to recognize the #pragma _KAP [no]inline and #pragma _KAP [no]ipa directives. These switches allow manual control over which functions are inlined/analyzed at which call sites. (See Section 6.3.)
-inline_manual [-inm]
-ipa_manual [-ipam]
The default is to ignore these pragmas. They are enabled when any of
the inlining or IPA command-line switches, respectively, are specified.
The -inline_manual and -ipa_manual switches permit you to enable the
directives without performing other inlining.
6.3 Inlining Pragmas
The inline and IPA pragmas tell KAP to inline/IPA the named functions.
#pragma _KAP [no]inline [here|routine|global] [(name[,name...])]
#pragma _KAP [no]ipa [here|routine|global] [(name[,name...])]
The noinline and noipa pragmas tell KAP to not inline/analyze the named functions. These pragmas combine next-statement, entire routine (function), and global (entire program file) scope. If none of the optional elements are included, all functions referenced in the next statement that are in the inlining/analyzing universe are inlined/analyzed on that line.
These pragmas are disabled by default. You can enable them by specifying any of the inlining (IPA) command-line switches. Also, you can enable them without enabling any other inlining/IPA with the -inline_manual ( -ipa_manual ) command-line switch. They are otherwise independent of the other inlining (IPA) command-line switches, and can be used instead of, or in addition to, command-line controlled inlining and IPA.
The keywords including the word pragma must be lowercase. On some systems, the function names are case sensitive.
The effects of scope keywords on pragmas are as follows:
The optional names are function names. If any functions are named in the directive, it applies only to them. If NO function names are given, the pragma applies to ALL functions. The parentheses around the function names are not required if the list of function names is empty.
If a #pragma _KAP inline or #pragma _KAP ipa names a function not in
the inlining or IPA universe, a warning message is issued, and the
pragma is ignored.
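The following sketch shows where such pragmas might be placed, using the syntax given above. It assumes that the here keyword limits the pragma to the next statement; the functions square, cube, and sum_of_powers are hypothetical. Compilers that do not recognize the _KAP pragmas ignore them (possibly with a warning), so the code behaves the same with or without KAP.

```c
/* Hypothetical functions used to illustrate pragma placement. */
static int square(int x) { return x * x; }
static int cube(int x)   { return x * x * x; }

int sum_of_powers(int n)
{
    int i;
    int total = 0;

    for (i = 1; i <= n; i++) {
/* Request inlining of square in the next statement. */
#pragma _KAP inline here (square)
        total += square(i);
/* Request that cube not be inlined in the next statement. */
#pragma _KAP noinline here (cube)
        total += cube(i);
    }
    return total;
}
```

For KAP to act on these pragmas, they must be enabled, for example with -inline_manual .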
6.4 Listing File Support
The optional calling tree and loop tables include the loop nest depth
level of each for loop. (See Chapter 8 for examples.) This information
can be used to determine the nest level for function calls for setting
-inline_looplevel or -ipa_looplevel.
6.5 Inlining/IPA Examples
The following code examples demonstrate a few of the possibilities for using the features described in this chapter. Because KAP undergoes constant enhancement, the code that your version of KAP produces may not be identical to that of these examples. The temporary variable names, in particular, can change without significantly altering the transformed code.
Unless otherwise noted, the following examples were run with -o=0 and -so=0 to show the inlining more clearly. If nonzero values are specified, the functions are first inlined or analyzed, and then the concurrent and scalar transformations are applied.
In some cases, C preprocessor additions or modifications to the code
were removed to clarify the example outputs.
6.5.1 Inlining Example
The following example demonstrates inlining with -inline=setup , where only the function setup will be inlined, and with -inline , where both functions are inlined. The KAP output includes optimized versions of both functions, in addition to the expanded main program, as follows:
Source file (before the C preprocessor):

    #include <math.h>
    #include <stdio.h>
    #define SIZE 200

    main ()
    {
        int i,n;
        double a[SIZE][SIZE], b[SIZE][SIZE], c[SIZE][SIZE];
        double cksum, matm();

        setup(b,SIZE);
        setup(c,SIZE);
        for (n=25; n<=SIZE; n=n+25) {
            cksum = matm(n,a,b,c);
            printf("For N= %d checksum= %g \n", n, cksum);
        }
    }

    setup (e,n)
    double e[SIZE][SIZE];
    int n;
    {
        int i,j;

        for(i=0; i<n; i++) {
            for (j=0; j<n; j++)
                e[i][j] = ( (i+ 7*j) % 10 )/10.0;
        }
        return;
    }

    double matm (n,a,b,c)
    int n;
    double a[SIZE][SIZE], b[SIZE][SIZE], c[SIZE][SIZE];
    {
        int i,j,k;

        for (i=0; i<n; i++)
            for (j=0; j<n; j++) {
                a[i][j] = 0.0 ;
                for (k=0; k<n; k++)
                    a[i][j] = a[i][j] + b[i][k]*c[k][j];
            }
        return (a[3][5]);
    }
The main function generated by -inline=setup is as follows:
    int main( )
    {
        int i;
        int n;
        double a[200][200];
        double b[200][200];
        double c[200][200];
        double cksum;
        double matm( );
        int _Kii3;
        int j;
        int _Kii6;
        int _Kii7;

        for ( _Kii3 = 0; _Kii3<=199; _Kii3++ ) {
            for ( j = 0; j<=199; j++ ) {
                b[_Kii3][j] = ((_Kii3 + j * 7) % 10) / 10.0;
            }
        }
        for ( _Kii6 = 0; _Kii6<=199; _Kii6++ ) {
            for ( _Kii7 = 0; _Kii7<=199; _Kii7++ ) {
                c[_Kii6][_Kii7] = ((_Kii6 + _Kii7 * 7) % 10) / 10.0;
            }
        }
        for ( n = 25; n<=200; n+=25 ) {
            cksum = matm( n, a, b, c );
            printf( "For N= %d checksum= %g \n", n, cksum );
        }
    }
The main function generated by -inline is as follows:
    int main( )
    {
        int i;
        int n;
        double a[200][200];
        double b[200][200];
        double c[200][200];
        double cksum;
        double matm( );
        double _Kaa1;
        int _Kii2;
        int j;
        int k;
        int _Kii5;
        int _Kii6;
        int _Kii9;
        int _Kii10;

        for ( _Kii5 = 0; _Kii5<=199; _Kii5++ ) {
            for ( _Kii6 = 0; _Kii6<=199; _Kii6++ ) {
                b[_Kii5][_Kii6] = ((_Kii5 + _Kii6 * 7) % 10) / 10.0;
            }
        }
        for ( _Kii9 = 0; _Kii9<=199; _Kii9++ ) {
            for ( _Kii10 = 0; _Kii10<=199; _Kii10++ ) {
                c[_Kii9][_Kii10] = ((_Kii9 + _Kii10 * 7) % 10) / 10.0;
            }
        }
        for ( n = 25; n<=200; n+=25 ) {
            for ( _Kii2 = 0; _Kii2<n; _Kii2++ ) {
                for ( j = 0; j<n; j++ ) {
                    a[_Kii2][j] = 0.0;
                    for ( k = 0; k<n; k++ ) {
                        a[_Kii2][j] += b[_Kii2][k] * c[k][j];
                    }
                }
            }
            _Kaa1 = a[3][5];
            cksum = _Kaa1;
            printf( "For N= %d checksum= %g \n", n, cksum );
        }
    }
In the following example, the variables n and np1 have a simple relationship. This relationship is hidden behind a function call, however, so KAP normally does not try to concurrentize the loop in the main program. When the -ipa=rxgfs command-line switch is specified, KAP inspects the named function for information on the relationship of its arguments and returned value and the surrounding code. The assumed dependence is lifted and the loop can be safely concurrentized. If a function cannot be inlined, or if you do not want to inline it, it can often still be analyzed for its effects on the calling function.
The following example was run with the default values for -optimize and -scalaropt .
    main()
    {
        int np1, i, m, n;
        int a[100][100];

        np1 = rxgfs( n );
        for ( i=0; i<m; i++ ) {
            a[i][n] = a[i-1][np1];
        }
    }

    int rxgfs( n )
    int n;
    {
        return (n+1);
    }
Becomes:
    int main( )
    {
        int np1;
        int i;
        int m;
        int n;
        int a[100][100];

        np1 = rxgfs ( n ) ;
        {
            for ( i = 0; i<m; i++ ) {
                a[i][n] = a[i-1][np1];
            }
        }
    }
The subfunction was not shown.
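Why the relationship between n and np1 matters can be checked with a small experiment. The sketch below is a hypothetical test harness, not KAP output: it runs the loop body forward and in reverse and compares the results. It starts the loop at i = 1 rather than 0 so that a[i-1] stays inside the array, which is an adjustment to the example above. Agreement under reversal is only an illustration of order independence, not a proof that a loop is safe to run concurrently.

```c
#include <assert.h>
#include <string.h>

#define ROWS 100
#define COLS 100

/* Fill the array with distinct, reproducible values. */
static void fill(int a[ROWS][COLS])
{
    int i, j;

    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) {
            a[i][j] = i * COLS + j;
        }
    }
}

/* Run the loop body for i = 1..m-1, forward or in reverse order.
   Starting at 1 (not 0) keeps a[i-1] inside the array. */
static void run_loop(int a[ROWS][COLS], int m, int n, int np1, int reverse)
{
    int i;

    if (!reverse) {
        for (i = 1; i < m; i++) a[i][n] = a[i - 1][np1];
    } else {
        for (i = m - 1; i >= 1; i--) a[i][n] = a[i - 1][np1];
    }
}

/* Return nonzero if forward and reverse execution give the same array. */
int order_independent(int m, int n, int np1)
{
    static int fwd[ROWS][COLS], rev[ROWS][COLS];

    assert(m >= 0 && m <= ROWS);
    assert(n >= 0 && n < COLS && np1 >= 0 && np1 < COLS);
    fill(fwd);
    fill(rev);
    run_loop(fwd, m, n, np1, 0);
    run_loop(rev, m, n, np1, 1);
    return memcmp(fwd, rev, sizeof fwd) == 0;
}
```

When np1 equals n + 1, the loop writes column n and reads only column n + 1, so the iteration order does not matter. If np1 could equal n, each iteration would read a value written by the previous one, which is the dependence KAP must assume when it cannot see inside rxgfs.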
6.5.3 Recursive Inlining Examples
The -inline_depth command-line switch sets the maximum level of function nesting, that is, calls to functions with calls to functions and so forth, that KAP will attempt to trace. Higher values cause KAP to recursively inline more deeply.
When run with -inline_depth=1 , meaning inline only one function deep,
all the function references in the main program and functions are
expanded, but function references in inlined functions are not.
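A hand-written sketch of what this means for a two-level call chain follows. The functions inner, outer, and the top_* variants are hypothetical, and the code is written by hand rather than generated by KAP. The last variant is assumed to correspond to a larger -inline_depth value.

```c
/* Hypothetical two-level call chain: top -> outer -> inner. */
static int inner(int x) { return x + 1; }
static int outer(int x) { return 2 * inner(x); }

/* Original: the reference to outer() appears in the top-level function. */
int top_original(int x)
{
    return outer(x) + 3;
}

/* Expanded one function deep: outer() is replaced by its text, but the
   reference to inner() that arrived with that text remains a call. */
int top_depth1(int x)
{
    return 2 * inner(x) + 3;
}

/* Expanded one level further: inner() is replaced as well. */
int top_deeper(int x)
{
    return 2 * (x + 1) + 3;
}
```

All three variants return the same value; they differ only in how many calls remain.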
6.6 Additional Information on Inlining and IPA
Functions to be inlined must pass all the criteria ( inline, inline_depth, inline_looplevel ) to be inlined. The following directives are exceptions to this rule.
The #pragma _KAP [no]inline and #pragma _KAP [no]ipa directives, when enabled, override the -inline and -ipa command-line switches, respectively.
A #pragma _KAP inline global directive without a function name list tells KAP to inline every function possible, regardless of the inline , inline_depth , and inline_looplevel settings. A #pragma _KAP noinline global directive tells KAP not to inline anything, regardless of the inline , inline_depth , and inline_looplevel settings.
When a library is specified with -inline_from_libraries , functions may be taken from that library for inlining into the source code. No attempt is made to inline functions from the source file into functions from the library. For example, if the main program calls function bb , which is in the library, and bb calls function dd , which is in the source file, then bb can be inlined into the main program, but KAP will not attempt to inline dd into the text from library function bb .
A library created with -inline_create will work for inlining or IPA because it is just partially reduced source code. However, a library created with -ipa_create may not appear in an -inline_from_libraries= list; it is flagged with a Warning message.
Inlining and IPA are slow, memory-intensive activities. Specifying
-inline_depth=10 -inline_looplevel=big , that is, inline all available
functions everywhere they are used, for a large set of inlinable
functions for a large source file can absorb significant system
resources. For most programs, specifying small values for
-inline_depth and -inline_looplevel and/or a small number of functions
with -inline= can provide most of the benefits of inlining. The same
applies for the corresponding -ipa command-line switches.
6.7 Inhibiting Inlining
The following list shows the conditions that inhibit the inlining of functions, whether from a library or source file:
See Section 6.6 for information on the use of the inlining command-line switches and pragmas.