HP OpenVMS Systems Documentation

HP TCP/IP Services for OpenVMS
ONC RPC Programming



Chapter 4
External Data Representation

This chapter describes the external data representation (XDR) standard, a set of routines that enable C programmers to describe arbitrary data structures in a system-independent way. For a formal specification of the XDR standard, see RFC 1014: XDR: External Data Representation Standard.

XDR is the backbone of ONC RPC, because data for remote procedure calls is transmitted using the XDR standard. ONC RPC uses the XDR routines to transmit data that is read or written from several types of systems. For a complete specification of the XDR routines, see Chapter 8.

This chapter also contains a short tutorial overview of the XDR routines, a guide to accessing currently available XDR streams, and information on defining new streams and data types.

XDR was designed to work across different languages, operating systems, and computer architectures. Most users (particularly RPC users) only need the information on number filters (Section 4.2.1), floating-point filters (Section 4.2.2), and enumeration filters (Section 4.2.3). Programmers who want to implement RPC and XDR on new systems should read the rest of the chapter.

Note

You can use RPCGEN to write XDR routines regardless of whether RPC calls are being made.

C programs that need XDR routines must include the file <rpc/rpc.h>, which contains all necessary interfaces to the XDR system. The object library contains all the XDR routines, so you can link as you usually would when using a library. If you wish to use a shareable version of the library, reference the library SYS$SHARE:TCPIP$RPCXDR_SHR in your LINK options file.

4.1 Usefulness of XDR

Consider the following two programs, writer.c and reader.c:


#include <stdio.h>
#include <stdlib.h>     /* exit */

int main()              /* writer.c */
{
     long i;

     for (i = 0; i < 8; i++) {
          if (fwrite((char *)&i, sizeof(i), 1, stdout) != 1) {
               fprintf(stderr, "failed!\n");
               exit(1);
          }
     }
     exit(0);
}

#include <stdio.h>
#include <stdlib.h>     /* exit */

int main()              /* reader.c */
{
     long i, j;

     for (j = 0; j < 8; j++) {
          if (fread((char *)&i, sizeof (i), 1, stdin) != 1) {
               fprintf(stderr, "failed!\n");
               exit(1);
          }
          printf("%ld ", i);
     }
     printf("\n");
     exit(0);
}

The two programs appear to be portable because:

  • They pass lint checking.
  • They work the same when executed on two different hardware architectures: Sun Microsystems' SPARC architecture and HP's OpenVMS Alpha or I64 architecture.

Piping the output of the writer.c program to the reader.c program gives identical results on an Alpha computer and on a Sun computer, as shown:


sun% writer | reader
0 1 2 3 4 5 6 7
sun%

$ writer | reader
0 1 2 3 4 5 6 7
$

With local area networks and Berkeley UNIX 4.2 BSD came the concept of network pipes, in which a process produces data on one system, and a second process on another system uses this data. You can construct a network pipe with writer.c and reader.c. Here, the first process (on a Sun computer) produces data used by a second process (on an HP Alpha computer):


sun% writer | rsh alpha reader
0 16777216 33554432 50331648 67108864 83886080 100663296 117440512
sun%

You get identical results by executing writer.c on the HP Alpha computer and reader.c on the Sun computer. These results occur because the byte ordering of long integers differs between the Alpha computer and the Sun computer, although the word size is the same. Note that 16777216 is equal to 2^24. When the 4 bytes are reversed, the 1 moves from bit 0 to bit 24.
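The arithmetic behind this output can be checked with a short sketch. The helper name swap32 is hypothetical and is not part of the XDR library; the sketch assumes a 4-byte integer, as on the two systems compared above.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value.
 * Hypothetical helper for illustration; not an XDR routine. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}
```

Under that assumption, swap32(1) is 16777216 and swap32(7) is 117440512, which are the second and last values printed by the network pipe.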

Whenever data is shared by two or more system types, there is a need for portable data. You can make programs data-portable by replacing the read and write calls with calls to an XDR library routine xdr_long, which is a filter that recognizes the standard representation of a long integer in its external form. Here are the revised versions of writer.c and reader.c:


/*        Revised Version of writer.c       */


#include <stdio.h>
#include <stdlib.h>     /* exit */
#include <rpc/rpc.h>    /* XDR is a sub-library of RPC */

int main()      /* writer.c */
{
     XDR xdrs;
     long i;
     xdrstdio_create(&xdrs, stdout, XDR_ENCODE);
     for (i = 0; i < 8; i++) {
          if (!xdr_long(&xdrs, &i)) {
               fprintf(stderr, "failed!\n");
               exit(1);
          }
     }
     exit(0);
}

/*        Revised Version of reader.c      */

#include <stdio.h>
#include <stdlib.h>     /* exit */
#include <rpc/rpc.h>    /* XDR is a sub-library of RPC */

int main()      /* reader.c */
{
     XDR xdrs;
     long i, j;
     xdrstdio_create(&xdrs, stdin, XDR_DECODE);
     for (j = 0; j < 8; j++) {
          if (!xdr_long(&xdrs, &i)) {
               fprintf(stderr, "failed!\n");
               exit(1);
          }
          printf("%ld ", i);
     }
     printf("\n");
     exit(0);
}

The new programs were executed on an Alpha computer, a Sun computer, and from a Sun computer to an Alpha computer; the results are as follows:


sun% writer | reader
0 1 2 3 4 5 6 7
sun%

$ writer | reader
0 1 2 3 4 5 6 7
$

sun% writer | rsh alpha reader
0 1 2 3 4 5 6 7
sun%

Note

Arbitrary data structures create portability problems, particularly with alignment and pointers:
  • Alignment on word boundaries may cause the size of a structure to vary on different systems.
  • A pointer has no meaning outside the system where it is defined.

4.1.1 A Canonical Standard

The XDR approach to standardizing data representations is canonical, because XDR defines a single byte order (big-endian), a single floating-point representation (IEEE), and so on. A program running on any system can use XDR to create portable data by translating its local representation to the XDR standard. Similarly, any such program can read portable data by translating the XDR standard representation to the local equivalent.

The single standard treats separately those programs that create or send portable data and those that use or receive the data. A new system or language has no effect on existing portable data creators and users. Any new system simply uses the canonical standards of XDR; the local representations of other systems are irrelevant. To existing programs on other systems, the local representations of the new system are also irrelevant. There are strong precedents for the canonical approach of XDR. For example, TCP/IP, UDP/IP, XNS, Ethernet, and all protocols below layer 5 of the ISO model are canonical protocols. The advantage of any canonical approach is simplicity; in the case of XDR, a single set of conversion routines is written once.

The canonical approach does have one disadvantage of little practical importance. Suppose two little-endian systems transfer integers according to the XDR standard. The sending system converts the integers from little-endian byte order to XDR (big-endian) byte order, and the receiving system does the reverse. Because both systems observe the same byte order, the conversions were really unnecessary. Fortunately, the time spent converting to and from a canonical representation is insignificant, especially in networking applications. Most of the time required to prepare a data structure for transfer is not spent in conversion but in traversing the elements of the data structure.
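As a minimal sketch of what converting to and from the canonical byte order involves, the two routines below write and read a 32-bit value most significant byte first, whatever the host byte order. The names put_be32 and get_be32 are hypothetical; the XDR library performs this work internally.

```c
#include <stdint.h>

/* Store v in buf[0..3] in big-endian (XDR) order, independent of host order. */
void put_be32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Rebuild the host value from four big-endian bytes. */
uint32_t get_be32(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

Because the routines use shifts rather than memory layout, the same source produces the same bytes on little-endian and big-endian systems; the cost is a handful of shift operations per integer.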

4.1.2 The XDR Library

The XDR library enables you to write and read arbitrary C constructs consistently. This makes it useful even when the data is not shared among systems on a network. The XDR library can do this because it has filter routines for strings (null-terminated arrays of bytes), structures, unions, and arrays. Using more primitive routines, you can write your own specific XDR routines to describe arbitrary data structures, including elements of arrays, arms of unions, or objects pointed at from other structures. The structures themselves may contain arrays of arbitrary elements, or pointers to other structures.

The previous writer.c and reader.c routines manipulate data by using standard I/O routines, so xdrstdio_create was used. The parameters to XDR stream creation routines vary according to their function. For example, xdrstdio_create takes the following parameters:

  • A pointer to an XDR structure that it initializes
  • A pointer to a FILE that the input or output acts upon
  • The operation: either XDR_ENCODE for serializing in writer.c or XDR_DECODE for deserializing in reader.c

It is not necessary for RPC users to create XDR streams; the RPC system itself can create these streams and pass them to the users. There is a family of XDR stream creation routines in which each member treats the stream of bits differently.

The xdr_long primitive is characteristic of most XDR library primitives and all client XDR routines in two ways:

  • The routine returns FALSE (0) if it fails and TRUE (1) if it succeeds.
  • For each data type xxx, there is an associated XDR routine of the following form:


    xdr_xxx(xdrs, xp)
         XDR *xdrs;
         xxx *xp;
    {
    }
    

In this case, xxx is long, and the corresponding XDR routine is a primitive, xdr_long. The client could also define an arbitrary structure xxx; in this case, the client would also supply the routine xdr_xxx, describing each field by calling XDR routines of the appropriate type. In all cases, the first parameter, xdrs, is treated as an opaque handle and passed to the primitive routines.

XDR routines are direction independent; that is, the same routines are called to serialize or deserialize data. This feature is important for portable data. Calling the same routine for either operation practically guarantees that serialized data can also be deserialized. Thus, one routine is used by both the producer and the consumer of networked data.

You implement direction independence by passing a pointer to an object rather than the object itself (only with deserialization is the object modified). If needed, the user can obtain the direction of the XDR operation. See Section 4.3 for details.
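Direction independence can be illustrated with a toy stream, sketched here under stated assumptions. The names toy_op, toy_stream, toy_long, toy_gnumbers, and gnumbers32 are all hypothetical, and the code is not the library's implementation; it uses int32_t rather than long so that the size is fixed. The point is that one routine, taking a pointer, serves both directions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum toy_op { TOY_ENCODE, TOY_DECODE };  /* stand-ins for XDR_ENCODE/XDR_DECODE */

struct toy_stream {
    enum toy_op    op;    /* direction of this stream */
    unsigned char *buf;   /* backing memory */
    size_t         size;  /* capacity of buf in bytes */
    size_t         pos;   /* offset of the next byte */
};

/* One filter for both directions; returns 1 on success, 0 on failure. */
int toy_long(struct toy_stream *s, int32_t *lp)
{
    unsigned char *p;
    uint32_t u;

    if (s->size - s->pos < 4)
        return 0;                        /* no room left in the stream */
    p = s->buf + s->pos;
    if (s->op == TOY_ENCODE) {
        memcpy(&u, lp, sizeof u);        /* object is read, not modified */
        p[0] = (unsigned char)(u >> 24);
        p[1] = (unsigned char)(u >> 16);
        p[2] = (unsigned char)(u >> 8);
        p[3] = (unsigned char)u;
    } else {
        u = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
            ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
        memcpy(lp, &u, sizeof u);        /* object is modified only here */
    }
    s->pos += 4;
    return 1;
}

struct gnumbers32 {
    int32_t g_assets;
    int32_t g_liabilities;
};

/* A structure filter is built from field filters and inherits
 * their direction independence. */
int toy_gnumbers(struct toy_stream *s, struct gnumbers32 *gp)
{
    return toy_long(s, &gp->g_assets) && toy_long(s, &gp->g_liabilities);
}
```

Calling toy_gnumbers on a stream opened with TOY_ENCODE fills the buffer; calling the same routine on a TOY_DECODE stream over that buffer recovers the fields, so the producer and the consumer cannot drift apart.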

For a more complicated example, assume that a person's gross assets and liabilities are to be exchanged among processes, and each is a separate data type:


struct gnumbers {
     long g_assets;
     long g_liabilities;
};

The corresponding XDR routine describing this structure would be as follows:


bool_t                  /* TRUE is success, FALSE is failure */
xdr_gnumbers(xdrs, gp)
     XDR *xdrs;
     struct gnumbers *gp;
{
     if (xdr_long(xdrs, &gp->g_assets) &&
       xdr_long(xdrs, &gp->g_liabilities))
          return(TRUE);
     return(FALSE);
}

In the preceding example, the parameter xdrs is never inspected or modified; it is only passed to subcomponent routines. The program must inspect the return value of each XDR routine call and stop immediately and return FALSE upon subroutine failure.

The preceding example also shows that the type bool_t is declared as an integer whose only values are TRUE (1) and FALSE (0). The following definitions apply:


#define bool_t  int
#define TRUE    1
#define FALSE   0

With these conventions, you can rewrite xdr_gnumbers as follows:


bool_t
xdr_gnumbers(xdrs, gp)
     XDR *xdrs;
     struct gnumbers *gp;
{
     return(xdr_long(xdrs, &gp->g_assets) &&
       xdr_long(xdrs, &gp->g_liabilities));
}

Either coding style can be used.

4.2 XDR Library Primitives

The following sections describe the XDR primitives (basic and constructed data types) and XDR utilities. The include file <rpc/xdr.h> (automatically included by <rpc/rpc.h>) defines the interface to these primitives and utilities.

4.2.1 Number and Single-Character Filters

The XDR library provides primitives that translate between numbers and single characters and their corresponding external representations. Primitives include the set of numbers in:


[signed, unsigned] * [char, short, int, long, hyper]

Specifically, the ten primitives are:


bool_t xdr_char(xdrs, cp)
     XDR *xdrs;
     char *cp;

bool_t xdr_u_char(xdrs, ucp)
     XDR *xdrs;
     unsigned char *ucp;

bool_t xdr_short(xdrs, sip)
     XDR *xdrs;
     short *sip;

bool_t xdr_u_short(xdrs, sup)
     XDR *xdrs;
     u_short *sup;

bool_t xdr_int(xdrs, ip)
     XDR *xdrs;
     int *ip;

bool_t xdr_u_int(xdrs, up)
     XDR *xdrs;
     unsigned *up;

bool_t xdr_long(xdrs, lip)
     XDR *xdrs;
     long *lip;

bool_t xdr_u_long(xdrs, lup)
     XDR *xdrs;
     u_long *lup;

bool_t xdr_hyper(xdrs, hp)
     XDR *xdrs;
     longlong_t *hp;

bool_t xdr_u_hyper(xdrs, uhp)
     XDR *xdrs;
     u_longlong_t *uhp;

The first parameter, xdrs, is a pointer to an XDR stream handle. The second parameter is a pointer to the number that provides data to the stream or receives data from it. All routines return TRUE if they complete successfully and FALSE if they do not.

For more information on number filters, see Chapter 8.
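The external form of these numbers can be sketched as follows. RFC 1014 encodes every item in a multiple of four bytes, so this sketch assumes that a short travels as a full 4-byte integer and a hyper as 8 bytes, most significant byte first. The names encode_short and encode_hyper are hypothetical and are not library routines.

```c
#include <stdint.h>

/* Encode a short as one 4-byte big-endian unit, sign-extended. */
void encode_short(unsigned char out[4], short v)
{
    uint32_t u = (uint32_t)(int32_t)v;   /* sign-extend to 32 bits */

    out[0] = (unsigned char)(u >> 24);
    out[1] = (unsigned char)(u >> 16);
    out[2] = (unsigned char)(u >> 8);
    out[3] = (unsigned char)u;
}

/* Encode a 64-bit hyper as 8 bytes, most significant byte first. */
void encode_hyper(unsigned char out[8], int64_t v)
{
    uint64_t u = (uint64_t)v;
    int i;

    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)(u >> (56 - 8 * i));
}
```

With these assumptions, the short value -2 becomes the bytes FF FF FF FE, regardless of how wide or in what order the sending system stores a short.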

4.2.2 Floating-Point Filters

The XDR library also provides primitive routines for floating-point types in C:


bool_t xdr_float(xdrs, fp)
     XDR *xdrs;
     float *fp;

bool_t xdr_double(xdrs, dp)
     XDR *xdrs;
     double *dp;

The first parameter, xdrs, is a pointer to an XDR stream handle. The second parameter is a pointer to the floating-point number that provides data to the stream or receives data from it. Both routines return TRUE if they complete successfully and FALSE if they do not.

Note

Because the numbers are represented in IEEE floating-point format over the network, routines may fail when decoding a valid IEEE representation into a system-specific representation, or vice versa.

To control the local representation of floating-point numbers, you can choose the floating-point type when you compile your RPC program or you can use different XDR routines to explicitly control the local representation. For more information about floating-point filters, see the xdr_double and xdr_float routines in Chapter 8.
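The following sketch shows the external form of a single-precision number on a system whose local float is already IEEE 754 single precision; that is an assumption of the sketch, and the name encode_float is hypothetical. A system with a different local format needs a real conversion step, which is omitted here and is where the failures mentioned in the preceding note can arise.

```c
#include <stdint.h>
#include <string.h>

/* Write f as 4 big-endian bytes.
 * ASSUMPTION: the local float is IEEE 754 single precision.
 * Returns 1 on success, 0 if float is not 4 bytes wide. */
int encode_float(unsigned char out[4], float f)
{
    uint32_t bits;

    if (sizeof(float) != sizeof bits)
        return 0;
    memcpy(&bits, &f, sizeof bits);      /* copy the bit pattern unchanged */
    out[0] = (unsigned char)(bits >> 24);
    out[1] = (unsigned char)(bits >> 16);
    out[2] = (unsigned char)(bits >> 8);
    out[3] = (unsigned char)bits;
    return 1;
}
```

On such a system, 1.0 is encoded as the bytes 3F 80 00 00.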

4.2.3 Enumeration Filters

The XDR library provides a primitive for generic enumerations; it assumes that a C enum has the same representation inside the system as a C integer. The bool_t (boolean) type is an important instance of the enum type. The external representation of a bool_t type is always TRUE (1) or FALSE (0), as shown here:


#define bool_t  int
#define FALSE   0
#define TRUE    1
#define enum_t int

bool_t xdr_enum(xdrs, ep)
     XDR *xdrs;
     enum_t *ep;

bool_t xdr_bool(xdrs, bp)
     XDR *xdrs;
     bool_t *bp;

The second parameters, ep and bp, are pointers to the enumerations or booleans that provide data to or receive data from the stream xdrs.

For more information about enumeration filters, see Chapter 8.
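As a sketch of these external forms, the routines below encode an enumeration as a 4-byte integer and a boolean as exactly 1 or 0. The names color, store_be32, encode_enum, and encode_bool are hypothetical, and normalizing any nonzero local value to 1 on encoding is an assumption of this sketch rather than a documented guarantee.

```c
#include <stdint.h>

enum color { RED = 0, GREEN = 1, BLUE = 2 };   /* example enumeration */

/* Store a 32-bit value most significant byte first. */
static void store_be32(unsigned char out[4], uint32_t u)
{
    out[0] = (unsigned char)(u >> 24);
    out[1] = (unsigned char)(u >> 16);
    out[2] = (unsigned char)(u >> 8);
    out[3] = (unsigned char)u;
}

/* An enumeration is sent as the integer value of the enumerator. */
void encode_enum(unsigned char out[4], int value)
{
    store_be32(out, (uint32_t)(int32_t)value);
}

/* A boolean is sent as 1 or 0, whatever nonzero value is held locally. */
void encode_bool(unsigned char out[4], int b)
{
    store_be32(out, b ? 1u : 0u);
}
```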

4.2.4 Possibility of No Data

Occasionally, an XDR routine must be supplied to the RPC system, even when no data is passed or required. The following routine does this:


bool_t xdr_void();  /* always returns TRUE */

4.2.5 Constructed Data Type Filters

Constructed or compound data type primitives require more parameters and perform more complicated functions than the primitives previously discussed. The following sections include primitives for strings, arrays, unions, and pointers to structures.

Constructed data type primitives may use memory management. In many cases, memory is allocated when deserializing data with XDR_DECODE. XDR enables memory deallocation through the XDR_FREE operation. The three XDR directional operations are XDR_ENCODE, XDR_DECODE, and XDR_FREE.

For more information about constructed data filters, see Chapter 8.

4.2.5.1 Strings

In C, a string is defined as a sequence of bytes terminated by a null byte, which is not considered when calculating string length. When a string is passed or manipulated, there must be a pointer to it. Therefore, the XDR library defines a string to be a char *, not a sequence of characters. The external and internal representations of a string are different. Externally, strings are represented as sequences of ASCII characters; internally, with character pointers. The xdr_string routine converts between the two, as follows:


bool_t xdr_string(xdrs, sp, maxlength)
     XDR *xdrs;
     char **sp;
     u_int maxlength;

The first parameter, xdrs, is the XDR stream handle; the second, sp, is a pointer to a string (type char **). The third parameter, maxlength, specifies the maximum number of bytes allowed during encoding or decoding; its value is usually specified by a protocol. For example, a protocol may specify that a file name cannot be longer than 255 characters. Keep maxlength small because overflow conditions may occur if xdr_string has to call malloc for space. The routine returns FALSE if the number of characters exceeds maxlength; otherwise, it returns TRUE.

The behavior of xdr_string is similar to that of other routines in this section. For the direction XDR_ENCODE, the parameter sp points to a string of a certain length; if the string does not exceed maxlength, the bytes are serialized.

For the direction XDR_DECODE, the effect of deserializing a string is subtle. First, the length of the incoming string is determined; it must not exceed maxlength. Next, sp is dereferenced; if the value is NULL, then a string of the appropriate length is allocated and *sp is set to this string. If the original value of *sp is not NULL, then XDR assumes that a target area (which can hold strings no longer than maxlength) has been allocated. In either case, the string is decoded into the target area, and the routine appends a null character to it.

In the XDR_FREE operation, the string is obtained by dereferencing sp. If the string is not NULL, it is freed and *sp is set to NULL. In this operation, xdr_string ignores the maxlength parameter.
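The three operations can be sketched over a plain memory buffer. The layout used here follows RFC 1014: a 4-byte length, the bytes of the string, then zero padding to a multiple of four. The names encode_string, decode_string, and free_string are hypothetical, and the code is a simplified illustration rather than the library's implementation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Encode: returns the number of bytes written, or 0 on failure. */
size_t encode_string(unsigned char *out, size_t outsize,
                     const char *s, uint32_t maxlength)
{
    size_t len = strlen(s);
    size_t padded;

    if (len > maxlength || outsize < 4 || len > outsize - 4)
        return 0;
    padded = (len + 3) & ~(size_t)3;     /* round up to a multiple of 4 */
    if (padded > outsize - 4)
        return 0;
    out[0] = (unsigned char)(len >> 24);
    out[1] = (unsigned char)(len >> 16);
    out[2] = (unsigned char)(len >> 8);
    out[3] = (unsigned char)len;
    memcpy(out + 4, s, len);
    memset(out + 4 + len, 0, padded - len);
    return 4 + padded;
}

/* Decode: returns 1 on success, 0 on failure. If *sp is NULL, space is
 * allocated; otherwise *sp is assumed to hold maxlength + 1 bytes. */
int decode_string(const unsigned char *in, size_t insize,
                  char **sp, uint32_t maxlength)
{
    uint32_t len;

    if (insize < 4)
        return 0;
    len = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
          ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    if (len > maxlength || len > insize - 4)
        return 0;
    if ((((size_t)len + 3) & ~(size_t)3) > insize - 4)
        return 0;                        /* padding is missing */
    if (*sp == NULL) {
        *sp = malloc((size_t)len + 1);
        if (*sp == NULL)
            return 0;
    }
    memcpy(*sp, in + 4, len);
    (*sp)[len] = '\0';                   /* append the terminator */
    return 1;
}

/* Free: releases the string and resets the pointer; maxlength plays no part. */
void free_string(char **sp)
{
    free(*sp);                           /* free(NULL) is a no-op */
    *sp = NULL;
}
```

In this sketch, the 5-character string "abcde" occupies 12 bytes: 4 for the length, 5 for the characters, and 3 of padding. The length check happens before any allocation, which is why a small maxlength limits how much memory a malformed message can request.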

4.2.5.2 Variable-Length Byte Arrays

Often, variable-length arrays of bytes are preferable to strings. Byte arrays differ from strings in the following three ways:

  1. The length of the array (the byte count) is located explicitly in an unsigned integer.
  2. The byte sequence is not terminated by a null character.
  3. The external and internal byte representations are the same.

The primitive xdr_bytes converts between the internal and external representations of byte arrays:


bool_t xdr_bytes(xdrs, bpp, lp, maxlength)
     XDR *xdrs;
     char **bpp;
     u_int *lp;
     u_int maxlength;

The first, second, and fourth parameters are used in the same way as the corresponding parameters of xdr_string (Section 4.2.5.1). The length of the byte area is obtained by dereferencing lp when serializing; *lp is set to the byte length when deserializing.
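The role of the explicit count can be sketched over a memory buffer, again assuming the RFC 1014 layout of a 4-byte count followed by the bytes and zero padding. The names encode_bytes and decode_bytes are hypothetical; unlike xdr_bytes, this simplified decoder always writes into a caller-supplied area and does not examine the padding.

```c
#include <stdint.h>
#include <string.h>

/* Encode *lp bytes from data; returns bytes written, or 0 on failure. */
size_t encode_bytes(unsigned char *out, size_t outsize,
                    const unsigned char *data, const uint32_t *lp,
                    uint32_t maxlength)
{
    uint32_t len = *lp;                  /* count comes from the caller */
    size_t padded;

    if (len > maxlength || outsize < 4 || len > outsize - 4)
        return 0;
    padded = ((size_t)len + 3) & ~(size_t)3;
    if (padded > outsize - 4)
        return 0;
    out[0] = (unsigned char)(len >> 24);
    out[1] = (unsigned char)(len >> 16);
    out[2] = (unsigned char)(len >> 8);
    out[3] = (unsigned char)len;
    memcpy(out + 4, data, len);          /* bytes are copied unchanged */
    memset(out + 4 + len, 0, padded - len);
    return 4 + padded;
}

/* Decode into data (assumed to hold maxlength bytes) and set *lp.
 * Returns 1 on success, 0 on failure. */
int decode_bytes(const unsigned char *in, size_t insize,
                 unsigned char *data, uint32_t *lp, uint32_t maxlength)
{
    uint32_t len;

    if (insize < 4)
        return 0;
    len = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
          ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    if (len > maxlength || len > insize - 4)
        return 0;
    memcpy(data, in + 4, len);
    *lp = len;                           /* count is reported to the caller */
    return 1;
}
```

Because the count is carried explicitly, a zero byte in the middle of the data is preserved rather than treated as a terminator.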

