SUMMARY -- inverse bit or endian differences

From: Gerard Nussbaum <gerard_n_at_ix.netcom.com>
Date: Wed, 19 Apr 1995 08:14:37 -0700

The original query was:
> We are planning to migrate a third-party application from an Alpha 2100
> to either a DG or HP platform. The software vendor has come back to us
> and said that there is no way to transfer the database file to the new
> system because DEC stores the file with an inverse bit.
>
> It appears that the inverse bit, in the vendor's mind, will result in
> data scrambling and is therefore recommending rekeying of the data.
>
> Perhaps I missed something, but If I just ftp the file to the new box
> it should be irrelevant how the DEC is storing the data since I will be
> using the same database software (a MUMPS derivative) on both systems.



Many thanks to all who responded.

"Merton Campbell Crockett" <mcc_at_WLV.IIPO.GTEGSC.COM>
"Danny J. Mitzel" <dmitzel_at_everest.hitc.com>
scottm_at_disc-synergy.com (Scott McCracken)
Brad Davis <bdavis_at_zinc.com>
alan_at_nabeth.cxo.dec.com
"Dr. Tom Blinn, 603-881-0646" <tpb_at_zk3.dec.com>
Paul E. Rockwell <rockwell_at_rch.dec.com>
Todd Kover <kovert_at_cs.UMD.EDU>
Bob Grandle <GRANDLE_at_acodbob.larc.nasa.gov>
Ram Murthy <ram_at_intercap.com>
Stam Nicolis <nicolis_at_celfi.phys.univ-tours.fr>
Andrew Gallatin <gallatin_at_isds.Duke.EDU>
John Kohl <jtk_at_atria.com>


The majority of the replies pointed out that the issue is that DEC builds
little-endian machines, i.e., an address refers to the least significant
byte of a multibyte storage unit, whereas the target platform is a
"big-endian" machine, i.e., an address refers to the most significant byte
of a storage unit, as on HP, DG, Sun, and IBM. Intel architecture, BTW, is
little endian.

Since the binary data cannot simply be transferred via ftp (in either
binary or ascii mode) and used as-is, the best approach is to convert to a
token-separated ascii file on the source system, then transfer the file to
the target system and reconvert to the data format used on the target
machine.

This is quite a bit easier, as it only requires knowledge of the
application data formats and structures.

The major issue is the creation of the MUMPS programs on both the source
and target platforms to perform the data conversion to and from ascii.
This is something that the vendor will need to address since they have
control over the data formats and structures.
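
As a rough illustration of the export/import approach described above, here
is a minimal sketch in Python (used only for illustration; the real
conversion would be written in MUMPS by the vendor). The record layout, the
separator, and the function names are all invented for this sketch and are
not taken from the application in question.

```python
import struct

# Hypothetical record layout, invented for this sketch: a 32-bit signed
# integer followed by a 16-bit unsigned integer, with no padding.
FIELDS = "iH"
SOURCE_FORMAT = "<" + FIELDS   # little-endian source (e.g. the Alpha)
TARGET_FORMAT = ">" + FIELDS   # big-endian target
SEPARATOR = "|"                # token separator, chosen arbitrarily


def export_to_ascii(raw: bytes) -> str:
    """Decode source-format binary records into separator-delimited lines."""
    lines = []
    for record in struct.iter_unpack(SOURCE_FORMAT, raw):
        lines.append(SEPARATOR.join(str(field) for field in record))
    return "\n".join(lines)


def import_from_ascii(text: str) -> bytes:
    """Encode separator-delimited lines as target-format binary records."""
    out = bytearray()
    for line in text.splitlines():
        fields = [int(token) for token in line.split(SEPARATOR)]
        out += struct.pack(TARGET_FORMAT, *fields)
    return bytes(out)


# Two sample records as they would be laid out on the source machine.
source = struct.pack(SOURCE_FORMAT, 1, 2) + struct.pack(SOURCE_FORMAT, -70000, 65535)
text = export_to_ascii(source)      # plain text, safe to ftp in ascii mode
target = import_from_ascii(text)    # rebuilt in the target's byte order
print(text)
print(list(struct.iter_unpack(TARGET_FORMAT, target)))
```

The point of the intermediate text file is that digits and separators mean
the same thing on both machines, so byte order only matters inside the two
conversion programs.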


One very specific suggestion, which we have not tried yet:
    To convert raw data, you need only use dd with the conv=swab option.
    Read the dd man page for details.
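
A caution on this suggestion: conv=swab swaps each pair of adjacent bytes,
so by itself it amounts to a byte-order conversion only for 16-bit values,
and it would also swap adjacent characters in text fields. The Python
sketch below (illustration only; the swab helper is a hypothetical stand-in
that mimics the pairwise swap, not dd itself) shows the difference for a
32-bit value.

```python
import struct


def swab(data: bytes) -> bytes:
    """Swap each pair of adjacent bytes, mimicking dd conv=swab.

    Assumes an even-length input, for simplicity.
    """
    out = bytearray(data)
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)


# 16-bit value: a pairwise swap is exactly a byte-order conversion.
le16 = struct.pack("<H", 0x1234)               # bytes 34 12
ok16 = struct.unpack(">H", swab(le16))[0]

# 32-bit value: a pairwise swap is not a full four-byte reversal.
le32 = struct.pack("<I", 0x01020304)           # bytes 04 03 02 01
wrong32 = struct.unpack(">I", swab(le32))[0]   # bytes 03 04 01 02
right32 = struct.unpack(">I", le32[::-1])[0]   # bytes 01 02 03 04

print(hex(ok16))      # 0x1234
print(hex(wrong32))   # 0x3040102
print(hex(right32))   # 0x1020304
```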


Further details on endianness:

The difference between big-endian and little-endian machines is the order
of the bytes in multibyte values. For a 32-bit value [4 bytes] the byte
order on a big-endian machine would be [byte1 byte2 byte3 byte4]; on the
other machine the byte order would be [byte4 byte3 byte2 byte1]. Let's say
you have the value '1' in your database as [00000000 00000000 00000000
00000001]. Now transfer it to the other machine and it's interpreted as
[00000001 00000000 00000000 00000000] = 2^24! Your situation could be
further complicated by the 64-bit integers on the DEC.
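
The example above can be reproduced with a few lines of Python (used here
purely for illustration; it is not part of the original replies). The
sketch writes the value 1 as four big-endian bytes and then reads the same
bytes back assuming little-endian order.

```python
# Illustrative sketch only: the same four bytes yield different integers
# depending on the byte order the reader assumes.

value = 1

# Bytes as a big-endian machine would store a 32-bit integer.
big_endian_bytes = value.to_bytes(4, byteorder="big")

# The same bytes, read on a machine that assumes little-endian order.
misread = int.from_bytes(big_endian_bytes, byteorder="little")

# Reversing the four bytes before reading recovers the original value.
recovered = int.from_bytes(big_endian_bytes[::-1], byteorder="little")

print(big_endian_bytes.hex())  # 00000001
print(misread)                 # 16777216, i.e. 2**24
print(recovered)               # 1
```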


Digital UNIX systems, like Intel-based systems, store data in little-endian
format (the low-order byte in a register gets stored at the first memory
byte address, the next lowest comes next, etc.).


    32 bit integer
          byte number

       1 2 3 4 on HP or DG

       4 3 2 1 on DEC and IBM PC
Received on Wed Apr 19 1995 - 11:15:28 NZST
