To pack or not to pack

Consider these accuracy and interoperability caveats . . .

rule

Dick Valent
by Dick Valent


Contents

Search

Index

Home

Subscribe

ConsultWeb

SCD

Introduction
This article explains several accuracy and interoperability caveats of the NCAR packing and unpacking routines and provides some details of the software. If you use these routines, this article may help you anticipate some unpleasant numerical surprises.

NCAR provides these packing and unpacking routines for historical reasons. Programmers needed them in the late 70s when computer memory was extremely limited. Today's programmers are likely to avoid packing because computer memory is more abundant and because packing loses numerical precision.

The NCAR routines are different from CRAY's PACK and UNPACK routines. To avoid confusion in this article, I refer to NCAR's version as "the NCAR packing and unpacking routines."


Some definitions

The term "packing" here means an array operation in which an input array of N REAL*8 floating-point numbers yields one offset, one scale factor, and N integers INTG(1), ..., INTG(N), and in which the integers are packed two, three, or four per 64-bit word. The value two, three, or four is known as "packing density" and its specification is under user control.

The term "unpacking" is the approximate-inverse operation whereby each of the N packed integers is taken from the packed format and a new REAL*8 array calculated as

OFFSET + INTG(K)*SCALEFACTOR

for K=1,...,N. Each element of this new array is an approximation to the corresponding element of the original array.

How accurate is this? There is a penalty and sometimes it is severe, as the following discussion shows.


Accuracy loss

Accuracy loss is inherent to the packing algorithm; the severity depends on the range of values to be packed and the packing density specified by the user. The concept of "relative error" further explains the situation.

If A is a known nonzero quantity and B is an approximation to A, the "relative error of the approximation" is defined as:

ABS(A-B)/ABS(A)

For certain datasets, the NCAR packing routines produce large relative errors when elements of the packed-unpacked array are compared to the corresponding elements of the original array. Arrays that span a large range of numerical values are vulnerable to large errors.

Consider the following array A with 30 elements:

1.0, 2.0, 3.0, ..., 10.0, 110000.0, 120000.0, ..., 300000.0

When array A is packed with packing density 4, then unpacked to array B on SCD's CRAY C90 computer, the relative error of B(3) is approximately 0.66667. This is poor because no significant digits are preserved. You must decide whether such loss is acceptable, especially when forecasts or iterative procedures may further propogate errors.


Exact zeros not preserved

There is no guarantee zeros are preserved in a pack-unpack operation, because accuracy is lost in the packing operation. In the following 30-element array, the 0.0 element pack-unpacks to -0.13036E-08 on SCD's CRAY C90 computer:

0.0, -1.0, -2.0, ..., -9.0, 110000.0, 120000.0, ..., 300000.0


Floating-point overflow

If the vector X of floating-point numbers to be packed satisfies

ABS(XMIN*XMAX) > 1.OE + 2465

where XMIN is the minimum value of array X and XMAX is the maximum value of X, then the NCAR packing routines PACKx and PACKxF will exit on a fatal floating-point exception error on Cray Parallel Vector Processor computers (such as SCD's C90) that run Cray floating-point arthithmetic.


Poor interoperability

Data packed on one computer will certainly unpack differently on another, due to differences in roundoff error. The 30-element array provided under the "Exact zeroes not preserved" section, above, shows a difference of two orders of magnitude when you compare unpacked results on NCAR's CRAY C90 and SGI PowerChallenge.


Poor portability

The routines are difficult to port because they require SHIFT and logical functions on 64-bit words, and these functions may not be provided by the target machine's vendor.


All-Fortran version for Cray vector computers

The packing routine is named PACKAF and the unpacking routine is UNPKAF. (Routines PACKBF, PACKCF, UNPKBF, and UNPKCF are also available, but I do not recommend them because they perform poorly in certain situations, as explained in the comment lines of the source code.)

Here's how PACKAF works: Given a REAL*8 array ARN(NRN) of real numbers, and given packing density NPK with value 2, 3, or 4, and given real numbers AMN and AMX (the former less than or equal to any datum in ARN and the latter greater than or equal to any datum in ARN), each element of ARN is reduced to a positive integer less than or equal to 2**(64/NPK)-1 by the formulas:

XSC = (2**(64/NPK)-1) / (AMX-AMN)
XSCINV = 1.0 /XSC
INTG(I) = INTEGER( (ARN(I)-AMN) * XSC + .5 )
ARN(I) = AMN + INTG(I) * XSCINV

The numbers AMN and (2**(64/NPK)-1) / (AMX-AMN) are stored in array elements APN(1) and APN(2), followed by the integers (INT(I),I=1,NRN), packed NPK per word in APN(3) ,..., APN(2+(NRN+NPK-1)/NPK).

Here's how routine UNPKAF works: The original array ARN is approximated from the contents of the packed array APN by means of the following formula:

ARN(I) = AMN + INTG(I) * (AMX-AMN) / (2**(64/NPK)-1)


Source code

You can obtain Fortran source code via anonymous FTP from ftp.ucar.edu by following this FTP session after logging on:

  ftp> cd dsl/lib/ncaro
  ftp> get packf.f
  ftp> quit

A CAL version of the NCAR packing and unpacking routines is also available, but I do not recommend using it because CAL is more difficult to maintain over changes to the Cray programming environment.

If you think you need the CAL version, however, please contact me via e-mail (valent@ucar.edu).


References

None known, other than this article and comment lines in source code.


Questions and concerns

If you have questions about these routines on NCAR's CRAY computers, please contact me via e-mail (valent@ucar.edu).

rule

Contents || Search || Index || Home || Subscribe || ConsultWeb || SCD