
by Dick Valent
|
Introduction
This article explains several accuracy and
interoperability caveats of the NCAR packing and unpacking routines and
provides some details of the software. If you use these routines, this article may help you anticipate some unpleasant numerical surprises.
NCAR provides these packing and unpacking routines for historical reasons. Programmers needed them in the late 70s when
computer memory was extremely limited. Today's programmers are likely to
avoid packing because computer memory is more abundant and because packing loses numerical precision.
The NCAR routines are different
from CRAY's PACK and UNPACK routines. To avoid confusion in this article, I refer to NCAR's version as "the NCAR packing and unpacking routines."
Some definitions
The term "packing" here means an array operation in which an input array
of N REAL*8 floating-point numbers yields one offset, one scale factor,
and N integers INTG(1), ..., INTG(N), and in which the integers are packed two, three, or four per 64-bit word. The value two, three, or four is known as "packing density" and its specification is under user control.
The term "unpacking" is the approximate-inverse operation whereby
each of the N packed integers is taken from the packed format and a new
REAL*8 array calculated as
OFFSET + INTG(K)*SCALEFACTOR
for K=1,...,N. Each element of this new array is an approximation to the
corresponding element of the original array.
How accurate is this? There
is a penalty and sometimes it is severe, as the following discussion shows.
Accuracy loss
Accuracy loss is inherent to the packing algorithm; the severity depends
on the range of values to be packed and the packing density specified by the
user. The concept of "relative error" further explains the
situation.
If A is a known nonzero quantity and B is an approximation to A,
the "relative error of the approximation" is defined as:
ABS(A-B)/ABS(A)
For certain datasets, the NCAR packing routines produce large relative errors
when elements of the packed-unpacked array are compared to the corresponding
elements of the original array. Arrays that span a large range of numerical
values are vulnerable to large errors.
Consider the following array A with 30 elements:
1.0, 2.0, 3.0, ..., 10.0, 110000.0, 120000.0, ..., 300000.0
When array A is packed with packing density 4, then unpacked to array B
on SCD's CRAY C90 computer, the relative error of B(3) is approximately
0.66667. This is poor because no significant digits are preserved. You
must decide whether such loss is acceptable, especially when forecasts
or iterative procedures may further propogate errors.
Exact zeros not preserved
There is no guarantee zeros are preserved in a pack-unpack operation, because
accuracy is lost in the packing operation. In the following 30-element
array, the 0.0 element pack-unpacks to -0.13036E-08 on SCD's CRAY C90 computer:
0.0, -1.0, -2.0, ..., -9.0, 110000.0, 120000.0, ..., 300000.0
Floating-point overflow
If the vector X of floating-point numbers to be packed satisfies
ABS(XMIN*XMAX) > 1.OE + 2465
where XMIN is the minimum value of array X and XMAX is the maximum value
of X, then the NCAR packing routines PACKx and PACKxF will exit on a fatal
floating-point exception error on Cray Parallel Vector Processor computers (such as SCD's C90) that run Cray floating-point arthithmetic.
Poor interoperability
Data packed on one computer will certainly unpack differently on
another, due to differences in roundoff error. The 30-element
array provided under the "Exact zeroes not preserved" section,
above, shows a difference of
two orders of magnitude when you compare unpacked results on NCAR's
CRAY C90 and SGI PowerChallenge.
Poor portability
The routines are difficult to port because
they require SHIFT and logical functions on 64-bit words, and these
functions may not be provided by the target machine's vendor.
All-Fortran version for Cray vector computers
The packing routine is named PACKAF and the unpacking routine is UNPKAF. (Routines PACKBF, PACKCF, UNPKBF, and UNPKCF are also available, but I do not recommend them because they perform poorly in certain situations, as explained in the comment lines of the source code.)
Here's how PACKAF works: Given a REAL*8 array ARN(NRN) of real numbers, and given packing density
NPK with value 2, 3, or 4, and given real numbers AMN and AMX (the former
less than or equal to any datum in ARN and the latter greater than or
equal to any datum in ARN), each element of ARN is reduced to a
positive integer less than or equal to 2**(64/NPK)-1 by the formulas:
XSC = (2**(64/NPK)-1) / (AMX-AMN)
XSCINV = 1.0 /XSC
INTG(I) = INTEGER( (ARN(I)-AMN) * XSC + .5 )
ARN(I) = AMN + INTG(I) * XSCINV
The numbers AMN and (2**(64/NPK)-1) / (AMX-AMN) are stored in array elements APN(1) and APN(2), followed by the integers (INT(I),I=1,NRN), packed
NPK per word in APN(3) ,..., APN(2+(NRN+NPK-1)/NPK).
Here's how routine UNPKAF works: The original
array ARN is approximated from the contents of the packed
array APN by means of the following formula:
ARN(I) = AMN + INTG(I) * (AMX-AMN) / (2**(64/NPK)-1)
Source code
You can obtain Fortran source code via anonymous FTP from ftp.ucar.edu by following this FTP session after logging on:
ftp> cd dsl/lib/ncaro
ftp> get packf.f
ftp> quit
A CAL version of the NCAR packing and unpacking routines is also available, but I do not recommend using it because CAL is more difficult to maintain over changes to the Cray programming environment.
If you think you need the CAL version, however, please contact me via e-mail (valent@ucar.edu).
References
None known, other than this article and comment lines in source code.
Questions and concerns
If you have questions about these routines on NCAR's CRAY computers, please contact me via e-mail
(valent@ucar.edu).
|