Implementation of Parallel Least-Squares Algorithms for Gravity Field Estimation
Date
2005-03
Authors
Xie, Jing
Publisher
Ohio State University. Division of Geodetic Science
Abstract
NASA/GFZ’s Gravity Recovery and Climate Experiment (GRACE) twin-satellite
mission, launched in 2002 for a five-year nominal mission, has provided accurate
scientific products that help scientists gain new insights into climate signals
manifested as temporal variations of the Earth’s gravity field. The mission also
presents a significant computational challenge: a massive geophysical inverse problem
must be solved every month from the large amount of data collected. This paper
focuses on applying parallel (primarily distributed) computing techniques to
rigorously invert monthly geopotential coefficients from GRACE data. The gravity
solution is based on the energy conservation approach, which establishes a linear
relationship between the in-situ geopotential difference of the two satellites and their
position and velocity vectors, using the high-low (GPS to GRACE) and low-low
(between the GRACE spacecraft) satellite-to-satellite tracking data together with the
accelerometer data from both GRACE satellites. Both the direct (rigorous) inversion
method and the iterative (conjugate gradient) method are studied. Our goal is to
develop numerical algorithms and a portable distributed-computing code that is
potentially “scalable” (i.e., maintains constant efficiency as the problem size and the
number of processors increase), efficient enough to solve the GRACE problem, and
applicable to other large generalized geophysical inverse problems.
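Schematically, the energy-integral observation equation takes the following form (a sketch following Jekeli’s 1999 formulation; the precise correction terms, e.g., tidal ones, are treated in the report itself):

$$V_{12} \approx \tfrac{1}{2}\left(|\dot{\mathbf{x}}_2|^2 - |\dot{\mathbf{x}}_1|^2\right) - \omega_e\left[(x_2\dot{y}_2 - y_2\dot{x}_2) - (x_1\dot{y}_1 - y_1\dot{x}_1)\right] - \int\left(\mathbf{a}_2\cdot\dot{\mathbf{x}}_2 - \mathbf{a}_1\cdot\dot{\mathbf{x}}_1\right)dt - \Delta E_0,$$

where $\mathbf{x}_i, \dot{\mathbf{x}}_i$ are the inertial position and velocity of satellite $i$, $\mathbf{a}_i$ the non-conservative accelerations measured by the accelerometers, $\omega_e$ the Earth rotation rate, and $\Delta E_0$ an energy constant. The low-low range-rate observable enters through $\dot{\mathbf{x}}_{12} = \dot{\mathbf{x}}_2 - \dot{\mathbf{x}}_1$, since $|\dot{\mathbf{x}}_2|^2 - |\dot{\mathbf{x}}_1|^2 = (\dot{\mathbf{x}}_1 + \dot{\mathbf{x}}_2)\cdot\dot{\mathbf{x}}_{12}$. Because $V_{12}$ is linear in the spherical harmonic coefficients, the coefficients can be estimated by linear least squares.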
Typical monthly GRACE gravity solutions require solving for spherical harmonic
coefficients complete to degree 120 (14,637 parameters) plus other nuisance parameters.
Accumulating the 259,200 monthly low-low GRACE observations (at the 0.1 Hz
sampling rate) into the normal equations matrix requires more than 55 trillion floating-point
operations (FLOPs), and storing that matrix takes ~1.7 GB of central memory; its
inversion adds ~1 trillion FLOPs. To meet this computational challenge, we test our
algorithms on a 16-node SGI 750 cluster with 32 733-MHz Itanium processors. We
choose the object-oriented Parallel Linear Algebra Package (PLAPACK) as the main tool
and the Message Passing Interface (MPI) as the underlying communication layer of the
parallel code. MPI parallel I/O is also implemented to speed up data transfer between
disk and memory. Furthermore, we optimize both the serial and parallel codes by
carefully analyzing the cost of the numerical operations, fully exploiting the Itanium
architecture, and utilizing highly optimized numerical libraries.
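These figures follow directly from the problem dimensions. As a quick back-of-envelope check (a sketch, not code from the report):

```c
#include <stdio.h>

int main(void) {
    /* Dimensions quoted above: a degree-120 field has (121)^2 - 4 = 14,637
       coefficients (presumably with degrees 0 and 1 excluded); one month of
       0.1 Hz low-low data gives 30 * 86400 * 0.1 = 259,200 observations.  */
    const double n = 14637.0;
    const double m = 259200.0;

    /* Each observation adds a rank-1 update to the symmetric n x n normal
       matrix: n(n+1)/2 multiply-add pairs, i.e. ~n*n flops per row.       */
    double accum_flops = m * n * n;       /* ~5.6e13: "55 trillion"        */

    /* Dense matrix of 8-byte doubles.                                     */
    double mem_gb = n * n * 8.0 / 1e9;    /* ~1.7 GB                       */

    /* Cholesky factorization of the normal matrix costs ~n^3/3 flops.     */
    double chol_flops = n * n * n / 3.0;  /* ~1.0e12: "1 trillion"         */

    printf("accumulation : %.2e flops\n", accum_flops);
    printf("storage      : %.2f GB\n", mem_gb);
    printf("factorization: %.2e flops\n", chol_flops);
    return 0;
}
```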
For direct inversion, we tested implementations of the Normal equations
Matrix Accumulation (NMA) method, which computes the design matrix as well as the
normal equations matrix locally and accumulates them into global objects afterwards,
and the Design Matrix Accumulation (DMA) approach, which first forms small design
matrices locally and then transfers them to the global scale by matrix-matrix
multiplication to obtain a global normal equations matrix. Creating the normal equations
matrix takes the majority of the total wall-clock time. Our preliminary results indicate
that the NMA method is very fast but at present cannot be used to estimate extremely
high degree and order coefficients because of insufficient central memory. The DMA
method can solve for all geopotential coefficients complete to spherical harmonic degree
120 in roughly 30 minutes using 24 CPUs; the serial implementation of the direct
inversion method takes about 7.5 hours for the same problem on a single processor of
the same machine.
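The report’s code is built on PLAPACK; purely as an illustration of the NMA pattern (local accumulation of the normal equations, then a global reduction), here is a plain-MPI sketch in which build_design_row, N_PARAMS, and the toy values are hypothetical stand-ins for the report’s actual routines:

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N_PARAMS 100      /* toy size; the real problem has 14,637 unknowns */

/* Hypothetical stand-in for the real partial-derivative computation: fills
   one design-matrix row and the corresponding observation for index k.    */
static void build_design_row(long k, double *a_row, double *y_k) {
    for (int i = 0; i < N_PARAMS; i++)
        a_row[i] = 1.0 / (1.0 + (double)((k + i) % 97));
    *y_k = (double)(k % 11);
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long n_obs = 259200;              /* one month of low-low data */
    double *N_loc = calloc((size_t)N_PARAMS * N_PARAMS, sizeof *N_loc);
    double *b_loc = calloc(N_PARAMS, sizeof *b_loc);
    double a_row[N_PARAMS], y_k;

    /* Each process accumulates normal equations for its own slice of the
       observations: N_loc += a a^T (upper triangle), b_loc += a y.       */
    for (long k = rank; k < n_obs; k += size) {
        build_design_row(k, a_row, &y_k);
        for (int i = 0; i < N_PARAMS; i++) {
            b_loc[i] += a_row[i] * y_k;
            for (int j = i; j < N_PARAMS; j++)
                N_loc[i * N_PARAMS + j] += a_row[i] * a_row[j];
        }
    }

    /* Accumulate the local pieces into the global normal equations on
       rank 0, which would then factor N and solve N x = b.               */
    if (rank == 0) {
        MPI_Reduce(MPI_IN_PLACE, N_loc, N_PARAMS * N_PARAMS,
                   MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        MPI_Reduce(MPI_IN_PLACE, b_loc, N_PARAMS,
                   MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        printf("N[0][0] = %g\n", N_loc[0]);
    } else {
        MPI_Reduce(N_loc, NULL, N_PARAMS * N_PARAMS,
                   MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        MPI_Reduce(b_loc, NULL, N_PARAMS,
                   MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    }

    free(N_loc); free(b_loc);
    MPI_Finalize();
    return 0;
}
```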
In the realization of the conjugate gradient method on the distributed platform, the
preconditioner is chosen as the block-diagonal part of the normal equations matrix, and
an approximate computation of the variance-covariance matrix of the solution is also
implemented. With significantly fewer arithmetic operations and less memory usage, the
conjugate gradient method needs only about 8 minutes of wall-clock time (21 iterations
on 24 CPUs) to solve for the gravity field coefficients up to degree 120, while the serial
code takes roughly 3.5 hours to achieve the same results on a single processor.
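As a reference for the method itself, here is a serial sketch of preconditioned conjugate gradient applied to the normal equations N x = b; it uses a plain diagonal (Jacobi) preconditioner as a simplified stand-in for the report’s block-diagonal preconditioner, and the tiny test system is made up:

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Solve N x = b by preconditioned conjugate gradient, where N is a dense
   symmetric positive-definite n x n matrix. The preconditioner applied
   here is diag(N)^{-1}, a simplification of the block-diagonal choice.  */
static void pcg(int n, const double *N, const double *b, double *x,
                double tol, int max_iter) {
    double *r = malloc(n * sizeof *r), *z = malloc(n * sizeof *z);
    double *p = malloc(n * sizeof *p), *q = malloc(n * sizeof *q);

    for (int i = 0; i < n; i++) { x[i] = 0.0; r[i] = b[i]; }
    double rz = 0.0;
    for (int i = 0; i < n; i++) {              /* z = M^{-1} r, p = z */
        z[i] = r[i] / N[i * n + i];
        p[i] = z[i];
        rz += r[i] * z[i];
    }

    for (int it = 0; it < max_iter; it++) {
        for (int i = 0; i < n; i++) {          /* q = N p              */
            q[i] = 0.0;
            for (int j = 0; j < n; j++) q[i] += N[i * n + j] * p[j];
        }
        double pq = 0.0;
        for (int i = 0; i < n; i++) pq += p[i] * q[i];
        double alpha = rz / pq;

        double rnorm = 0.0;                    /* step and new residual */
        for (int i = 0; i < n; i++) {
            x[i] += alpha * p[i];
            r[i] -= alpha * q[i];
            rnorm += r[i] * r[i];
        }
        if (sqrt(rnorm) < tol) break;

        double rz_new = 0.0;                   /* precondition, update p */
        for (int i = 0; i < n; i++) {
            z[i] = r[i] / N[i * n + i];
            rz_new += r[i] * z[i];
        }
        double beta = rz_new / rz;
        rz = rz_new;
        for (int i = 0; i < n; i++) p[i] = z[i] + beta * p[i];
    }
    free(r); free(z); free(p); free(q);
}

int main(void) {
    /* Tiny SPD test system: 4 on the diagonal, 1 elsewhere.             */
    enum { n = 4 };
    double N[n * n], b[n], x[n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            N[i * n + j] = (i == j) ? 4.0 : 1.0;
    for (int i = 0; i < n; i++) b[i] = 1.0;
    pcg(n, N, b, x, 1e-12, 100);
    for (int i = 0; i < n; i++) printf("x[%d] = %.6f\n", i, x[i]);
    return 0;
}
```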
Both the direct inversion method and the iterative method give good estimates of
the unknown geopotential coefficients. In that respect the iterative approach is preferable
for its much shorter running time, but it provides only an approximation of the estimated
variance-covariance matrix. The scalability of the direct and iterative methods is also
analyzed in this study. Numerical results show that the NMA method and the conjugate
gradient method achieve good scalability in our simulations. While the DMA method is
not as scalable as the other two for smaller problem sizes, its efficiency improves
gradually as the problem size and the number of processors increase. The developed
codes are potentially portable across different computer platforms and applicable to
other large generalized geophysical inverse problems.
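Here “scalability” is meant in the sense quoted earlier: with wall-clock time $T(p)$ on $p$ processors, the speedup and parallel efficiency are

$$S(p) = \frac{T(1)}{T(p)}, \qquad E(p) = \frac{S(p)}{p} = \frac{T(1)}{p\,T(p)},$$

and a code scales well if $E(p)$ remains approximately constant as the problem size and $p$ grow together.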
Description
This report was prepared by Jing Xie, a graduate research associate in the Department of Civil and Environmental Engineering and Geodetic Science at the Ohio State University, under the supervision of Professor C. K. Shum. This research was partially supported by grants from the NSF Earth Sciences program (EAR-0327633) and the NASA Office of Earth Science program (NNG04GF01G and NNG04GN19G).
This report was also submitted to the Graduate School of the Ohio State University as a thesis in partial fulfillment of the requirements for the Master of Science degree.