Implementation of Parallel Least-Squares Algorithms for Gravity Field Estimation

Date

2005-03

Publisher

Ohio State University. Division of Geodetic Science

Abstract

NASA/GFZ’s Gravity Recovery and Climate Experiment (GRACE) twin-satellite mission, launched in 2002 for a five-year nominal mission, has provided accurate scientific products that help scientists gain new insights into climate signals manifested as temporal variations of the Earth’s gravity field. The mission also presents a significant computational challenge: a large amount of data must be analyzed to solve a massive geophysical inverse problem every month. This paper focuses on applying parallel (primarily distributed) computing techniques capable of rigorously inverting monthly geopotential coefficients from GRACE data. The gravity solution is based on the energy conservation approach, which establishes a linear relationship between the in situ geopotential difference of the two satellites and their position and velocity vectors, using the high-low (GPS to GRACE) and low-low (GRACE to GRACE) satellite-to-satellite tracking data and the accelerometer data from both GRACE satellites. Both the direct (rigorous) inversion method and an iterative (conjugate gradient) method are studied. Our goal is to develop numerical algorithms and a portable distributed-computing code that is potentially “scalable” (i.e., maintains constant efficiency as problem size and number of processors increase), capable of efficiently solving the GRACE problem, and applicable to other generalized large geophysical inverse problems.

Typical monthly GRACE gravity solutions require solving for spherical harmonic coefficients complete to degree 120 (14,637 parameters) plus other nuisance parameters. Accumulating the 259,200 monthly low-low GRACE observations (at a 0.1 Hz sampling rate) into the normal equations matrix requires more than 55 trillion floating point operations (FLOPs), and storing the matrix takes ~1.7 GB of central memory; its inversion adds ~1 trillion FLOPs. To meet this computational challenge, we use a 16-node SGI 750 cluster with 32 733-MHz Itanium processors to test our algorithms. We choose the object-oriented Parallel Linear Algebra Package (PLAPACK) as the main tool and the Message Passing Interface (MPI) as the underlying communication layer to build the parallel code. MPI parallel I/O is also implemented to speed up data transfer between disk and memory. Furthermore, we optimize both the serial and parallel codes by carefully analyzing the cost of the numerical operations, fully exploiting the power of the Itanium architecture, and utilizing highly optimized numerical libraries.

For direct inversion, we tested implementations of the Normal equations Matrix Accumulation (NMA) method, which computes the design and normal equations matrices locally and accumulates them into global objects afterwards, and the Design Matrix Accumulation (DMA) approach, which first forms small design matrices locally and then transfers them to globally distributed objects, obtaining the global normal equations matrix by matrix-matrix multiplication. Creating the normal equations matrix takes the majority of the total wall clock time. Our preliminary results indicate that the NMA method is very fast but at present cannot be used to estimate extremely high degree and order coefficients because of limited central memory. The DMA method can solve for all geopotential coefficients complete to spherical harmonic degree 120 in roughly 30 minutes using 24 CPUs. A sketch of the accumulate-and-reduce pattern underlying the NMA method is given below.
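For illustration only, the NMA pattern (local accumulation of normal equations followed by a global reduction and a direct solve) can be sketched with NumPy and mpi4py. This is not the report's implementation, which is written in C on top of MPI and PLAPACK; all variable names, the toy dimensions, and the random placeholder data below are hypothetical, with the real monthly-GRACE sizes noted in comments.

```python
# Illustrative NMA-style accumulation of normal equations (hypothetical sketch).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

n_params = 120           # real problem: 14,637 coefficients (degree/order 120)
n_obs = 2400             # real problem: 259,200 low-low observations per month
n_local = n_obs // size  # observation block for this process (assumes even split)

# Placeholder local design matrix A_k (partials of the geopotential difference
# with respect to the coefficients) and reduced observations y_k.
rng = np.random.default_rng(rank)
A_k = rng.standard_normal((n_local, n_params))
y_k = rng.standard_normal(n_local)

# Local accumulation: N_k = A_k^T A_k and b_k = A_k^T y_k. At full scale this
# step dominates (~m*n^2 ≈ 5.5e13 FLOPs), and the n x n normal matrix alone
# occupies 14,637^2 * 8 bytes ≈ 1.7 GB, matching the figures quoted above.
N_k = A_k.T @ A_k
b_k = A_k.T @ y_k

# Global accumulation: sum the local contributions across all processes.
N = np.zeros_like(N_k)
b = np.zeros_like(b_k)
comm.Allreduce(N_k, N, op=MPI.SUM)
comm.Allreduce(b_k, b, op=MPI.SUM)

if rank == 0:
    x_hat = np.linalg.solve(N, b)  # direct inversion: ~n^3/3 ≈ 1e12 FLOPs
    print("first estimated coefficients:", x_hat[:5])
```

Run, for example, with `mpiexec -n 4 python nma_sketch.py`. The reduction step is why the NMA method needs each process to hold a full copy of the normal matrix, which explains the central-memory limit noted above.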
The serial implementation of the direct inversion takes about 7.5 hours for the same problem on a single processor of the same system. In the realization of the conjugate gradient method on the distributed platform, the preconditioner is chosen as the block-diagonal part of the normal equations matrix, and an approximate computation of the variance-covariance matrix of the solution is also implemented. With significantly fewer arithmetic operations and less memory usage, the conjugate gradient method takes only about 8 minutes of wall clock time (21 iterations) to solve for the gravity field coefficients up to degree 120 using 24 CPUs, while the serial code runs roughly 3.5 hours to achieve the same results on a single processor. Both the direct inversion method and the iterative method give good estimates of the unknown geopotential coefficients; in this sense, the iterative approach is preferable for its much shorter running time, although it provides only an approximation of the estimated variance-covariance matrix.

Scalability of the direct and iterative methods is also analyzed in this study. Numerical results show that the NMA method and the conjugate gradient method achieve good scalability in our simulations. While the DMA method is not as scalable as the other two for smaller problem sizes, its efficiency improves gradually as problem size and processor count increase. The developed codes are potentially portable across different computer platforms and applicable to other generalized large geophysical inverse problems.
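As a further illustration, the following is a minimal serial NumPy sketch of preconditioned conjugate gradients with a block-diagonal preconditioner, the scheme described above. It is not the report's distributed C/PLAPACK implementation; the function names, the convergence tolerance, and the equal-width block partition in the toy usage are assumptions (in gravity field estimation, coefficients are often grouped by spherical harmonic order, but the abstract does not state the exact partition used).

```python
# Hypothetical sketch of PCG with a block-diagonal preconditioner.
import numpy as np

def make_block_preconditioner(N, blocks):
    """Cholesky-factor each diagonal block of N; `blocks` is a list of
    index arrays that partition the parameter vector."""
    factors = [np.linalg.cholesky(N[np.ix_(idx, idx)]) for idx in blocks]
    def apply(r):
        z = np.empty_like(r)
        for idx, L in zip(blocks, factors):
            w = np.linalg.solve(L, r[idx])     # forward substitution
            z[idx] = np.linalg.solve(L.T, w)   # back substitution
        return z
    return apply

def pcg(N, b, precond, tol=1e-10, max_iter=200):
    """Solve N x = b for symmetric positive definite N."""
    x = np.zeros_like(b)
    r = b.copy()              # residual r = b - N x, with x = 0 initially
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for k in range(1, max_iter + 1):
        Np = N @ p
        alpha = rz / (p @ Np)
        x += alpha * p
        r -= alpha * Np
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Toy usage: a small synthetic normal matrix with 10-wide diagonal blocks.
rng = np.random.default_rng(0)
A = rng.standard_normal((500, 60))
N, b = A.T @ A, A.T @ rng.standard_normal(500)
blocks = [np.arange(i, i + 10) for i in range(0, 60, 10)]
x, iters = pcg(N, b, make_block_preconditioner(N, blocks))
print(f"converged in {iters} iterations")
```

The key saving, consistent with the timings above, is that each iteration needs only matrix-vector products and cheap block solves rather than forming and factoring the full normal matrix, at the cost of obtaining only an approximate variance-covariance matrix.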

Description

This report was prepared by Jing Xie, a graduate research associate in the Department of Civil and Environmental Engineering and Geodetic Science at The Ohio State University, under the supervision of Professor C. K. Shum. This research was partially supported by grants from the NSF Earth Sciences program (EAR-0327633) and the NASA Office of Earth Science program (NNG04GF01G and NNG04GN19G).
This report was also submitted to the Graduate School of the Ohio State University as a thesis in partial fulfillment of the requirements for the Master of Science degree.
