High Performance Computing: algorithms and numerical methods.
Vacancy tracking cycle algorithm for optimal in-place multidimensional array remapping,
as used in climate models, FFTs, Alternating Direction Implicit (ADI) methods, etc.
An Optimal Index Reshuffle Algorithm for Multidimensional Arrays
and Its Applications for Parallel Architectures.
IEEE Transactions on Parallel and Distributed Systems,
Vol. 12, No.3, March 2001, pp.306-315.
In-place multi-dimensional array index reshuffle is an open problem in
For 1D arrays,
the problem was solved by Don Fraser (1976)
for power-of-2 array sizes.
This paper describes how to generate
vacancy tracking cycles and use them for in-place index reshuffle
that works for arbitrary dimensions with
arbitrary sizes. It eliminates the commonly used auxiliary array and
is optimal in memory access.
The algorithm was first developed in 1993 while I was
working on parallel FFTs at JPL and was presented at the 7th
SIAM Parallel Processing conference (Feb 1995).
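The vacancy tracking idea can be illustrated on the simplest case, a 2D transpose of a row-major array. The following is a minimal serial sketch, not the implementation from the paper: the function name transpose_in_place is hypothetical, and the smallest-index test used to visit each cycle exactly once is an assumption made for brevity (it trades extra index arithmetic for zero auxiliary storage). Within a cycle, the first element is saved, which opens a vacancy; each later element is then moved straight into the current vacancy.

```python
def transpose_in_place(a, rows, cols):
    """Transpose a row-major rows x cols matrix stored in the flat list a.

    After the call, a holds the cols x rows transpose in row-major order.
    Only one temporary element is used; no auxiliary array is allocated.
    """
    n = rows * cols
    if n != len(a):
        raise ValueError("len(a) must equal rows * cols")
    last = n - 1
    if last < 2:
        return  # zero, one or two elements: nothing moves

    def source(j):
        # Index (in the original layout) of the element that must end up
        # at position j of the transposed layout, for 0 < j < last.
        return (j * cols) % last

    # Positions 0 and last never move, so only 1 .. last-1 are scanned.
    for start in range(1, last):
        # Assumption of this sketch: handle a cycle only when 'start' is
        # its smallest index, so every cycle is processed exactly once.
        k = source(start)
        while k > start:
            k = source(k)
        if k < start:
            continue

        saved = a[start]        # lifting this element creates the vacancy
        vacancy = start
        src = source(vacancy)
        while src != start:
            a[vacancy] = a[src]  # fill the vacancy from its source
            vacancy = src        # the vacancy moves to where the data was
            src = source(vacancy)
        a[vacancy] = saved       # close the cycle
```

For example, with a = [0, 1, 2, 3, 4, 5] viewed as a 2 x 3 matrix, calling transpose_in_place(a, 2, 3) leaves a equal to [0, 3, 1, 4, 2, 5], the 3 x 2 transpose. Each element that moves is written once, directly into its final position.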
Implemented a parallel
Recursive Inertial Bisection algorithm using MPI.
This is used to partition the dual graph generated
from a finite element mesh.
An Element-Based Concurrent Partitioner for Unstructured Finite Element
Meshes. C.H.Q.Ding and F.D.Ferraro, in Proceedings of 10th
IEEE International Parallel Processing Symposium, pp.601-605. 1996.
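As a rough illustration of recursive inertial bisection, here is a minimal serial sketch in 2D. It is not the MPI implementation described above. It assumes each mesh element is represented by a single coordinate pair (for instance its centroid), and the function names inertial_bisect and recursive_inertial_bisection are hypothetical. Each step finds the principal axis of the point set, projects the points onto it, and splits them at the median.

```python
import math


def inertial_bisect(points, ids):
    """Split the point indices in ids into two halves along the principal axis."""
    n = len(ids)
    cx = sum(points[i][0] for i in ids) / n
    cy = sum(points[i][1] for i in ids) / n

    # Second moments of the point set about its centroid.
    sxx = sum((points[i][0] - cx) ** 2 for i in ids)
    syy = sum((points[i][1] - cy) ** 2 for i in ids)
    sxy = sum((points[i][0] - cx) * (points[i][1] - cy) for i in ids)

    # Direction of largest spread (principal axis of the 2x2 moment matrix).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)

    # Order the points by their projection onto that axis, cut at the median.
    ordered = sorted(
        ids,
        key=lambda i: (points[i][0] - cx) * ux + (points[i][1] - cy) * uy,
    )
    half = n // 2
    return ordered[:half], ordered[half:]


def recursive_inertial_bisection(points, levels):
    """Return 2**levels lists of point indices."""
    if len(points) < 2 ** levels:
        raise ValueError("need at least 2**levels points")
    parts = [list(range(len(points)))]
    for _ in range(levels):
        next_parts = []
        for part in parts:
            left, right = inertial_bisect(points, part)
            next_parts.extend([left, right])
        parts = next_parts
    return parts
```

For eight points strung out roughly along the x axis, two levels of bisection give four parts of two neighbouring points each. A parallel version would additionally need to distribute the points and compute the moments and the median across processes, which this sketch does not attempt.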
Parallel Sparse Matrix Linear Solvers:
Developed a home-grown parallel solver suite, including a BiCG solver,
a Cholesky factorization solver, and a hybrid method based on
Schur's complement integrating Cholesky with BiCG. The following
paper describes the solver suite.
A General Purpose Sparse Matrix Parallel Solvers Package,
C.H.Q.Ding and F.D.Ferraro, in Proceedings of 9th IEEE International
Parallel Processing Symposium, pp.70-76. 1995.
(This package and the graph partitioner from the previous paper were used for
a Kalman filter problem in a climate data assimilation package.)
Used the Aztec sparse solver package on a nonlinear flow equation:
Computations for Large Scale Simulations of Subsurface Multiphase Fluid
and Heat Flow. E. Elmroth, C.H.Q. Ding and Y.-S. Wu.
Proc. Supercomputing'99. Nov 1999.
An expanded version to appear in Journal of Supercomputing,
v.18, No.3, 2001.
Parallel Data Organization and I/O on distributed memory computers
Finite Element Analysis. Finite volume method.
Parallel 1D, 2D, 3D FFTs
High Performance Fortran
MPI and OpenMP on clusters of SMPs