Massively parallel computing, using the Message Passing Interface (MPI), has been implemented in a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model. The implementation uses the domain-resemble concept, in which the code structure is the same for the whole domain and for the sub-domains after decomposition. Instead of inserting groups of MPI-related statements throughout the model routines, these statements are packed into a single routine, so that only one call statement appears in the model code; the impact on the original code is therefore minimal, and the model remains easy to modify and manage by model developers and users who have little knowledge of massively parallel computing.
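A minimal sketch of this packing idea, in C with MPI and assuming a one-dimensional east-west decomposition; the routine and variable names here (exchange_halo, model_step, NX, NY) are illustrative, not taken from the GCE source:

#include <mpi.h>

#define NX 64   /* interior points per task in x (assumed size) */
#define NY 64   /* points in y, not decomposed in this sketch   */

/* The one routine that holds every MPI statement: swap one-point-wide
 * halo columns with the west and east neighbor tasks. */
static void exchange_halo(double field[NY][NX + 2], MPI_Comm comm)
{
    int rank, ntasks;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &ntasks);
    int west = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;
    int east = (rank < ntasks - 1) ? rank + 1 : MPI_PROC_NULL;

    MPI_Datatype column;   /* one x-column through all y rows */
    MPI_Type_vector(NY, 1, NX + 2, MPI_DOUBLE, &column);
    MPI_Type_commit(&column);

    /* interior edge goes out, halo column comes in, both directions */
    MPI_Sendrecv(&field[0][1],      1, column, west, 0,
                 &field[0][NX + 1], 1, column, east, 0,
                 comm, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&field[0][NX],     1, column, east, 1,
                 &field[0][0],      1, column, west, 1,
                 comm, MPI_STATUS_IGNORE);
    MPI_Type_free(&column);
}

/* The model routine stays serial in appearance: one call, then
 * the unchanged model code. */
static void model_step(double field[NY][NX + 2], MPI_Comm comm)
{
    exchange_halo(field, comm);   /* the only parallel statement here */
    /* ... original finite-difference update of the interior ... */
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    static double field[NY][NX + 2];
    model_step(field, MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}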
The model decomposition is highly flexible: the entire model domain can be sliced into any number of sub-domains by a one- or two-dimensional decomposition. Data are exchanged through a halo region, also known as a ghost-cell region, which overlaps the neighboring sub-domains. For reproducibility, operations that span the full domain, such as the Fourier transform and full-domain summation, require transposing data among tasks into different decompositions.
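The abstract does not give the transpose code; as a simplified sketch of why redistribution restores reproducibility, the full-domain summation below first collects the distributed field into canonical full-domain order and then sums it in a fixed order, so the result is bit-identical for any number of tasks. A real model would transpose among all tasks (e.g., with MPI_Alltoall) rather than gather to one rank; the names and the equal-slab layout are assumptions.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Sum nloc local values so the result does not depend on the
 * decomposition: gather into full-domain order on rank 0, sum there
 * in a fixed order, then broadcast. Assumes equal-sized slabs
 * assigned to tasks in rank order. */
double reproducible_sum(const double *local, int nloc, MPI_Comm comm)
{
    int rank, ntasks;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &ntasks);

    double *full = NULL;
    if (rank == 0)
        full = malloc((size_t)nloc * ntasks * sizeof *full);

    /* redistribute the field into one full-domain array */
    MPI_Gather(local, nloc, MPI_DOUBLE,
               full,  nloc, MPI_DOUBLE, 0, comm);

    double sum = 0.0;
    if (rank == 0) {
        for (int i = 0; i < nloc * ntasks; i++)   /* fixed summation order */
            sum += full[i];
        free(full);
    }
    MPI_Bcast(&sum, 1, MPI_DOUBLE, 0, comm);
    return sum;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    double local[4];
    for (int i = 0; i < 4; i++)
        local[i] = rank * 4 + i;   /* each task owns one slab */
    double s = reproducible_sum(local, 4, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum = %g\n", s);
    MPI_Finalize();
    return 0;
}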
The well-behaved performance of the implemented code, in both its anelastic and compressible versions, on three different computing platforms indicates a successful implementation. Both versions attain about 99% of the ideal speedup for up to 256 tasks. The anelastic version has better speedup and efficiency because its numerical algorithm is better suited to parallelization than that of the compressible version.
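For reference, using the standard definitions of speedup and parallel efficiency (these formulas are an editorial addition, not from the original), where T(1) is the single-task runtime and T(N) the runtime on N tasks:

\[
  S(N) = \frac{T(1)}{T(N)}, \qquad E(N) = \frac{S(N)}{N},
\]

so a run that retains 99% efficiency on N = 256 tasks corresponds to a speedup of S(256) ≈ 0.99 × 256 ≈ 253.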