Efficient Algorithms and High Performance Computing

The goal of this subproject is to optimize the simulation codes used within ASCETE, focusing on the following three algorithmic challenges:

  • optimization of memory access and communication patterns of element-oriented DG method, as these allow stream-based access to unknowns with data exchange restricted to common faces;
  • dealing with load imbalances expected from adaptive mesh refinement in dynamic rupture simulation, tsunami propagation, as well as at the coupling interfaces;
  • using locality-preserving element orders based on space-filling-curves to improve cache efficiency and memory access for stream-based processing, parallelisation, and dynamic load balancing.

sam(oa)2 – space-filling-curves and adaptive meshes for oceanic (and other) applications

We will integrate the tsunami model developed by the group of Behrens with the space-filling-curve approach developed by Bader et al. – the two approaches are already compatible regarding mesh generation and parallelisation based on Sierpinski space-filling curves. In a first step, we will establish kernel routines for element-oriented processing that can be used in both environments. Our goal is to substantially increase the problem sizes that can be treated for simulation of tsunami propagation.

Locality-preserving storage and processing schemes for SeisSol

SeisSol uses an a-priori generated, unstructured adaptive tetrahedral mesh; an optimization of the data layout can thus be done as an offline process. We will introduce tetrahedral strips (face-connected sequences of tetrahedral elements) created via recursive substructuring of the computational domain, such that we obtain locality-preserving orders similar to space-filling curves. The new data layout will then be utilized for an optimization of SeisSol’s serial performance. We also intended to use the advantages of this data structure for improved parallel load balancing.

Efficient parallelization and load balancing

The varying time step sizes of the DG method in SeisSol lead to strongly varying workload on different processors, as not all elements require processing in each time step. Levelling these load imbalances would lead to a substantial performance gain of SeisSol. We will tackle this problem via fast and dynamic re-balancing of partitions using the locality-preserving orders. The recursive construction of such orders allows for the construction of hierarchical small sub-partitions, which can be quickly and dynamically scheduled to available cores.

In tsunami simulation, the propagating waves form a dynamically changing centre of adaptive refinement, which again leads to strong changes of the computational load during a simulation. Here, efficient parallelisation and load balancing based on Sierpinski space-filling curves will be applied.