Presentation

Hybrid Multi-GPU Distributed Octrees Construction for Massively Parallel Code Coupling Applications
Description
This paper presents two new hybrid MPI-GPU algorithms for building distributed octrees. The first redistributes data between processes to globally sort the points on which the octree is built according to their space-filling curve (SFC) codes. The second takes a bottom-up approach, merging leaves from the maximum depth up to their final level so that each leaf contains no more than Nmax points. This approach is better suited to GPU implementation because it maximises parallelism from the start of the algorithm. Both methods have been implemented in the CWIPI library to reduce the execution time of the point-in-mesh location algorithm, which is performed several times when moving non-coincident meshes are used. Tests on large cases have shown speedups of up to 120x compared to a conventional CPU version, with scaling as good as that of the full CPU version.
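The two ideas in the description can be illustrated with a minimal serial sketch. It is not the CWIPI implementation and does not show the MPI redistribution or the GPU kernels. It assumes Morton (Z-order) codes as the space-filling curve and points in the unit cube. The names `morton_code`, `point_code`, `build_leaves`, `MAX_DEPTH` and `n_max` are hypothetical and chosen for illustration only.

```python
from collections import Counter

MAX_DEPTH = 4  # assumed maximum octree depth for this toy example


def morton_code(ix, iy, iz, depth=MAX_DEPTH):
    """Interleave the bits of integer cell coordinates into a Morton code."""
    code = 0
    for bit in range(depth):
        code |= ((ix >> bit) & 1) << (3 * bit)
        code |= ((iy >> bit) & 1) << (3 * bit + 1)
        code |= ((iz >> bit) & 1) << (3 * bit + 2)
    return code


def point_code(p, depth=MAX_DEPTH):
    """Code of the cell at `depth` containing a point of the unit cube."""
    n = 1 << depth
    ix, iy, iz = (min(int(c * n), n - 1) for c in p)
    return morton_code(ix, iy, iz, depth)


def build_leaves(points, n_max, depth=MAX_DEPTH):
    """Bottom-up merge; returns {(level, code): point_count} for occupied leaves.

    Leaves at the maximum depth may still exceed n_max, since they cannot
    be refined further in this sketch.
    """
    # Point totals per occupied cell at the current level.
    counts = Counter(point_code(p, depth) for p in points)
    # Cells that may still be merged into their parent.
    candidates = set(counts)
    leaves = {}
    for level in range(depth, 0, -1):
        # Dropping the 3 lowest bits of a code gives the parent's code.
        parent_counts = Counter()
        for code, cnt in counts.items():
            parent_counts[code >> 3] += cnt
        next_candidates = set()
        for code in candidates:
            parent = code >> 3
            if parent_counts[parent] <= n_max:
                next_candidates.add(parent)  # siblings collapse into parent
            else:
                leaves[(level, code)] = counts[code]  # final level reached
        counts, candidates = parent_counts, next_candidates
    for code in candidates:  # cells merged all the way up to the root
        leaves[(0, code)] = counts[code]
    return leaves


points = [(0.1, 0.1, 0.1), (0.12, 0.1, 0.1), (0.3, 0.1, 0.1), (0.9, 0.9, 0.9)]
# Serial stand-in for the global sort by SFC code.
sorted_points = sorted(points, key=lambda p: point_code(p, 2))
leaves = build_leaves(points, n_max=2, depth=2)
print(sorted(leaves.items()))
# prints [((1, 7), 1), ((2, 0), 2), ((2, 1), 1)]
```

In this sketch every occupied cell at the maximum depth is processed independently from the first step, which is the property the description gives as the reason the bottom-up approach suits GPUs. A top-down build would instead start from a single root cell.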
Time
Tuesday, June 4, 14:30 - 15:00 CEST
Location
HG E 1.2
Event Type
Paper
Domains
Computational Methods and Applied Mathematics