BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20241120T082410Z
LOCATION:HG E 1.2
DTSTART;TZID=Europe/Stockholm:20240603T170000
DTEND;TZID=Europe/Stockholm:20240603T173000
UID:submissions.pasc-conference.org_PASC24_sess176_pap120@linklings.com
SUMMARY:Enabling Performance Portability for Shallow Water Equations on CP
 Us, GPUs, and FPGAs with SYCL
DESCRIPTION:Paper\n\nMarkus Büttner (University of Bayreuth); Christoph Al
 t (Paderborn University, Friedrich-Alexander-Universität Erlangen-Nürnberg
 ); Tobias Kenter (Paderborn University); Harald Köstler (Friedrich-Alexand
 er-Universität Erlangen-Nürnberg); Christian Plessl (Paderborn University)
 ; and Vadym Aizinger (University of Bayreuth)\n\nIn order to make the best
  use of the diverse hardware architectures in present and future high-perf
 ormance computers, developers and maintainers of scientific simulation cod
 es strive for performance portability. The goal is to reach a good fractio
 n of the hardware-specific practically achievable performance while mainta
 ining a largely unified codebase. In benchmarks and first production codes
 , SYCL has been demonstrated to be a promising programming model for this 
 purpose when targeting different CPUs and GPUs. In this work, we utilize
  SYCL to develop a performance-portable implementation of the 2D shallow
  wate
 r equations, discretized on unstructured triangular meshes using the disco
 ntinuous Galerkin method with polynomial orders zero, one, and two. In add
 ition to GPUs from three and CPUs from two vendors, we also broaden the sc
 ope of target architectures by including Intel Stratix FPGAs with a fundam
 entally different execution model. We show that with a few targeted and en
 capsulated specializations, it is possible to adapt the execution flow to 
 the respective targets. The performance analysis shows how FPGAs complemen
 t the other two architectures with particularly good performance for small
  problem sizes.\n\nDomain: Computational Methods and Applied Mathematics\n
 \nSession Chair: Jamil Gafur (The University of Iowa, National Renewable E
 nergy Laboratory)
END:VEVENT
END:VCALENDAR