BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20241120T082410Z
LOCATION:HG E 1.2
DTSTART;TZID=Europe/Stockholm:20240604T143000
DTEND;TZID=Europe/Stockholm:20240604T150000
UID:submissions.pasc-conference.org_PASC24_sess181_pap123@linklings.com
SUMMARY:Hybrid Multi-GPU Distributed Octrees Construction for Massively Pa
 rallel Code Coupling Applications
DESCRIPTION:Paper\n\nRobin Cazalbou (ONERA); Florent Duchaine (CERFACS); a
 nd Eric Quémerais, Bastien Andrieu, Gabriel Staffelbach, and Bruno Maugars
  (ONERA)\n\nThis paper presents two new hybrid MPI-GPU algorithms for buil
 ding distributed octrees. The first algorithm redistributes data between p
 rocesses and is used to globally sort the points on which the octree is ge
 nerated, according to their SFC codes. The second algorithm proposes a bot
 tom-up approach to merge leaves from the maximum depth to their final leve
 l, ensuring that each leaf contains no more than Nmax points. This method 
 is better suited for GPU implementation because it maximises parallelism f
 rom the beginning of the algorithm. The methods have been implemented in t
 he CWIPI library to reduce the execution time of the point-in-mesh locatio
 n algorithm, which is performed several times when moving non-coincident m
 eshes are used. Tests on large cases have shown speedups of up to 120x com
 pared to a conventional CPU version, with scaling as good as the full CPU 
 version.\n\nDomain: Computational Methods and Applied Mathematics\n\nSessi
 on Chair: Fazeleh Kazemian (Australian National University)
END:VEVENT
END:VCALENDAR
