BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20241120T082409Z
LOCATION:HG E 3
DTSTART;TZID=Europe/Stockholm:20240605T103000
DTEND;TZID=Europe/Stockholm:20240605T110000
UID:submissions.pasc-conference.org_PASC24_sess131_msa234@linklings.com
SUMMARY:Multi-GPU Optimization of a Large-Scale Cortical Model of Human-Li
 ke Gaze Behaviour
DESCRIPTION:Minisymposium\n\nVaishnavi Narayanan (Maastricht University), 
 Samuel Omlin (ETH Zurich / CSCS), and Mario Senden (Maastricht University)
 \n\nWe introduce a large-scale biophysical model for dynamic visual target
 selection, mimicking human gaze behavior, implemented in Julia and optimiz
 ed on multiple GPUs. Our dynamic mean-field model sequentially generates v
 isual targets, accommodating network sizes up to 25600 neural populations,
  with connectivity matrices reaching up to 25600x25600 neural connections
 , totaling over 600 million connections. To achieve human-like behavior, w
 e employ Bayesian optimization for parameter tuning, enabling efficient op
 timization through iterative updates of a probabilistic surrogate model. T
 his enables the model to generate temporally accurate visual targets at re
 levant scene locations. Optimization procedures are executed in parallel o
 n 96 instances of the network via GPU supercomputing, simulating over 60 b
 illion neural connections. One iteration of optimizing the largest model t
 akes 70 seconds using 96 GPUs with 99% parallel efficiency. The implement
 ation relies on Julia, accessing highly optimized vendor libraries for ma
 trix-vector operations and fast Fourier transforms (cuBLAS and cuFFT for 
 NVIDIA GPUs), and utilizing ParallelStencil.jl for stencil compu
 tations. MPI enables distributed memory parallelization without communicat
 ion during function evaluation. We unify the codebase using ParallelStenci
 l.jl to enable both single-CPU prototyping and large-scale GPU or CPU run
 s. This multi-GPU application achieves near-optimal performance and scales
  efficiently to thousands of NVIDIA Tesla P100 GPUs at CSCS.\n\nDomain: Cl
 imate, Weather, and Earth Sciences; Physics; Computational Methods and App
 lied Mathematics\n\nSession Chairs: Samuel Omlin (ETH Zurich / CSCS); Ludo
 vic Räss (University of Lausanne, ETH Zurich); and Michael Schlottke-Lake
 mper (University of Augsburg)
END:VEVENT
END:VCALENDAR
