ACM Posters – Student Research Competition

P12 / ACMP01 – DeCovarT, a Multidimensional Probabilistic Model for the Deconvolution of Heterogeneous Transcriptomic Samples

Although bulk transcriptomic analyses have greatly contributed to a better understanding of complex diseases, their sensitivity is hampered by the highly heterogeneous cellular composition of biological samples. To address this limitation, computational deconvolution methods have been designed to automatically estimate the frequencies of the cellular components that make up tissues, typically using reference samples of physically purified populations. However, they perform poorly at differentiating closely related cell populations. We hypothesized that integrating the covariance matrices of the reference samples could improve the performance of deconvolution algorithms. We therefore developed a new tool, DeCovarT, that integrates the structure of individual cellular transcriptomic networks to reconstruct the bulk profile. Specifically, we inferred the ratios of the mixture components by standard maximum likelihood estimation (MLE), using the Levenberg-Marquardt algorithm to recover the maximum of the parametric convolutional distribution of our model. We then considered a reparametrization of the log-likelihood to explicitly incorporate the simplex constraint on the ratios. Preliminary numerical simulations suggest that this new algorithm outperforms previously published methods, particularly when individual cellular transcriptomic profiles strongly overlap.
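
The two ingredients named above — Levenberg-Marquardt fitting and a simplex-respecting reparametrization — can be illustrated with a minimal sketch. This is not DeCovarT's implementation: the reference matrix, noise model, and softmax parametrization below are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)

# Toy reference profiles: G genes x K cell types (hypothetical data).
G, K = 200, 3
X = rng.gamma(2.0, 1.0, size=(G, K))
p_true = np.array([0.6, 0.3, 0.1])
y = X @ p_true + rng.normal(0, 0.01, size=G)    # noisy bulk profile

def softmax(theta):
    """Map unconstrained theta in R^{K-1} to a point on the K-simplex."""
    z = np.concatenate([theta, [0.0]])          # fix last logit for identifiability
    e = np.exp(z - z.max())
    return e / e.sum()

def residuals(theta):
    return y - X @ softmax(theta)

# Levenberg-Marquardt on the reparametrized problem, as in the abstract.
fit = least_squares(residuals, x0=np.zeros(K - 1), method="lm")
p_hat = softmax(fit.x)
print(np.round(p_hat, 3))   # close to p_true, and sums to 1 by construction
```

The softmax map guarantees that the estimated ratios are non-negative and sum to one at every optimizer step, so no explicit constraint handling is needed.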

Author(s): Bastien Chassagnol (University of Paris VI, ardata), Grégory Nuel (University of Paris VI), and Etienne Becht (INSERM)

ACMP02 – Demographic-Aware Hyperparameter Optimization for Cancer

The preliminary work presented here evaluates the robustness and bias of ML models by probing hyperparameter optimization (HPO) behavior under different demographic conditions, which is critical for the development of clinically usable methods. By examining the hyperparameter distributions of a transformer-based model across a male-female data split, we gain insight into the behavior and transferability of hyperparameters across imbalanced datasets in this area.

Author(s): Rylie Weaver (Argonne National Laboratory, Claremont Graduate University)

P44 / ACMP03 – Scalable Simulations of Resistive Memory Devices: A Dynamical Monte Carlo Approach

Resistive random access memories (ReRAM) are expected to play a prominent role in modern computer architectures due to their low cost, simple structure, and unique functionality. The long-range atomic movements inside these devices, which occur over extended timescales under applied fields, can be accurately described by Dynamical Monte Carlo (DMC) simulations. In DMC, the continuum movements of atoms are discretized into ‘events’ on an atomistic graph, which is time-stepped under the influence of external fields (potential, Joule heating). Parallelization can only occur within each step, rendering such simulations highly sensitive to data movement. Here, we present a scalable DMC code that simultaneously optimizes the different computational kernels found in the field solvers (systems of linear equations, matrix-vector multiplication) and event selection (prefix sums). Our implementation leverages preconditioned sparse iterative solvers, graph-based domain decomposition to divide work between nodes, and hybrid CPU-GPU computations to optimize node usage and data transfer in a distributed environment. The acceleration ultimately enables the first investigation of ReRAM crossbar arrays with atomistic resolution, providing deeper insights into the operating mechanisms of these devices and paving the way for their mainstream adoption in future memory technologies.
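
The event-selection kernel mentioned above picks the next event with probability proportional to its rate, which reduces to a prefix sum followed by a search. A minimal serial sketch of that idea (the abstract's contribution is parallelizing this step; the rates below are made up):

```python
import numpy as np

rng = np.random.default_rng(42)

def select_event(rates, u):
    """Pick index i with probability rates[i] / rates.sum(), via an
    inclusive prefix sum and binary search — the serial analogue of
    the parallel prefix-sum kernel described in the abstract."""
    cum = np.cumsum(rates)            # inclusive prefix sum
    return int(np.searchsorted(cum, u * cum[-1], side="right"))

rates = np.array([0.5, 3.0, 1.0, 0.5])
counts = np.zeros(4, dtype=int)
for _ in range(100_000):
    counts[select_event(rates, rng.random())] += 1
print(counts / counts.sum())   # ≈ rates / rates.sum() = [0.1, 0.6, 0.2, 0.1]
```

In a full DMC/KMC loop, the simulated clock would then advance by an exponentially distributed increment, dt = -ln(u) / rates.sum(), before rates are recomputed under the updated fields.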

Author(s): Alexander Maeder (ETH Zurich), Manasa Kaniselvan (ETH Zurich), Marko Mladenović (ETH Zurich), Mathieu Luisier (ETH Zurich), and Alexandros Nikolaos Ziogas (ETH Zurich)

ACMP04 – Efficient Compression for Weather and Climate Data

Weather and climate simulations produce petabytes of high-resolution data that are later analyzed by researchers to understand climate change or severe weather. We propose an efficient data compression method dedicated to weather and climate data. It consists of two components: a base compressor utilizing the JPEG2000 image codec, and a residual compressor based on sparse wavelets. At compression ratios ranging from 10x to more than 3,000x, our method achieves better accuracy than existing methods. It faithfully preserves important large-scale atmospheric structures and does not introduce significant artifacts. Due to the introduction of the residual compressor, it can achieve error-bounded compression without sacrificing compression ratios.
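
The two-stage structure — a lossy base layer plus a sparse residual that restores a pointwise error bound — can be illustrated with a toy scheme. Coarse quantization stands in for the JPEG2000 base codec here, and the sparse residual is stored uncompressed rather than wavelet-coded; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
field = np.cumsum(rng.normal(size=1000)) * 0.1   # smooth-ish 1D stand-in for an atmospheric field

def compress(x, base_step=0.5, err_bound=0.05):
    """Toy two-stage scheme: a coarse quantizer stands in for the base
    codec; a sparse residual corrects only the points where the base
    error exceeds the requested bound."""
    base = np.round(x / base_step) * base_step      # lossy base layer
    resid = x - base
    idx = np.nonzero(np.abs(resid) > err_bound)[0]  # sparse correction set
    return base, idx, resid[idx]

def decompress(base, idx, vals):
    out = base.copy()
    out[idx] += vals
    return out

base, idx, vals = compress(field)
recon = decompress(base, idx, vals)
print(np.max(np.abs(field - recon)))   # <= 0.05 by construction
```

Uncorrected points err by at most the bound, and corrected points are exact, so the reconstruction error is bounded regardless of how aggressive the base layer is — the mechanism the abstract exploits to keep high ratios while guaranteeing error bounds.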

Author(s): Langwen Huang (ETH Zurich), and Torsten Hoefler (ETH Zurich)

ACMP05 – Efficient, Portable, Massively Parallel Free-Space Solvers for the Poisson Equation

Vico et al. (2016) propose a fast algorithm for computing volume potentials that is of benefit to the beam and plasma physics communities, as they require the solution of Poisson’s equation with free-space boundary conditions. The standard method to solve the free-space Poisson equation is the algorithm presented by Hockney and Eastwood (1988), which is at best second order in convergence. The algorithm proposed by Vico et al., which we refer to as Vico-Greengard, converges spectrally, i.e., faster than any fixed polynomial order in the number of grid points, for smooth enough functions. We implement a performance-portable Poisson solver in the framework of the IPPL (Independent Parallel Particle Layer) library based on these two methods: the traditional Hockney-Eastwood and the novel Vico-Greengard. Furthermore, we suggest an improvement to the Vico-Greengard algorithm which reduces its memory footprint. We show that for sufficiently smooth distribution functions, the Vico-Greengard algorithm could be a good candidate for reducing memory usage, since better accuracy can be obtained with a coarser grid. This is especially significant for GPUs, which have tighter memory constraints. Finally, we showcase performance through scaling studies on the Perlmutter (NERSC) supercomputer, with efficiencies staying above 50% in the strong scaling case.
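
The core of the Hockney-Eastwood approach is a free-space convolution of the charge density with the Green's function, done by FFT on a zero-padded (doubled) grid so that the FFT's implicit periodicity does not wrap around. A 1D sketch of that mechanism (the real solvers are 3D with G = 1/(4πr), and Vico-Greengard replaces G with a truncated kernel defined in Fourier space):

```python
import numpy as np

# 1D Hockney-Eastwood sketch: free-space convolution via FFT on a
# zero-padded, doubled grid, verified against direct O(n^2) summation.
n, h = 64, 1.0 / 64
x = np.arange(n) * h
rho = np.exp(-((x - 0.5) ** 2) / 0.01)          # toy charge density
G = -0.5 * np.abs(np.arange(-n, n) * h)         # 1D free-space Green's function (G'' = delta)

# Zero-pad the density to 2n, multiply transforms, keep the physical window.
rho_pad = np.concatenate([rho, np.zeros(n)])
G_hat = np.fft.fft(np.fft.ifftshift(G))         # kernel samples reordered for circular convolution
phi = np.real(np.fft.ifft(G_hat * np.fft.fft(rho_pad)))[:n] * h

# Reference: direct free-space convolution.
phi_ref = np.array([np.sum(-0.5 * np.abs(xi - x) * rho) * h for xi in x])
print(np.max(np.abs(phi - phi_ref)))   # ~ machine precision
```

The doubled grid is exactly the memory overhead that the abstract's improvement to Vico-Greengard targets: in 3D, padding multiplies the FFT volume by eight.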

Author(s): Sonali Mayani (Paul Scherrer Institute, ETH Zurich), Antoine Cerfon (New York University, Type One Energy), Matthias Frey (University of St Andrews), Veronica Montanaro (ETH Zurich), Sriramkrishnan Muralikrishnan (Forschungszentrum Jülich), and Andreas Adelmann (Paul Scherrer Institute, ETH Zurich)

ACMP06 – Finding Optimistic Upper Bounds for Task Graph Throughput on Heterogeneous Systems Using Linear Programming

In this extended abstract, we present a model — inspired by previous work in the data flow community — for finding optimistic upper bounds on the throughput of task graphs executed on heterogeneous systems. This model interprets the execution of such graphs as flow networks with additional resource constraints. We show that such flow networks can be optimised as linear programs, and we present a Python interface for representing and solving such programs. Finally, we provide some brief examples of how such models can be used to describe the performance of existing task graph applications, and how they can be used to guide the optimisation and development of future software.
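
The flavor of such a bound can be shown with a tiny linear program (this is not the authors' model or interface; tasks, devices, and rates below are hypothetical). Two tasks A and B must each run once per graph instance, time-shared across a CPU and a GPU with different per-task rates; the LP relaxation gives a throughput no real schedule can exceed.

```python
from scipy.optimize import linprog

# Variables x = [T, g, c]: T = throughput (graph instances/s),
# g = GPU time fraction on task A, c = CPU time fraction on task A.
# Hypothetical rates (instances/s at full dedication):
#   task A: gpu 10, cpu 2;   task B: gpu 1, cpu 4.
#
# maximize T  s.t.  T <= 10 g + 2 c              (task A supply)
#                   T <= 1 (1 - g) + 4 (1 - c)   (task B supply)
#                   0 <= g, c <= 1
res = linprog(
    c=[-1, 0, 0],                       # linprog minimizes, so minimize -T
    A_ub=[[1, -10, -2],                 # T - 10g - 2c <= 0
          [1, 1, 4]],                   # T + g + 4c <= 5
    b_ub=[0, 5],
    bounds=[(0, None), (0, 1), (0, 1)],
)
T_bound = res.x[0]
print(round(T_bound, 4))   # 4.5455 ≈ 50/11, better than any pure assignment (max 4.0)
```

The optimum splits the GPU between both tasks (g = 5/11), illustrating why the LP bound is "optimistic": it assumes perfectly fluid time-sharing with no dependency stalls or transfer costs.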

Author(s): Stephen Nicholas Swatman (University of Amsterdam, CERN), and Ana-Lucia Varbanescu (University of Twente)

ACMP07 – High Performance Computing Derived Biological Multiplex Network Uncovers Distinct Pathways Underlying Opioid and Nicotine Addiction

Leveraging High-Performance Computing (HPC) for biological network generation, this work uncovers key insights into the genetic and epigenetic mechanisms underlying opioid and nicotine addiction. Using distributed network generation software on the Frontier supercomputer, the authors processed 700 single-cell RNA sequencing (scRNAseq) data sets to construct biologically robust multiplex networks. By employing an MPI task farm as a scheduling method, the network generation software computed networks in 3 hours of wall-clock time, compared to an average of 76 days of CPU time, using iterative Random Forest (iRF) Leave One Out Prediction (LOOP). Network layers were validated using random walk with restart (RWR) with k-fold cross-validation against independently curated GO terms, and clustered using MENTOR, an algorithm developed for the clustering and visualization of RWR rank order embeddings. The findings highlight transcriptional regulation via transcription factors and epigenetic mechanisms implicated in neural development. This research not only advances our current understanding of nicotine and opioid addiction, but also demonstrates the importance of HPC network generation and validation techniques.

Author(s): Matthew Lane (University of Tennessee, Oak Ridge National Laboratory)

P36 / ACMP08 – Mixed-Precision in High-Order Methods: Studying the Impact of Lower Numerical Precisions on the ADER-DG Algorithm

We study the impact of using mixed and variable numerical precision in the high-order ADER-DG method for solving partial differential equations. We examine the impact of precision on the overall convergence order as well as on specific sections of the code. This lets us judge how sensitive each part of the algorithm is to small losses of precision. We also investigate how numerical precision affects the stability of ADER-DG by simulating two stationary but numerically challenging scenarios, and check whether variable precision can resolve stability issues. Finally, we review the effects of numerical precision on the features of Lagrange interpolation, which is commonly used but susceptible to small changes in the nodal values.
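
The sensitivity of Lagrange interpolation to working precision is easy to demonstrate. The sketch below (illustrative, not the paper's ADER-DG setup) interpolates sin(x) at 12 Chebyshev nodes and evaluates the interpolant in float32 and float64:

```python
import numpy as np

def lagrange_eval(xn, yn, x):
    """Evaluate the Lagrange interpolant through (xn, yn) at points x,
    in whatever floating-point dtype the inputs carry."""
    result = np.zeros_like(x)
    for j in range(len(xn)):
        L = np.ones_like(x)
        for k in range(len(xn)):
            if k != j:
                L *= (x - xn[k]) / (xn[j] - xn[k])
        result += yn[j] * L
    return result

# 12 Chebyshev nodes on [-1, 1]; interpolate sin(x).
n = 12
nodes = np.cos((2 * np.arange(n) + 1) * np.pi / (2 * n))
xs = np.linspace(-1, 1, 201)

errs = {}
for dtype in (np.float32, np.float64):
    xn = nodes.astype(dtype)
    y = lagrange_eval(xn, np.sin(xn), xs.astype(dtype))
    errs[dtype] = np.max(np.abs(y - np.sin(xs)))
print(errs)   # float32 error sits orders of magnitude above float64
```

In float64 the error is dominated by the (tiny) interpolation error of the degree-11 polynomial; in float32 the rounding of nodes and basis products dominates, which is the kind of precision-driven feature loss the abstract investigates.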

Author(s): Marc Marot-Lassauzaie (Technical University of Munich), and Michael Bader (Technical University of Munich)

ACMP09 – On-Line Tracking of Tropical Cyclones in a Climate Model Within a Given Radius Around Its Path

We discuss the implementation of two added functionalities in the ICON weather and climate model: time-dependent output regions for variables stored at grid cell centers, and an on-line version of an existing off-line tracking algorithm for tropical cyclones (TCs).

Author(s): Thibault Meier (ETH Zurich)

ACMP10 – Optimized Finite Volume Methods Solver Allows for Real-Sized Tumor Simulations

Multi-scale agent-based cell simulators pose daunting computational challenges in bioinformatics that are only tractable with supercomputing resources. These simulators consider evolving microenvironmental conditions and cell interactions. By specifying rules at the cell level, researchers can explore complex tissue and organ systems, aiming to create Human Digital Twins (HDTs) for personalized medicine. While significant milestones have been achieved, current systems cannot yet model HDTs: they reach real-sized tissue simulations on the order of 10⁶ cells, whereas organ simulations are on the order of 10¹² cells. PhysiCell is a physics-based multi-scale cell simulator that facilitates the translation of intracellular mechanisms into tissue-level biomedical solutions. An analysis of PhysiCell and its distributed version, PhysiCell-X, reveals the diffusion time step as a critical bottleneck. BioFVM and BioFVM-X, which use Finite Volume Methods, encounter scalability issues when modeling microenvironmental evolution. We therefore present BioFVM-B, a scalable library offering a lightweight data structure and an optimized 3D diffusion-decay solver. BioFVM-B enables simulations of microenvironments large enough to contain real-sized tumors on fewer computing nodes, and its efficient implementation for solving massive sets of tridiagonal equations accelerates the diffusion time step by factors of up to ~200x.
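
After directional operator splitting, implicit diffusion-decay updates of the kind BioFVM performs reduce to many independent tridiagonal systems, each solvable in O(n) by the Thomas algorithm. A minimal sketch of that kernel (illustrative parameters, not BioFVM-B's code), checked against a dense solve:

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system in O(n): a = sub-, b = main-,
    c = super-diagonal, d = right-hand side."""
    n = len(b)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                      # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Implicit 1D diffusion-decay step: (I + dt*(lam*I - D*Laplacian)) u_new = u
n, D, lam, dt, h = 50, 1.0, 0.1, 0.01, 0.1
r = D * dt / h**2
a = np.full(n, -r); b = np.full(n, 1 + 2 * r + lam * dt); c = np.full(n, -r)
a[0] = c[-1] = 0.0                             # no neighbors beyond the ends
u = np.random.default_rng(0).random(n)
x = thomas(a, b, c, u)

A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
print(np.max(np.abs(x - np.linalg.solve(A, u))))   # ~ machine precision
```

In a 3D solver one such sweep runs along every grid line of each axis, which is why batching and vectorizing this kernel is where the reported ~200x speedup has room to come from.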

Author(s): Jose Luis Estragués Muñoz (Barcelona Supercomputing Center), Carlos Álvarez (Barcelona Supercomputing Center), Daniel Jimenez (Barcelona Supercomputing Center), Alfonso Valencia (Barcelona Supercomputing Center, ICREA), and Arnau Montagud (Barcelona Supercomputing Center)

P39 / ACMP11 – Performance Regression Unit Testing for High Performance Computing Packages in Julia

This research focuses on the integration of performance testing into the unit testing phase for High-Performance Computing (HPC) software, emphasizing its importance in ensuring optimal implementations and diagnosing performance regressions. Traditional unit testing assesses functional aspects, while performance testing is often deferred to higher testing levels. The absence of performance testing at the unit level in HPC can lead to significant computational efficiency issues that are challenging to diagnose and address. The project aims to develop a Julia package for a Performance Regression Unit Test Framework, seamlessly integrating with Julia’s ecosystem and specifically designed to be easy to integrate into HPC projects. By employing modern software development practices, the framework seeks to create a user-friendly, interpretable, robust, and efficient performance testing infrastructure. The research’s significance lies in enhancing the reliability and performance of critical HPC software, promoting early detection and mitigation of performance issues, and advocating for best practices in software development for sustainability and maintainability in the Julia environment. Ultimately, the framework aims to bridge the gap between unit and performance testing in the HPC domain, contributing to improved software reliability, interpretability, and performance.
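
The core mechanism — a unit test that fails when a kernel's runtime drifts beyond a tolerance of a stored baseline — can be sketched in a few lines. The sketch below is in Python for illustration only; the proposed package targets Julia, and real frameworks would persist baselines per machine and per commit rather than hard-code them.

```python
import time

def measure(func, repeats=5):
    """Best-of-N runtime in seconds; the minimum filters scheduler noise."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - t0)
    return best

def assert_no_regression(func, baseline_s, tol=1.5):
    """Unit-test-style check: fail if runtime exceeds tol * baseline."""
    runtime = measure(func)
    assert runtime <= tol * baseline_s, (
        f"performance regression: {runtime:.4f}s > {tol} * {baseline_s:.4f}s")

# A fast kernel passes against a generous stored baseline...
assert_no_regression(lambda: sum(range(10_000)), baseline_s=0.05)

# ...while an artificially slowed one is caught.
try:
    assert_no_regression(lambda: time.sleep(0.05), baseline_s=0.001)
    caught = False
except AssertionError:
    caught = True
print(caught)   # True
```

The hard problems the abstract gestures at live around this kernel: choosing noise-robust statistics, managing baselines across heterogeneous HPC nodes, and reporting regressions interpretably.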

Author(s): Daniel Sergio Vega Rodríguez (Università della Svizzera italiana), Samuel Omlin (ETH Zurich / CSCS), Juraj Kardoš (Università della Svizzera italiana), and Olaf Schenk (Università della Svizzera italiana)