Presentation
Optimizing Dataflow Pipelines from Self-Driving Labs to the Cloud
Presenter
Description
The rapid advancements in cloud computing and the integration of experimental facilities, including self-driving labs, have resulted in an era where scientists can generate unprecedented amounts of data and conduct more extensive analyses across various scientific domains, including chemistry, materials sciences, molecular biology, and drug design. This capability enables a broader exploration of natural phenomena but also introduces significant challenges in effectively composing and scaling dataflow pipelines. This talk addresses these challenges by presenting innovative solutions for optimizing dataflow pipelines across cloud resources, thereby enhancing the study and application of scientific dataflows.
This talk covers three main research components of our work on optimizing dataflow pipelines from self-driving labs to the cloud. First, we establish a taxonomy of common dataflow motifs, ranging from simple producer-consumer pairs to complex multi-scale pipelines, and apply these motifs to real-world use cases. Second, we discuss methods to mitigate data loss and pipeline inefficiencies, especially those arising from disparities in moving pipelines traditionally executed on high-performance computing systems to the cloud. Finally, we highlight our efforts to train and build a community of experts, emphasizing the development of tailored data analytics material across scientific domains.
Time
Monday, June 3, 15:30 - 16:00 CEST
Location
HG E 1.1
Event Type
Minisymposium
Chemistry and Materials
Climate, Weather, and Earth Sciences
Engineering
Life Sciences
Physics
Computational Methods and Applied Mathematics