DaFab invests in cloud-computing techniques and public metadata catalogs to provide a unified solution for searching raw Copernicus data and its by-products by features and relationships. By addressing these challenges, DaFab aims to unlock the full potential of Copernicus data and drive growth in the European EO data market.
Recent News and Events
Harnessing the Power of Rucio for the DaFab Project: A Leap Towards Advanced Metadata Management
Introduction
In the realm of both scientific research and production environments, efficiently managing and utilizing metadata is crucial. Metadata serves as the backbone for data discovery, organization, and retrieval, enabling effective data usage across various fields. This is particularly important in areas like Earth Observation (EO), where vast amounts of satellite data need to be processed and analysed to monitor and understand our planet.
The DaFab project aims to enhance the exploitation of Copernicus data through advanced AI and High-Performance Computing (HPC) technologies. By integrating these technologies, DaFab seeks to improve the timeliness, accuracy, and accessibility of EO data. At the heart of this endeavour lies Rucio, a robust data management system developed at CERN. Rucio plays a pivotal role in achieving key objectives of the project: creating a unified, searchable catalogue of interlinked EO metadata, improving metadata ingestion and retrieval speeds, and facilitating seamless integration with AI-driven workflows and HPC systems.
Orchestration of Workflows in Converged Cloud and HPC Environments
Multi-Site Workflow Orchestration in the DaFab Project
DaFab will design and implement a workflow orchestration system that enhances multi-site application deployment and data discovery. The workflow system will enable applications to express their computations as a graph and to declare the data needed at each step in a high-level query language. Workflows will then execute across multiple sites, whether cloud-based (Kubernetes) or high-performance computing (Slurm) environments. By enabling transparent data access and seamless execution of workflow stages, this system shifts the burden of application deployment and data discovery to the platform itself, significantly accelerating development timelines.
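To make the idea of a workflow expressed as a graph more concrete, the following is a minimal illustrative sketch in Python. It is not the project's actual orchestration system or query language: the Stage class, the site labels, the stage names, and the query strings are all hypothetical, chosen only to show how stages, their data declarations, and their dependencies could be described and ordered.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Stage:
    """One workflow step (hypothetical structure, for illustration only)."""
    name: str
    site: str        # assumed site label, e.g. "kubernetes" or "slurm"
    data_query: str  # assumed placeholder for a high-level data query
    depends_on: list[str] = field(default_factory=list)


def execution_order(stages: list[Stage]) -> list[str]:
    """Return stage names in an order that respects their dependencies."""
    graph = {stage.name: set(stage.depends_on) for stage in stages}
    return list(TopologicalSorter(graph).static_order())


# A three-stage chain spanning a cloud site and an HPC site.
stages = [
    Stage("ingest", "kubernetes", "product_type = 'raw'"),
    Stage("detect", "slurm", "derived_from = 'ingest'", ["ingest"]),
    Stage("publish", "kubernetes", "derived_from = 'detect'", ["detect"]),
]

order = execution_order(stages)
for name in order:
    stage = next(s for s in stages if s.name == name)
    print(f"{stage.name}: run on {stage.site}, needs data where {stage.data_query}")
```

In this sketch the application only declares what each stage needs and where it should run; resolving the queries to actual datasets and dispatching the stages to each site would be the platform's job.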
Job offer at CERN
CERN is offering a position in its Rucio development team. Rucio is an open-source scientific data management system responsible for managing the data of some of the biggest scientific data producers in the world. Experiments such as ATLAS, CMS, Belle II, DUNE, and many others rely on Rucio, which manages worldwide distributed data in the exabyte range.
More information at: https://jobs.smartrecruiters.com/CERN/743999972065373-software-engineer-in-distributed-systems-ep-adp-co-2024-34-grap