Exploring reference data through existing computing services for the bioinformatics community / Y. Sanaa, M. Belhaj Salem, G. Mathieu, C. Blanchet, P. Mandreoli, M. Tangaro, G. Donvito, N. Foggetti, M. Antonacci, L. Burlot, J. Lorenzo, D. Colombo, F. Zambelli, D. Salgado, C. Béroud. (Contribution presented at the EGI Conference 2022, held in Prague in 2022.)

Exploring reference data through existing computing services for the bioinformatics community

P. Mandreoli; F. Zambelli
2022

Abstract

Galaxy is a widely adopted workflow management system for bioinformatics that aims to make computational biology accessible to research scientists who have no computer programming or systems administration experience. How can scientists connect this useful, reproducibility-oriented tool seamlessly with many data sources? How can they do so coherently across different instances of Galaxy? Can they run it locally or on a secured infrastructure that handles patient data? Can they compare the results of those different scenarios? The poster presents the work done as part of one of the EOSC-Pillar project’s scientific use-cases to address those questions, achieving the following objectives:
● Give all EOSC users access to reference data from different Galaxy deployments.
● Facilitate the deployment of Galaxy instances in the same infrastructure that hosts the data to be analysed.
● Provide coherency in the deployment of different Galaxy instances.
● Ensure that security requirements for sensitive (e.g., health) data are met throughout the process.
The poster describes four scientific scenarios based on concrete needs from the ELIXIR community. It also describes the technical services the use-case relies on, namely: Laniakea (Galaxy as a service provided by IBIOM-CNR and INFN), the Inserm data repository, the IFB cloud Galaxy instances and the INDIGO-IAM authentication service provided by INFN. It demonstrates the value of EOSC-Pillar’s Federated Data Space (F2DS) for connecting different data sources to Galaxy in a simple and coherent way. The poster also highlights the need to conform to data protection regulations concerning personal health data, by deploying Galaxy in a private, secured environment while ensuring that the data analysis workflow remains similar to its public counterpart. Finally, it shows proposed solutions for giving all users in the EOSC community access to the service, through role management and integration into a global authentication framework.
19-Sep-2022
Sector BIO/11 - Molecular Biology
https://indico.egi.eu/event/5882/contributions/16706/
Conference Object
Files in this item:
EOSC-Pillar.pdf (open access)
Description: Poster
Type: Other
Size: 3.66 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/983148