Development of a state of the art computational environment for handling human genetic data : the effort of ELIXIR-IT / C. Lo Giudice, F. Licciulli, G. Miniello, M. Moscatelli, S.N. Cox, A.S. Varvara, B. Fosso, M.A. Tangaro, R. Cilli, D. Traversa, G. Donvito, E. Capriotti, M. Chiara, F. Zambelli, G. Pesole. (Paper presented at the BITS Annual Meeting, held in Trento in 2024.)

Development of a state of the art computational environment for handling human genetic data : the effort of ELIXIR-IT

D. Traversa; M. Chiara; F. Zambelli
2024

Abstract

Motivation: As nucleic acid sequencing technologies become increasingly accessible, their applications are expanding beyond research into other domains, including healthcare. Personalized medicine and pharmacogenomics hold the promise of revolutionizing medical treatments for a wide range of pathologies, such as cancer and genetic diseases. However, to fully exploit this potential, numerous technical, legal, and ethical challenges must be addressed. Demand for efficient solutions for the secure handling of human genetic data is very high and calls for ready-to-use, cost-effective services, which can be efficiently provided by public infrastructures such as ELIXIR-IT, the Italian node of the European Research Infrastructure for Life Science Data. Here, we describe the architecture of a VM-based service integrated into a broader computational environment designed for managing human genetic data from production to deposition in access-controlled repositories, to be based in the ReCaS datacenter in Bari, Italy.

Methods: Data are transferred via the SSH protocol to a secure storage facility, which we named BioRepository, that provides data-at-rest encryption and geo-redundant storage. A virtualized computational environment is deployed through a cloud infrastructure and offers state-of-the-art bioinformatics tools for data analysis. Software tools are accessible through package managers and/or containerized for compatibility, reproducibility, and ease of updates. Workflow management systems streamline the analysis process. IT automation engines facilitate software installation, customization, and maintenance. Upon completion of the analysis, users gain access to downstream services for data FAIRification, deposition, and discoverability. These services will include a federated node of the European Genome-phenome Archive (FEGA) for metadata browsing and data access control. Additionally, a service based on the Beacon protocol enables the discoverability of datasets hosted by the FEGA node.

Results: This integrated approach represents a significant step forward for the infrastructure that manages human genetic data in Italy, providing a resource-efficient, easily maintainable, and scalable solution tailored for both research and healthcare applications. By combining secure data transfer mechanisms, state-of-the-art storage facilities, and a versatile computational environment, this system ensures the efficient handling of genetic data while upholding high standards of security and accessibility.
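
As an illustration of how dataset discovery through the Beacon service might look from a client's side, the following minimal Python sketch builds a variant-presence query URL. It is not taken from the abstract: the base URL and the function name are hypothetical, the coordinates are arbitrary placeholders, and the parameter names assume the Beacon v2 genomic variation query (the abstract does not state which Beacon version is deployed). No network request is made.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical address: the abstract does not give the URL of the Beacon service.
BEACON_BASE_URL = "https://beacon.example.org/api"


def build_variant_query(reference_name: str, start: int,
                        reference_bases: str, alternate_bases: str,
                        assembly_id: str = "GRCh38") -> str:
    """Return a Beacon v2 style GET URL asking whether a variant is present.

    Parameter names follow the Beacon v2 genomic variation query as assumed
    here; a real deployment may expose a different path or require auth.
    """
    params = {
        "assemblyId": assembly_id,
        "referenceName": reference_name,
        "start": start,  # Beacon coordinates are 0-based
        "referenceBases": reference_bases,
        "alternateBases": alternate_bases,
    }
    return f"{BEACON_BASE_URL}/g_variants?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder coordinates, chosen only to show the shape of the query.
    url = build_variant_query("1", 1000, "G", "A")
    print(url)
    # Round-trip the query string to show which parameters were sent.
    print(parse_qs(urlsplit(url).query))
```

A Beacon answers such a query with a presence/absence style response rather than with the underlying records, which is why it suits discovery of access-controlled datasets such as those hosted by a FEGA node.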
13-Jun-2024
human genetic data; infrastructure
Academic discipline BIO/11 - Molecular Biology
https://bioinformatics.it/script/abs.pl?ID=840
Conference Object
Files in this record:
File: claudio-lo_giudice-abstract-840.pdf
Access: open access
Type: Publisher's version/PDF
Size: 43.57 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1084588