Porting the rCASC workflow for scRNA-Seq data analysis to Galaxy and the Laniakea Galaxy on-demand system / P. Mandreoli, L. Alessandrì, M.A. Tangaro, R. Calogero, F. Zambelli. (Paper presented at the Bioinformatics Community Conference, held online in 2020.)
Porting the rCASC workflow for scRNA-Seq data analysis to Galaxy and the Laniakea Galaxy on-demand system
P. Mandreoli; F. Zambelli
2020
Abstract
rCASC [1] is a workflow for scRNA-Seq data analysis that provides an integrated analysis environment and exploits Docker containerization to achieve both functional and computational reproducibility of the data analysis process. rCASC's modular architecture consists of 39 Docker images, each tailored to perform a specific function, e.g., quality control, clustering, or feature selection. While this Docker-based implementation ensures a reliable framework for long-term reproducibility, rCASC is currently available only as stand-alone software with a custom GUI or as a command-line tool. To improve its availability and accessibility, rCASC is being ported to Galaxy, so that end-users can automatically download, deploy, and run rCASC within the familiar Galaxy environment over the cloud. This operation is non-trivial because the internal architecture of rCASC is composed of highly interconnected, Galaxy-unfriendly, containerized modules. The porting requires, for example, several tweaks to rCASC's input/output functions, as well as specific configurations for Galaxy and the Docker engine. We have therefore identified a set of rules for rCASC/Galaxy integration that could be easily expanded and applied to the more general task of porting or designing containerized tools for cloud-based Galaxy instances. Following these integration rules, the first rCASC workflow (Fig. 1) has been successfully ported to Galaxy, and we are now working on migrating the others. The final aim is to make rCASC available also as a Galaxy flavour in the Laniakea [3] Galaxy on-demand system, providing a way to effortlessly deploy a Galaxy instance pre-loaded with rCASC tools. Ultimately, this will result in a more user-friendly instrument for reproducibility-oriented scRNA-Seq data analysis, one that can also be seamlessly supported by the cloud resources offered by any Laniakea-based service.

File | Access | Type | Size | Format
---|---|---|---|---
BCC2020_abstract_57.pdf | Open access | Pre-print (manuscript submitted to the publisher) | 117.91 kB | Adobe PDF