DEVELOPMENT OF A BIOINFORMATIC SUITE TO IMPROVE THE CLINICAL IMPLEMENTATION OF NEXT GENERATION SEQUENCING IN PRECISION MEDICINE FOR ONCOLOGY / G. Vozza ; added-supervisor: A. Magi ; internal advisor: S. Gandini ; tutor: L. Mazzarella ; co-tutor: P. G. Pelicci ; PhD coordinator: S. Minucci. Dipartimento di Oncologia ed Emato-Oncologia, 2023. 35th cycle, academic year 2023.

DEVELOPMENT OF A BIOINFORMATIC SUITE TO IMPROVE THE CLINICAL IMPLEMENTATION OF NEXT GENERATION SEQUENCING IN PRECISION MEDICINE FOR ONCOLOGY

G. Vozza
2023

Abstract

The identification of clinically relevant variants has become increasingly data-driven, and many projects have been launched to promote NGS-based molecular profiling, particularly to define diagnostic, prognostic, and therapeutic pathways for cancer patients. It would be particularly useful to map the country-level prevalence of mutations in cancer predisposition genes (CPGs), in both affected individuals and their relatives, to assess the heritability of specific mutations that predispose to solid tumors such as breast, ovarian, and colon cancers. In addition, comprehensive profiling of patients' tumor genomes is needed to detect variants that may indicate a potential response to new drug treatments. Achieving this requires action within a structured network of the main oncology institutions across the country, through collaborative efforts such as ACC, together with a centralized system for the storage, analysis, and interpretation of large volumes of data. To support centralized analysis and interpretation, we developed multiple bioinformatic tools to: i) perform rigorous quality control (QC) on the reads used in the clinical setting and detect coverage biases in clinically relevant regions; ii) benchmark and standardize variant calling pipelines through software that simplifies fine-tuning and reduces the time required to curate potential false negatives (FNs); this software should also harmonize, precisely and reproducibly, the variant notations found in VCF files from the major variant callers; iii) interpret and report variants according to ACMG guidelines (2015 for germline variants and 2017 for somatic variants) and develop ML-based algorithms to assess the pathogenicity of variants of uncertain significance (VUS).
This work resulted in four bioinformatic tools: i) Covdetect, to detect coverage biases in relevant genomic regions, such as those where variants of interest (VOIs) lie; ii) RecallME, to harmonize different variant notation formats and to benchmark variant calling (VC) pipelines against a set of expected variants; iii) MolProBoard, to visualize and interpret reported VOIs; iv) RENOVO, to assess the pathogenicity of VUS. These tools were then bundled into a single suite for bringing NGS-derived data into clinical settings, as presented in this thesis.
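To make the idea of harmonizing variant notations and benchmarking a pipeline against expected variants more concrete, the following is a minimal Python sketch. It is not RecallME's implementation, which this record does not describe. The names `Variant`, `normalize`, and `benchmark` are hypothetical. The sketch assumes that trimming bases shared by the reference and alternate alleles is enough to make two notations comparable, whereas full left-alignment of indels would also require the reference genome sequence.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a RecallME data structure)."""
    chrom: str
    pos: int   # 1-based position of the first base of ref
    ref: str
    alt: str


def normalize(v: Variant) -> Variant:
    """Trim bases shared by ref and alt so equivalent notations compare equal."""
    # Assumption: "chr17" and "17" denote the same contig.
    chrom = v.chrom[3:] if v.chrom.lower().startswith("chr") else v.chrom
    pos, ref, alt = v.pos, v.ref.upper(), v.alt.upper()
    # Trim shared trailing bases, keeping at least one base in each allele.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Trim shared leading bases, shifting the position to match.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return Variant(chrom, pos, ref, alt)


def benchmark(expected: Iterable[Variant],
              called: Iterable[Variant]) -> Tuple[float, List[Variant]]:
    """Return recall and the expected variants the caller missed (candidate FNs)."""
    expected_set = {normalize(v) for v in expected}
    called_set = {normalize(v) for v in called}
    missed = sorted(expected_set - called_set)
    found = len(expected_set & called_set)
    recall = found / len(expected_set) if expected_set else 0.0
    return recall, missed


# Illustrative, made-up coordinates: two notations of the same SNV and deletion.
expected = [Variant("chr17", 100, "A", "G"),
            Variant("chr13", 200, "CTT", "CT"),
            Variant("chr2", 300, "G", "T")]
called = [Variant("17", 99, "CA", "CG"),
          Variant("13", 200, "CT", "C")]

recall, missed = benchmark(expected, called)
print(f"recall={recall:.2f}, candidate false negatives={missed}")
```

Returning the missed variants alongside recall reflects the curation step the abstract mentions: the list of candidate FNs is what a curator would review when fine-tuning a pipeline.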
12 Dec 2023
Sector MED/04 - General Pathology
Keywords: variant calling; pipeline benchmarking; RecallME; bioinformatics; oncology; precision medicine; machine learning; breast cancer; clinical trials; biostatistics
Tutor: MAZZARELLA, LUCA
PhD coordinator: MINUCCI, SAVERIO
Doctoral Thesis
Files in this item:
File: phd_unimi_R12747.pdf
Access: embargoed until 20/05/2025
Description: Doctoral thesis
Type: Other
Size: 17.61 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1018328