Intrinsic-Dimension Analysis for Guiding Dimensionality Reduction in Multi-Omics Data / V. Guarino, J. Gliozzo, F. Clarelli, B. Pignolet, K. Misra, E. Mascia, G. Antonino, S. Santoro, L. Ferré, M. Cannizzaro, M. Sorosina, R. Liblau, M. Filippi, E. Mosca, F. Esposito, G. Valentini, E. Casiraghi - In: Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies. 3: Bioinformatics / edited by H. Ali, N. Deng, A. Fred, H. Gamboa. - [s.l] : Scitepress, 2023. - ISBN 978-989-758-631-6. - pp. 243-251 (Paper presented at the 16th International Joint Conference on Biomedical Engineering Systems and Technologies, held in Lisbon in 2023) [10.5220/0011775200003414].

Intrinsic-Dimension Analysis for Guiding Dimensionality Reduction in Multi-Omics Data

J. Gliozzo (second author); F. Esposito; G. Valentini (penultimate author); E. Casiraghi (last author)
2023

Abstract

Multi-omics data are of paramount importance in biomedicine, providing a comprehensive view of the processes underlying disease. They are high-dimensional and hence affected by the so-called "curse of dimensionality", which ultimately leads to unreliable estimates. This calls for effective Dimensionality Reduction (DR) techniques that embed the high-dimensional data into a lower-dimensional space. Although effective DR methods have been proposed, the high dimension of the initial dataset often requires unsupervised Feature Selection (FS) techniques to be applied beforehand. Unfortunately, both unsupervised FS and DR techniques require the dimension of the lower-dimensional space to be provided. This is a crucial choice, for which no well-accepted solution has yet been defined. The Intrinsic Dimension (ID) of a dataset is defined as the minimum number of dimensions that allow representing the data without information loss. Therefore, the ID of a dataset is related to its informativeness and complexity. In this paper, after proposing a blocking ID estimation that leverages state-of-the-art (SOTA) ID estimation methods, we present our DR pipeline, whose subsequent FS and DR steps are guided by the ID estimate.
Dimensionality Reduction; Intrinsic Dimensionality; Feature Selection; Feature Clustering; Omics Datasets
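
To make the notion of intrinsic dimension in the abstract concrete, the following is a minimal sketch of one possible ID estimator. It is not the blocking estimation proposed in the paper, whose details are not given in this record. It assumes the maximum-likelihood form of the TwoNN estimator, which uses only the ratio between each point's second and first nearest-neighbour distances. The function name `twonn_id` and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def twonn_id(X):
    """Estimate the intrinsic dimension of the rows of X (hypothetical helper).

    Assumption: maximum-likelihood variant of the TwoNN estimator,
    ID = N / sum(log(r2 / r1)), where r1 and r2 are each point's distances
    to its first and second nearest neighbours.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances via the Gram matrix.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)   # clip small negative rounding errors
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    # After partitioning at index 1, columns 0 and 1 hold the two smallest values.
    nearest = np.partition(d2, 1, axis=1)[:, :2]
    r1 = np.sqrt(nearest[:, 0])
    r2 = np.sqrt(nearest[:, 1])
    keep = r1 > 0                 # drop exact duplicates (ratio undefined)
    mu = r2[keep] / r1[keep]
    return keep.sum() / np.sum(np.log(mu))


if __name__ == "__main__":
    # Synthetic data: a 2-dimensional sheet linearly embedded in 10 dimensions.
    rng = np.random.default_rng(0)
    latent = rng.uniform(size=(1000, 2))
    embedded = latent @ rng.normal(size=(2, 10))
    print("estimated ID:", twonn_id(embedded))
```

On data like this, the estimate should be close to the number of latent dimensions rather than the number of observed features, which is the property that would let an ID estimate suggest the target dimension for FS and DR steps.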
Sector INF/01 - Computer Science
2023
Book Part (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/957102