Intrinsic-dimension analysis for guiding dimensionality reduction and data fusion in multi-omics data processing / J. Gliozzo, M. Soto-Gomez, V. Guarino, A. Bonometti, A. Cabri, E. Cavalleri, J. Reese, P.N. Robinson, M. Mesiti, G. Valentini, E. Casiraghi. - In: ARTIFICIAL INTELLIGENCE IN MEDICINE. - ISSN 0933-3657. - 160:(2025 Feb), pp. 103049.1-103049.13. [10.1016/j.artmed.2024.103049]

Intrinsic-dimension analysis for guiding dimensionality reduction and data fusion in multi-omics data processing

J. Gliozzo (co-first author); M. Soto-Gomez (co-first author); A. Cabri; E. Cavalleri; M. Mesiti; G. Valentini (penultimate author); E. Casiraghi (last author)
2025

Abstract

Multi-omics data have revolutionized biomedical research by providing a comprehensive view of biological systems and of the molecular mechanisms underlying disease development. However, analyzing multi-omics data is challenging because of their high dimensionality and limited sample sizes, which call for proper data-reduction pipelines to ensure reliable analyses. In addition, their multimodal nature requires effective data-integration pipelines. Although several dimensionality reduction and data fusion algorithms have been proposed, crucial aspects are often overlooked. Specifically, the dimension of the projection space is typically chosen heuristically and applied uniformly across all omics, neglecting the distinct high-dimension, small-sample-size challenges faced by each individual omics view. This paper introduces a novel multimodal dimensionality reduction pipeline tailored to individual views. Using intrinsic dimensionality estimators, we assess the impact of the curse of dimensionality on each view and propose a two-step reduction strategy for significantly affected views, which combines feature selection with feature extraction. Compared with traditional uniform reduction pipelines in a crucial supervised multi-omics analysis setting, our approach shows significant improvement. We also explore three effective unsupervised multi-omics data fusion methods, rooted in the main data fusion strategies, to gain insight into their performance under crucial, yet overlooked, settings.
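The per-view idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the estimator (a maximum-likelihood form of TwoNN), the features-per-sample criterion used to flag a view as strongly affected (`ratio_threshold`), the variance-based selection step, the PCA extraction step, and all function names, view names, and default values are assumptions chosen only for illustration.

```python
import numpy as np


def twonn_intrinsic_dim(X):
    """Maximum-likelihood TwoNN estimate of intrinsic dimension (rows = samples)."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances via the Gram matrix, clipped at zero.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # exclude self-distances
    nearest = np.sqrt(np.sort(d2, axis=1)[:, :2])
    r1, r2 = nearest[:, 0], nearest[:, 1]
    keep = r1 > 0  # drop duplicated points
    mu = r2[keep] / r1[keep]  # ratio of 2nd to 1st neighbour distance
    return keep.sum() / np.sum(np.log(mu))


def two_step_reduce(X, target_dim, n_selected, ratio_threshold=10.0):
    """Optional feature selection, then feature extraction with PCA."""
    n_samples, n_features = X.shape
    # Hypothetical criterion: many more features than samples.
    if n_features / n_samples > ratio_threshold:
        # Step 1 (selection): keep the highest-variance features.
        top = np.argsort(X.var(axis=0))[::-1][:n_selected]
        X = X[:, top]
    # Step 2 (extraction): project onto the leading principal components.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(target_dim, Vt.shape[0])
    return Xc @ Vt[:k].T


def reduce_views(views, n_selected=100):
    """Reduce each view to a dimension set by its own intrinsic-dimension estimate."""
    reduced = {}
    for name, X in views.items():
        k = max(1, int(np.ceil(twonn_intrinsic_dim(X))))
        reduced[name] = two_step_reduce(X, k, n_selected)
    return reduced


# Toy data: two synthetic views of the same 60 samples (names are illustrative).
rng = np.random.default_rng(0)
n = 60
views = {
    "expression": rng.normal(size=(n, 3)) @ rng.normal(size=(3, 2000)),
    "methylation": rng.normal(size=(n, 5)) @ rng.normal(size=(5, 40)),
}
for name, Z in reduce_views(views).items():
    print(name, Z.shape)
```

In this sketch each view gets its own target dimension instead of one value shared across all omics, and only the view with far more features than samples goes through the selection step before extraction.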
Data fusion; Dimensionality reduction; Feature extraction; Feature selection; Intrinsic dimensionality; Multi-omics datasets
Sector INFO-01/A - Informatica (Computer Science)
   Adaptive AI methods for Digital Health (AIDH)
   AIDH
   POLITECNICO DI MILANO

   European Learning and Intelligent Systems Excellence (ELISE)
   ELISE
   EUROPEAN COMMISSION
   H2020
   951847
Feb-2025
11-Dec-2024
Article (author)
Files in this item:
File: AIIM_1_s2.0_S0933365724002914_main.pdf
Access: open access
Description: Article
Type: Publisher's version/PDF
Size: 2.29 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1124235