Statistical jump model for mixed-type data with missing data imputation / F. Cortese, A. Pievatolo. - In: ADVANCES IN DATA ANALYSIS AND CLASSIFICATION. - ISSN 1862-5347. - (2025), pp. 1-25. [Epub ahead of print] [10.1007/s11634-025-00631-y]

Statistical jump model for mixed-type data with missing data imputation

F. Cortese (first author); 2025

Abstract

In this paper, we address the challenge of clustering mixed-type data with temporal evolution by introducing the statistical jump model for mixed-type data. This novel framework incorporates regime persistence, enhancing interpretability and reducing the frequency of state switches, and efficiently handles missing data. The model is easily interpretable through its state-conditional medians and modes, making it accessible to practitioners and policymakers. We validate our approach through extensive simulation studies and an empirical application to air quality data, demonstrating its superiority in inferring persistent air quality regimes compared to the traditional air quality index. Our contributions include a robust method for mixed-type temporal clustering, effective missing data management, and practical insights for environmental monitoring.
Keywords: Environmental monitoring; Missing data; Mixed-type data; Regime-switching models; Unsupervised learning
Sector STAT-01/A - Statistics
2025
25 Mar 2025
Article (author)
Files in this record:
File: unpaywall-bitstream-1608246011.pdf
Access: open access
Type: Publisher's version/PDF
License: Creative Commons
Size: 1.62 MB
Format: Adobe PDF
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1179142