Graphical and computational tools to guide parameter choice for the cluster weighted robust model / A. Cappozzo, L.A. García-Escudero, F. Greselin, A. Mayo-Iscar. - In: JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS. - ISSN 1061-8600. - 32:3(2023 Jul 03), pp. 1195-1214. [10.1080/10618600.2022.2154218]

Graphical and computational tools to guide parameter choice for the cluster weighted robust model

A. Cappozzo (first author); 2023

Abstract

The Cluster Weighted Robust Model (CWRM) is a recently introduced methodology for robustly estimating mixtures of regressions with random covariates. The CWRM allows users to perform regression clustering flexibly while safeguarding against data contamination and spurious solutions. Nonetheless, the resulting solution depends on four choices: the number of mixture components, the percentage of impartial trimming, the degree of heteroscedasticity of the errors around the regression lines, and the degree of heteroscedasticity of the clusters in the explanatory variables. Appropriate model selection is therefore crucial. Such a complex modeling task may generate several “legitimate” solutions, each derived from a distinct hyper-parameter specification. The present paper introduces a two-step monitoring procedure to help users explore this vast model space effectively. The first step identifies the most appropriate trimming percentages, whilst the second explores the whole set of solutions, conditioning on the outcome of the first. The final output singles out a set of “top” solutions, whose optimality, stability and validity are assessed. Novel graphical and computational tools, specifically tailored to the CWRM framework, help the user make an educated choice among the optimal solutions. Three examples on real datasets showcase the proposal in action. Supplementary files for this article are available online.
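
To make the four hyper-parameters named in the abstract concrete, the trimmed and constrained likelihood that the CWRM is usually described as maximizing can be sketched as follows. This is a sketch under stated assumptions, not a formula taken from this record: the symbols $G$ (number of components), $\alpha$ (trimming proportion), $c_X$ and $c_y$ (constraint constants), and the indicator $z(\cdot)$ are notation assumed here for illustration, and the full paper should be consulted for the authors' exact formulation.

```latex
% Assumed notation: n observations (x_i, y_i), with x_i in R^d and y_i in R.
% z(x_i, y_i) is a 0-1 indicator that equals 0 for trimmed observations.
\begin{align*}
  \max \;& \sum_{i=1}^{n} z(\mathbf{x}_i, y_i)\,
    \log\!\left[ \sum_{g=1}^{G} \pi_g\,
      \phi\!\left(y_i;\, \mathbf{b}_g^{\top}\mathbf{x}_i + b_g^{0},\, \sigma_g^{2}\right)
      \phi_d\!\left(\mathbf{x}_i;\, \boldsymbol{\mu}_g,\, \boldsymbol{\Sigma}_g\right)
    \right] \\
  \text{subject to}\;& \sum_{i=1}^{n} z(\mathbf{x}_i, y_i) = \lfloor n(1-\alpha) \rfloor, \\
  & \frac{\max_{g,l}\, \lambda_l(\boldsymbol{\Sigma}_g)}{\min_{g,l}\, \lambda_l(\boldsymbol{\Sigma}_g)} \le c_X,
    \qquad
    \frac{\max_{g}\, \sigma_g^{2}}{\min_{g}\, \sigma_g^{2}} \le c_y .
\end{align*}
```

Read this way, $G$ is the number of mixture components, $\alpha$ the percentage of impartial trimming, $c_y$ bounds the heteroscedasticity of the regression errors, and $c_X$ bounds the heteroscedasticity of the clusters in the explanatory variables through the eigenvalues $\lambda_l(\boldsymbol{\Sigma}_g)$ of the covariance matrices. Setting $c_X = c_y = 1$ forces equal scatter across components, while larger values allow increasingly unequal scatter.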
Keywords: Cluster-weighted modeling; Outliers; Trimmed BIC; Eigenvalue constraint; Monitoring; Model-based clustering; Robust estimation
Sector SECS-S/01 - Statistics
3 Jul 2023
9 Jan 2023
Article (author)
Files in this record:

Graphical and Computational Tools to Guide Parameter Choice for the Cluster Weighted Robust Model.pdf
Access: restricted
Type: Publisher's version/PDF
Size: 13.64 MB
Format: Adobe PDF

Graphical and Computational Tools to Guide Parameter Choice for the Cluster Weighted Robust Model_compressed.pdf
Access: restricted
Type: Publisher's version/PDF
Size: 2.8 MB
Format: Adobe PDF

TO_PUBLISH_JCGS-CappozzoGreselinMayoIscarGarciaEscudero.pdf
Access: open
Type: Pre-print (manuscript submitted to the publisher)
Size: 9.59 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1030201
Citations
  • PMC: not available
  • Scopus: 1
  • ISI: 1