Semiparametric multinomial mixed-effects models: a University student profiling tool / C. Masci, F. Ieva, A.M. Paganoni. - In: THE ANNALS OF APPLIED STATISTICS. - ISSN 1932-6157. - 16:3(2022 Sep), pp. 1608-1632. [10.1214/21-AOAS1559]

Semiparametric multinomial mixed-effects models: a University student profiling tool

C. Masci (first author)
2022

Abstract

Many applied studies deal with multinomial responses and hierarchical data, and in multilevel multinomial regression it is often also of interest to cluster the units at the highest level of grouping. In this study we analyse Politecnico di Milano data with the aim of profiling students: we model their probabilities of belonging to different categories while accounting for their nesting within engineering degree programmes. In particular, we are interested in clustering degree programmes according to their effects on different types of student career. To this end, we propose an EM algorithm for fitting semiparametric mixed-effects models with a multinomial response. The novel semiparametric approach assumes that the random effects follow a multivariate discrete distribution with an a priori unknown number of support points, which is allowed to differ across response categories. The advantage of this modelling choice is twofold. First, the discrete distribution of the random effects allows the marginal density to be expressed as a weighted sum, avoiding the numerical problems in the integration step that are typical of the parametric approach. Second, it identifies a latent structure at the highest level of the hierarchy, where groups are clustered into subpopulations.
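The "weighted sum" form of the marginal density mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the simplification to a single joint set of support points (one random-intercept vector per support point, rather than a number of support points that differs across response categories) are all assumptions made here for illustration. With discrete random effects, the marginal likelihood of one group is a finite sum over support points, so no numerical integration is needed, and the normalised terms of that sum give the group's posterior probabilities of belonging to each latent subpopulation.

```python
import numpy as np

def category_probs(X, beta, c):
    """Multinomial-logit probabilities with category 0 as the reference.

    X: (n, p) covariates, beta: (K-1, p) fixed effects,
    c: (K-1,) random intercepts at one support point. Returns (n, K).
    """
    eta = X @ beta.T + c                                  # (n, K-1) linear predictors
    eta = np.hstack([np.zeros((X.shape[0], 1)), eta])     # reference category has eta = 0
    eta -= eta.max(axis=1, keepdims=True)                 # stabilise the softmax
    expo = np.exp(eta)
    return expo / expo.sum(axis=1, keepdims=True)

def group_marginal(X, y, beta, support, weights):
    """Marginal likelihood of one group as a weighted sum over support points.

    support: (M, K-1) hypothetical support points, weights: (M,) summing to 1.
    Returns the marginal likelihood and the posterior membership probabilities.
    """
    log_lik = np.empty(len(weights))
    for m, c in enumerate(support):
        probs = category_probs(X, beta, c)
        # log-likelihood of the observed categories given support point m
        log_lik[m] = np.log(probs[np.arange(len(y)), y]).sum()
    log_terms = np.log(weights) + log_lik
    shift = log_terms.max()                               # log-sum-exp for stability
    scaled = np.exp(log_terms - shift)
    marginal = np.exp(shift) * scaled.sum()
    posterior = scaled / scaled.sum()
    return marginal, posterior

# Toy data (entirely made up): one group of 30 students, 3 response categories.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = rng.integers(0, 3, size=30)
beta = np.array([[0.5, -0.2], [0.1, 0.3]])
support = np.array([[-1.0, 0.0], [1.0, 0.5]])
weights = np.array([0.4, 0.6])

marginal, posterior = group_marginal(X, y, beta, support, weights)
print(marginal, posterior)
```

In an EM scheme of this kind, the posterior membership probabilities would play the role of the E-step quantities, and assigning each group to its most probable support point yields the clustering of the highest-level units.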
higher education; multinomial mixed-effects regression; semiparametric statistics; unsupervised clustering
Sector STAT-01/A - Statistics
Sep 2022
Article (author)

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1148346