Smooth Chain Graph Model of type II: a learning procedure / F. Nicolussi. - In: CHAOTIC MODELING AND SIMULATION. - ISSN 2241-0503. - 2020:3(2020 Jul), pp. 175-184.

Smooth Chain Graph Model of type II: a learning procedure

Abstract

Chain Graph Models (CGs) are a widely used tool for describing the conditional independence relationships among a set of variables. One of their advantages is that both undirected and directed arcs can be used to link the vertices representing the variables in the graph. There are four ways to read off the conditional independencies from a chain graph, each differing in how the missing (un)directed arcs are interpreted (see Drton, 2009). Different problems can be addressed with different CGs; however, it is often unclear which type of CG best describes the multivariate system of relationships underlying the selected variables. In this work, we propose a learning algorithm, based on a Monte Carlo procedure, that considers the systems of independencies underlying all four types of CG and selects the type and the graph that optimize a score function. When dealing with categorical variables, we take advantage of marginal models (Bergsma and Rudas, 2002) to parametrize the joint and marginal probability distributions of the variables. Unfortunately, Bergsma and Rudas (2002) showed that particular combinations of conditional independencies do not admit a smooth parametrization. Nicolussi and Colombi (2013, 2017) provide the conditions under which a CG of any type admits a smooth parametrization. In the learning procedure we consider only smooth CGs, that is, those that admit a smooth parametrization. This approach is applied to study poverty status and, in particular, how it is affected by a group of selected variables. We took advantage of the cross-sectional data sets of Hungarian households. The analysis highlighted a strong effect of the considered social variables on poverty status.
Sector SECS-S/01 - Statistics
July 2020
Article (author)
Use this identifier to cite or link to this document: `https://hdl.handle.net/2434/786138`