Goodness of fit based on the Lorenz curves : a proposal / P. Giudici, E. Raffinetti. - In: ASMDA 2011 : Applied stochastic models and data analysis conference [Optical disc]. - Pisa : ETS, 2011. - ISBN 9788846730459. (Paper presented at the 14th Applied Stochastic Models and Data Analysis conference, held in Rome, June 7-10, 2011.)

Goodness of fit based on the Lorenz curves : a proposal

E. Raffinetti (last author)
2011

Abstract

In this paper we introduce a novel multivariate concordance index for studying the dependence between a response variable and a number of explanatory variables. The index is built from two statistical tools: the concordance curve and the Lorenz curves. Suppose we have a k-variate random vector $(Y, X_1, \ldots, X_{k-1})$ and describe the relationship between the response variable $Y$ and the explanatory variables $X_1, \ldots, X_{k-1}$ through the multiple linear regression model. More precisely, suppose that $Y$ takes non-negative values. The approach is intended for the case in which the most relevant explanatory variables are categorical and are coded with non-negative values representing their assigned labels.

Once the Lorenz curve of the response variable and its dual (obtained by ordering the response values in decreasing order) have been built, one constructs the concordance curve, defined as the set of ordered pairs

$$\left( \frac{i}{n},\ \frac{1}{n M_Y} \sum_{j=1}^{i} y^*_j \right), \qquad i = 1, \ldots, n,$$

where $M_Y$ is the mean of $Y$ and the $y^*_i$ are the values of $Y$ ordered according to the ranks assigned to their respective estimates $\hat{y}_i$. We denote the concordance curve by $C(Y \mid r(\hat{y}_i))$; it lies between the Lorenz curve of $Y$ and its dual. A multivariate concordance index can then be defined as

$$C_{Y,X_1,X_2,\ldots,X_{k-1}} = \frac{\sum_{i=1}^{n-1} \left\{ \frac{i}{n} - \frac{1}{n M_Y} \sum_{j=1}^{i} y^*_j \right\}}{\sum_{i=1}^{n-1} \left\{ \frac{i}{n} - \frac{1}{n M_Y} \sum_{j=1}^{i} y_{(j)} \right\}},$$

where $y_{(j)}$, $j = 1, \ldots, i$, denotes the $j$-th value of $Y$ in increasing order. The index is simple to compute, has some interesting properties and can be compared with known alternatives, such as the Plotnick index.

It is particularly useful as a measure of fit when the relevant explanatory variables are categorical, because it is based on the response values ordered according to the ranks of their estimated values rather than on the Euclidean distance. If the index is positive, there is a linear dependence relation between $Y$ and $X_1, \ldots, X_{k-1}$: the higher the value, the better the fit. If the index is negative, there is no linear dependence relation between $Y$ and $X_1, \ldots, X_{k-1}$, meaning that the estimated linear regression model is not appropriate for the data. A similar construction of the index can be given when the response variable is ordinal.
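The index described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `concordance_index` and the helper `gap` are hypothetical, the fitted values `y_hat` are assumed to be supplied by whatever regression model was estimated, and ties among the fitted values are simply broken by their original order, which is an assumption the abstract does not address.

```python
from itertools import accumulate


def concordance_index(y, y_hat):
    """Sketch of the multivariate concordance index from the abstract.

    y     : non-negative response values
    y_hat : fitted values from a regression model (assumed given)
    Ties in y_hat are broken by original order (an assumption).
    """
    n = len(y)
    if n != len(y_hat) or n < 2:
        raise ValueError("y and y_hat must have the same length (at least 2)")
    if min(y) < 0:
        raise ValueError("the response must be non-negative")
    total = sum(y)  # equals n * M_Y
    if total == 0:
        raise ValueError("the response must have a positive mean")

    # y*: response values reordered by the ranks of their estimates
    order = sorted(range(n), key=lambda i: y_hat[i])
    y_star = [y[i] for i in order]
    # y_(j): response values in increasing order (Lorenz curve ordering)
    y_sorted = sorted(y)

    def gap(values):
        # sum over i = 1..n-1 of  i/n - (1/(n*M_Y)) * sum_{j<=i} values_j
        cum = list(accumulate(values))
        return sum((i + 1) / n - cum[i] / total for i in range(n - 1))

    denominator = gap(y_sorted)
    if abs(denominator) < 1e-12:
        raise ValueError("index undefined: all response values are equal")
    return gap(y_star) / denominator
```

Under these assumptions, fitted values that order the responses exactly as the responses themselves give a value of 1 (the concordance curve coincides with the Lorenz curve), while fitted values that reverse the order give -1 (the curve coincides with the dual Lorenz curve).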
Concordance curve; multivariate concordance index; Lorenz curves
Sector SECS-S/01 - Statistics
2011
Book Part (author)
Files in this item:
ASMDA_2011_2.pdf (restricted access)
Type: Post-print, accepted manuscript, etc. (version accepted by the publisher)
Size: 160.83 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/171908