IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

The significance effects problem for administrative data: a novel statistical approach / E. Raffinetti, I. Romeo. (Paper presented at the 2nd SMTDA 2012 conference: Stochastic Modeling Techniques and Data Analysis, held in Chania in 2012.)

The significance effects problem for administrative data: a novel statistical approach

E. Raffinetti (first author); I. Romeo

2012

Abstract

The last decade has seen rapid ICT development, which has led administrative agencies to collect ever more data and has considerably improved the quality of those data. On the one hand, administrative data are directly available, inexpensive, and typically cover large populations. On the other hand, because they are collected for administrative purposes, they raise problems of accuracy and completeness. Many statistical and computational tools have been developed to study such complex, high-dimensional data sets, whose size defies simplistic analysis. As is well known in the statistical literature, a very large number of statistical units can bias significance results, since with very large samples even negligible effects tend to appear statistically significant. We propose a statistical method for handling large administrative data sets, based on size reduction obtained through a specific sampling procedure. To validate our method, we compare the statistical analysis of the original data set with that of the sampled one. The data at our disposal are provided by Invalsi (National Committee for the Evaluation of the Italian Education Systems). The data set is novel in that it contains information on students' characteristics and Maths performance in all lower-secondary schools of the Lombardy region. The illustrative application investigates the relationships between Maths scores and both individual and school factors. Given the hierarchical structure of the data, a multilevel model has been built.
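The significance problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' sampling procedure, which this record does not describe. It only shows, under assumed values, how the p-value of a two-sample z-test for a fixed, practically negligible mean difference shrinks as the number of units grows. The effect size, the standard deviation, and the function names are hypothetical.

```python
import math


def two_sided_p_value(effect: float, sd: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test for an observed mean difference.

    Assumes two groups of equal size and a known, common standard deviation.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = effect / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


EFFECT = 0.02  # hypothetical, practically negligible difference in mean score
SD = 1.0       # hypothetical common standard deviation


def p_values_by_size(sizes):
    """Map each group size to the p-value obtained for the same fixed effect."""
    return {n: two_sided_p_value(EFFECT, SD, n) for n in sizes}


if __name__ == "__main__":
    for n, p in p_values_by_size([100, 10_000, 1_000_000]).items():
        print(f"n per group = {n:>9,}: p = {p:.4g}")
```

With these assumed values, the same difference is far from significant at 100 units per group and overwhelmingly significant at one million, which is the kind of distortion that motivates reducing the size of the data set before testing.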

Presentation date: 2012
Keywords: Administrative data; Large data-sets; Sampling; Significance; Multilevel modeling
Scientific-disciplinary sectors of the contribution (display only): Sector SECS-S/01 - Statistics
Citation: The significance effects problem for administrative data: a novel statistical approach / E. Raffinetti, I. Romeo. (Paper presented at the 2nd SMTDA 2012 conference: Stochastic Modeling Techniques and Data Analysis, held in Chania in 2012.)
Type: Conference Object
Appears in types: 14 - Unpublished conference contribution

Files in this item:

File	Size	Format
SMTDA_2012.pdf (restricted access; type: Publisher's version/PDF)	158.87 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/219299
