Merging datasets of CyberSecurity incidents for fun and insight

Abbiati, G.; Ranise, S.; Schizzerotto, A.; Siena, A.

doi:10.3389/fdata.2020.521132

To assess their cyber-security posture adequately, companies and organisations must collect information about threats from a wide range of sources. One such source is history, understood as knowledge of past cyber-security incidents: their size, the type of attack, the industry sector affected, and so on. Ideally, given a large enough dataset of past security incidents, one could analyze it with automated tools and draw conclusions that may help prevent future incidents. Unfortunately, only a few publicly available datasets of this kind appear to be of good quality. This paper reports our initial efforts to collect all publicly available datasets of security incidents and to build a single, large dataset from which statistically significant observations can be drawn. To argue for its statistical quality, we analyze the resulting combined dataset against the original ones. Additionally, we analyze the combined dataset and compare our results with the existing literature. Finally, we present our findings, discuss the limitations of the proposed approach, and point out interesting research directions.

Merging datasets of CyberSecurity incidents for fun and insight / G. Abbiati, S. Ranise, A. Schizzerotto, A. Siena. - In: FRONTIERS IN BIG DATA. - ISSN 2624-909X. - 3(2021 Jan), pp. 521132.1-521132.13.

Abbiati, G.; Ranise, Silvio; Schizzerotto, Antonio; Siena, Alberto

2021

Keywords: cyber security; data analysis; security incidents statistics; methodological framework; data breaches
Scientific-disciplinary sector of the article: INF/01 - Computer Science
Publication date: Jan-2021
Journal in ANCE: FRONTIERS IN BIG DATA
DOI: https://dx.doi.org/10.3389/fdata.2020.521132
Type: Article (author)
Appears in types: 01 - Journal article

Files in this item:

File	Size	Format
2021 Abbiati et al Frontiers.pdf (open access; type: Publisher's version/PDF)	1.17 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/811122

IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca