The rise of the Big Data age has made traditional solutions for data processing and analysis unsuitable because of their high computational complexity. To address this problem, novel techniques specifically designed to analyse Big Data have recently been presented. In this direction, when such a large amount of data arrives in a streaming manner, a sequential mechanism for Big Data analysis is required. In this paper we target the modelling of high-dimensional datastreams through hidden Markov models (HMMs) and introduce an HMM-based solution, named h-HMM, suitable for datastreams characterized by high dimensions. The proposed h-HMM relies on a suitably defined clustering algorithm (operating in the space of the datastream dimensions) to create clusters of highly uncorrelated dimensions of the datastreams (as required by the theory of HMMs) and on a two-layer hierarchy of HMMs modelling the datastreams of such clusters. Experimental results on both synthetic and real-world data confirm the advantages of the proposed solution.
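The abstract does not spell out the clustering step, so the following is only a minimal sketch of the general idea of grouping datastream dimensions so that the dimensions inside each cluster are weakly correlated. The greedy assignment rule, the function name `cluster_uncorrelated_dimensions` and the threshold `max_abs_corr` are illustrative assumptions and not the authors' algorithm; the two-layer hierarchy of HMMs is not covered here.

```python
import numpy as np


def cluster_uncorrelated_dimensions(X, max_abs_corr=0.3):
    """Greedily group the columns (dimensions) of X.

    A dimension joins the first cluster in which its absolute Pearson
    correlation with every current member is below max_abs_corr;
    otherwise it starts a new cluster. Both the rule and the threshold
    are hypothetical choices made for this sketch.
    """
    # Absolute correlation between every pair of dimensions (columns).
    corr = np.abs(np.corrcoef(X, rowvar=False))
    clusters = []
    for d in range(X.shape[1]):
        for members in clusters:
            if all(corr[d, m] < max_abs_corr for m in members):
                members.append(d)
                break
        else:
            # No existing cluster accepts this dimension.
            clusters.append([d])
    return clusters


# Synthetic window of a 4-dimensional datastream: columns 0 and 1 are
# near-copies of one signal, columns 2 and 3 near-copies of another.
rng = np.random.default_rng(0)
base = rng.normal(size=(2000, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.05 * rng.normal(size=2000),
    base[:, 1],
    base[:, 1] + 0.05 * rng.normal(size=2000),
])

clusters = cluster_uncorrelated_dimensions(X)
print(clusters)
```

On this synthetic input the strongly correlated pairs are split across clusters, so each cluster holds only weakly correlated dimensions. In a setting like the one the abstract describes, each such cluster could then be modelled by its own HMM.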
Designing HMMs in the age of big data / C. Alippi, S. Ntalampiras, M. Roveri (ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING). - In: Advances in Big Data / edited by P. Angelov, Y. Manolopoulos, L. Iliadis, A. Roy, M. Vellasco. - [s.l] : Springer Verlag, 2017 Oct 08. - ISBN 9783319478975. - pp. 120-130 (Paper presented at the 2nd International Neural Network Society Conference on Big Data, held in Thessaloniki in 2016) [10.1007/978-3-319-47898-2_13].
Designing HMMs in the age of big data
S. Ntalampiras; M. Roveri
2017