Designing HMMs in the age of big data / C. Alippi, S. Ntalampiras, M. Roveri (ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING). - In: Advances in Big Data / [edited by] P. Angelov, Y. Manolopoulos, L. Iliadis, A. Roy, M. Vellasco. - [s.l] : Springer Verlag, 2017 Oct 08. - ISBN 9783319478975. - pp. 120-130 (Paper presented at the 2nd International Neural Network Society Conference on Big Data, held in Thessaloniki in 2016) [10.1007/978-3-319-47898-2_13].

Designing HMMs in the age of big data

C. Alippi; S. Ntalampiras; M. Roveri
2017

Abstract

The rise of the Big Data age has made traditional solutions for data processing and analysis unsuitable because of their high computational complexity. To address this problem, novel techniques specifically designed to analyse Big Data have recently been presented. In this context, when such a large amount of data arrives in a streaming manner, a sequential mechanism for Big Data analysis is required. In this paper we target the modelling of high-dimensional datastreams through hidden Markov models (HMMs) and introduce an HMM-based solution, named h-HMM, suited to such datastreams. The proposed h-HMM relies on a suitably defined clustering algorithm (operating in the space of the datastream dimensions) to create clusters of highly uncorrelated dimensions of the datastreams (as required by the theory of HMMs) and on a two-layer hierarchy of HMMs modelling the datastreams of such clusters. Experimental results on both synthetic and real-world data confirm the advantages of the proposed solution.
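To make the clustering idea in the abstract concrete, the following is a minimal sketch of one way to group datastream dimensions so that the dimensions inside each cluster are only weakly correlated. It is not the clustering algorithm of the paper, which this record does not describe. The greedy rule, the function names `pearson` and `group_uncorrelated`, and the `threshold` value are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant dimension counts as uncorrelated
    return cov / sqrt(vx * vy)


def group_uncorrelated(dims, threshold=0.3):
    """Greedily group dimension indices so that every pair inside a
    cluster has |correlation| below `threshold`.

    `dims` is a list of dimensions, each a list of samples.
    Hypothetical rule for illustration, not the authors' algorithm.
    """
    clusters = []
    for i, series in enumerate(dims):
        for cluster in clusters:
            # Join the first cluster whose members are all weakly
            # correlated with this dimension.
            if all(abs(pearson(series, dims[j])) < threshold for j in cluster):
                cluster.append(i)
                break
        else:
            # No compatible cluster found: start a new one.
            clusters.append([i])
    return clusters
```

In the scheme the abstract outlines, each resulting cluster would then be modelled by its own HMM, with a second layer of HMMs on top; that part is not sketched here.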
Cluster Algorithm; Cluster Setting; State Transition Probability Matrix; Diagonal Covariance Matrix; Uncorrelated Feature
Academic discipline INF/01 - Computer Science
8 Oct 2017
Artificial Intelligence Journal (Elsevier)
Book Part (author)
Files in this item:
File: Alippi2017_Chapter_DesigningHMMsInTheAgeOfBigData.pdf
Access: restricted
Type: Publisher's version/PDF
Size: 335.34 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/615144
Citations
  • Scopus: 2