SABINE: A multi-purpose dataset of semantically-annotated social content / S. Castano, A. Ferrara, E. Gallinucci, M. Golfarelli, S. Montanelli, L. Mosca, S. Rizzi, C. Vaccari (LECTURE NOTES IN COMPUTER SCIENCE). - In: The Semantic Web / edited by D. Vrandečić, K. Bontcheva, M.C. Suárez-Figueroa, V. Presutti, I. Celino, M. Sabou, L.-A. Kaffee, E. Simperl. - [s.l.] : Springer Verlag, 2018. - ISBN 9783030006679. - pp. 70-85 (Paper presented at the 17th International Semantic Web Conference, held in Monterey in 2018) [10.1007/978-3-030-00668-6_5].

SABINE: A multi-purpose dataset of semantically-annotated social content

S. Castano; A. Ferrara; S. Montanelli; L. Mosca
2018

Abstract

Social Business Intelligence (SBI) is the discipline that combines corporate data with social content to let decision makers analyze the trends perceived from the environment. SBI poses research challenges in several areas, such as IR, data mining, and NLP; unfortunately, SBI research is often restrained by the lack of publicly available, real-world data for experimenting with approaches, and by the difficulty of determining a ground truth. To fill this gap we present SABINE, a modular dataset in the domain of European politics. SABINE includes 6 million bilingual clips crawled from 50 000 web sources, each associated with metadata and sentiment scores; an ontology with 400 topics, their occurrences in the clips, and their mapping to DBpedia; and two multidimensional cubes for analyzing and aggregating sentiment and semantic occurrences. We also propose a set of research challenges that can be addressed using SABINE; remarkably, the presence of an expert-validated ground truth makes it possible to test approaches to the whole SBI process as well as to each single task.
Dataset; Sentiment analysis; Social technologies; Text analysis; Theoretical Computer Science; Computer Science (all)
Sector INF/01 - Computer Science
Sector ING-INF/05 - Information Processing Systems
Book Part (author)
Files in this record:
Castano2018_Chapter_SABINEAMulti-purposeDatasetOfS.pdf
Access: restricted
Type: Publisher's version/PDF
Size: 1.46 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/2434/599556