Temporal graph learning for dynamic link prediction with text in online social networks / M. Dileo, M. Zignani, S. Gaito. - In: MACHINE LEARNING. - ISSN 0885-6125. - (2023), pp. 1-20. [Epub ahead of print] [10.1007/s10994-023-06475-x]
Temporal graph learning for dynamic link prediction with text in online social networks
M. Dileo; M. Zignani; S. Gaito
2023
Abstract
Link prediction in Online Social Networks (OSNs) has been the focus of numerous studies in the machine learning community. A successful machine learning-based solution for this task needs to (i) leverage global and local properties of the graph structure surrounding links; (ii) leverage the content produced by OSN users; and (iii) allow their representations to change over time, as thousands of new links between users and new content such as textual posts, comments, images and videos are created or uploaded every month. Existing work has successfully leveraged structural information, but only a few studies have also taken into account the textual content and/or the dynamicity of network structure and node attributes. In this paper, we propose a methodology based on temporal graph neural networks to handle the challenges described above. To understand the impact of textual content on this task, we provide a novel pipeline that includes textual information alongside structural information, using BERT language models, dense preprocessing layers, and an effective post-processing decoder. We conducted the evaluation on a novel dataset gathered from an emerging blockchain-based online social network, using a live-update setting that takes into account the evolving nature of data and models. The dataset serves as a useful testing ground for link prediction evaluation because it provides high-resolution temporal information on link creation and textual content, characteristics that are hard to find in current benchmark datasets. Our results show that temporal graph learning is a promising solution for dynamic link prediction with text. Indeed, combining textual features and dynamic Graph Neural Networks (GNNs) leads to the best performance over time. On average, the textual content can enhance the performance of a dynamic GNN by 3.1% and, as the collection of documents increases in size over time, help even models that do not consider the structural information of the network.

File | Access | Type | Size | Format
---|---|---|---|---
s10994-023-06475-x.pdf | Open access | Publisher's version/PDF | 1.11 MB | Adobe PDF