Degree-Normalization Improves Random-Walk-Based Embedding Accuracy in PPI Graphs

Cappelletti, L.; Taverni, S.; Fontana, T.; Joachimiak, M.P.; Reese, J.; Robinson, P.; Casiraghi, E.; Valentini, G.

doi:10.1007/978-3-031-34960-7_26

Among the many proposed solutions in graph embedding, traditional random-walk-based embedding methods have shown promise in several fields. However, when the graph contains high-degree nodes, random walks tend to step through those nodes and often neglect low- and middle-degree ones. As a result, random-walk samples provide a very accurate topological representation of the neighbourhoods surrounding high-degree nodes, but only a coarse-grained representation of the neighbourhoods surrounding middle- and low-degree nodes. This in turn affects the performance of the subsequent predictive models, which tend to overfit high-degree nodes and/or edges that have a high-degree node as an endpoint. We propose a solution to this problem that relies on a degree normalization approach. Experiments with popular RW-based embedding methods applied to edge prediction problems involving eight protein-protein interaction (PPI) graphs from the STRING database show the effectiveness of the proposed approach: degree normalization not only improves predictions but also provides more stable results, suggesting that our proposal has a regularization effect leading to a more robust convergence.
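The degree bias described in the abstract can be illustrated on a toy graph. The sketch below is not the authors' implementation: the abstract does not spell out the normalization, so the code assumes one plausible reading, in which each transition weight is divided by the degree of the destination node. All function names (`build_adjacency`, `transition_probs`, `visit_distribution`) and the toy edge list are hypothetical and introduced only for illustration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: sorted list of neighbours}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {node: sorted(neigh) for node, neigh in adj.items()}


def transition_probs(adj, normalize_by_degree=False):
    """Return first-order random-walk transition probabilities.

    With normalize_by_degree=True, the weight of stepping to neighbour v is
    divided by deg(v). This is an ASSUMED form of degree normalization,
    chosen for illustration only.
    """
    probs = {}
    for u, neigh in adj.items():
        if normalize_by_degree:
            weights = [1.0 / len(adj[v]) for v in neigh]
        else:
            weights = [1.0] * len(neigh)
        total = sum(weights)
        probs[u] = {v: w / total for v, w in zip(neigh, weights)}
    return probs


def visit_distribution(probs, steps=200):
    """Propagate a uniform start distribution for `steps` walk steps."""
    nodes = sorted(probs)
    dist = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(steps):
        nxt = {n: 0.0 for n in nodes}
        for u, mass in dist.items():
            for v, p in probs[u].items():
                nxt[v] += mass * p
        dist = nxt
    return dist


# Toy graph: hub node 0 linked to six degree-2 nodes arranged in three triangles.
EDGES = [(0, i) for i in range(1, 7)] + [(1, 2), (3, 4), (5, 6)]

if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    plain = visit_distribution(transition_probs(adj))
    normalized = visit_distribution(transition_probs(adj, normalize_by_degree=True))
    print("hub visit share, plain walk:      ", round(plain[0], 3))
    print("hub visit share, normalized walk: ", round(normalized[0], 3))
```

On this toy graph the hub's long-run visit share drops from 1/3 under the plain walk to 1/5 under the assumed normalization, while each low-degree node rises from 1/9 to 2/15. This is the kind of rebalancing of walk samples that the abstract argues for.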

Degree-Normalization Improves Random-Walk-Based Embedding Accuracy in PPI Graphs / L. Cappelletti, S. Taverni, T. Fontana, M.P. Joachimiak, J. Reese, P. Robinson, E. Casiraghi, G. Valentini (LECTURE NOTES IN COMPUTER SCIENCE). - In: Bioinformatics and Biomedical Engineering / edited by I. Rojas, O. Valenzuela, F. Rojas Ruiz, L. J. Herrera, F. Ortuño. - Cham : Springer, 2023. - ISBN 978-3-031-34959-1. - pp. 372-383 (Paper presented at the 10th IWBBIO conference: International Work-Conference on Bioinformatics and Biomedical Engineering, held in Meloneras, July 12–14, 2023) [10.1007/978-3-031-34960-7_26].

Cappelletti, Luca; Taverni, Stefano; Fontana, Tommaso; Joachimiak, Marcin P.; Reese, Justin; Robinson, Peter; Casiraghi, E.; Valentini, G.

2023

Scientific-disciplinary sectors of the contribution (display only): Sector INF/01 - Informatica (Computer Science)
Publication date: 2023
DOI: https://dx.doi.org/10.1007/978-3-031-34960-7_26
Type: Book Part (author)
Appears in types: 03 - Contribution in a volume

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/981708

IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca