"Motivation" - In network biology and medicine several problems can be modeled as node label inference in partially labeled networks. Nodes are biomedical entities (e.g. genes, patients) and connections represent a notion of functional similarity between entities. Usually, the class being predicted is represented through a labeling vector highly unbalanced towards negatives: that is only few positive instances (those associated with the class) are available. This fosters the adoption of imbalance­aware methodologies to accurately predict node labels. In addition, input data can be large­sized, since we may have millions of instances (e.g. in multi­species protein networks), thus requiring the design of efficient and scalable methodologies. To address these problems, a parametric neural algorithm based on the Hopfield model, COSNet [1,2,3], has been proposed, leveraging the minimization of a Hopfield network energy through the usual sequential dynamics to achieve an asymptotically stable attractor representing a valuable prediction. In this study, we propose a sparse and partially parallel implementation of COSNet, for sparse networks, which decomposes the input net in independent sets of neurons, each processed concurrently by hardware accelerators, like modern GPUs, while still keeping the overall dynamics sequential. "Methods" - The Hopfield dynamics is decomposed in independent tasks by solving the graph coloring problem, that is assigning colors to the graph vertices so that adjacent vertices receive different colors. Thus, the units of the neural network are split into clusters of independent neurons, which are sequentially updated, whereas the single units within each cluster are updated simultaneously. We simulate the algorithm on GPUs achieving a significant speed up with respect to the original sequential implementation and, at the same time, lowering memory requirements thanks to compressed memorization strategies, thus opening the possibility to face with prediction issues on big size instances. Also, a cooperative CPU multithreading – GPU model have been implemented, where the computations over different functional classes are carried independently by assigning each class to a different CPU thread. "Results" - We tested both COSNet and COSNet­GPU on partially labeled networks containing genes belonging to D. melanogaster and Homo sapiens organisms for predicting respectively the Gene Ontology (GO) and the Human Phenotype Ontology (HPO) terms with 10­50 annotated genes. The algorithm behavior has been measured in terms of execution time and memory consumption. Table 1 summarizes the results in term of speed­up and memory usage, when performing a 3­fold cross validation procedure. The results show significant reductions in both execution times and memory consumption, and interestingly the improvement factors increases more than linearly with the number of nodes/genes. This also corroborates the fact that the proposed implementation nicely scales on big data.

Speeding up node label learning in unbalanced biomolecular networks through a parallel and sparse GPU­based Hopfield model / A. Petrini, M. Notaro, J. Gliozzo, G. Valentini, G. Grossi, M. Frasca. ((Intervento presentato al 14. convegno Annual Meeting of the Bioinformatics Italian Society tenutosi a Cagliari nel 2017.

Speeding up node label learning in unbalanced biomolecular networks through a parallel and sparse GPU-based Hopfield model

A. Petrini (first author); M. Notaro; J. Gliozzo; G. Valentini; G. Grossi; M. Frasca
2017

Abstract

"Motivation" - In network biology and medicine several problems can be modeled as node label inference in partially labeled networks. Nodes are biomedical entities (e.g. genes, patients) and connections represent a notion of functional similarity between entities. Usually, the class being predicted is represented through a labeling vector highly unbalanced towards negatives: that is only few positive instances (those associated with the class) are available. This fosters the adoption of imbalance­aware methodologies to accurately predict node labels. In addition, input data can be large­sized, since we may have millions of instances (e.g. in multi­species protein networks), thus requiring the design of efficient and scalable methodologies. To address these problems, a parametric neural algorithm based on the Hopfield model, COSNet [1,2,3], has been proposed, leveraging the minimization of a Hopfield network energy through the usual sequential dynamics to achieve an asymptotically stable attractor representing a valuable prediction. In this study, we propose a sparse and partially parallel implementation of COSNet, for sparse networks, which decomposes the input net in independent sets of neurons, each processed concurrently by hardware accelerators, like modern GPUs, while still keeping the overall dynamics sequential. "Methods" - The Hopfield dynamics is decomposed in independent tasks by solving the graph coloring problem, that is assigning colors to the graph vertices so that adjacent vertices receive different colors. Thus, the units of the neural network are split into clusters of independent neurons, which are sequentially updated, whereas the single units within each cluster are updated simultaneously. We simulate the algorithm on GPUs achieving a significant speed up with respect to the original sequential implementation and, at the same time, lowering memory requirements thanks to compressed memorization strategies, thus opening the possibility to face with prediction issues on big size instances. Also, a cooperative CPU multithreading – GPU model have been implemented, where the computations over different functional classes are carried independently by assigning each class to a different CPU thread. "Results" - We tested both COSNet and COSNet­GPU on partially labeled networks containing genes belonging to D. melanogaster and Homo sapiens organisms for predicting respectively the Gene Ontology (GO) and the Human Phenotype Ontology (HPO) terms with 10­50 annotated genes. The algorithm behavior has been measured in terms of execution time and memory consumption. Table 1 summarizes the results in term of speed­up and memory usage, when performing a 3­fold cross validation procedure. The results show significant reductions in both execution times and memory consumption, and interestingly the improvement factors increases more than linearly with the number of nodes/genes. This also corroborates the fact that the proposed implementation nicely scales on big data.
2017
Settore INF/01 - Informatica
Speeding up node label learning in unbalanced biomolecular networks through a parallel and sparse GPU-based Hopfield model / A. Petrini, M. Notaro, J. Gliozzo, G. Valentini, G. Grossi, M. Frasca. (Contribution presented at the 14th Annual Meeting of the Bioinformatics Italian Society, held in Cagliari in 2017.)
Conference Object
Files in this record:
BITS17-gcosnet.pdf (open access) - Type: post-print, accepted manuscript (version accepted by the publisher) - Adobe PDF, 284.87 kB

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1022608