Learning node labels with multi-category Hopfield networks / M. Frasca, S. Bassis, G. Valentini. - In: NEURAL COMPUTING & APPLICATIONS. - ISSN 0941-0643. - (2015 Jun 23). [Epub ahead of print] [10.1007/s00521-015-1965-1]

Learning node labels with multi-category Hopfield networks

M. Frasca (first author); S. Bassis (second author); G. Valentini (last author)
2015

Abstract

In several real-world node label prediction problems on graphs, in fields ranging from computational biology to World Wide Web analysis, nodes can be partitioned into categories that differ from the classes to be predicted, on the basis of their characteristics or common properties. Such partitions may provide further information about node classification that classical machine learning algorithms do not take into account. We introduce a novel family of parametric Hopfield networks (m-category Hopfield networks) and a novel algorithm (Hopfield multi-category, HoMCat), designed to exploit the presence of property-based partitions of nodes into multiple categories. Moreover, the proposed model adopts a cost-sensitive learning strategy to prevent the marked decay in performance usually observed when instance labels are unbalanced, that is, when one class of labels is far less represented than the other. We validate the proposed model on both synthetic and real-world data, in the context of multi-species function prediction, where the classes to be predicted are the Gene Ontology terms and the categories are the different species in the multi-species protein network. We carried out an intensive experimental validation, which on the one hand compares HoMCat with several state-of-the-art graph-based algorithms, and on the other hand reveals that exploiting meaningful prior partitions of input data can substantially improve classification performance.
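As a rough illustration of the general idea only, the sketch below runs a plain discrete Hopfield network over a small weighted graph: labeled nodes are clamped, and unlabeled nodes are updated asynchronously until no state changes. It is not the HoMCat algorithm. The function name, the toy graph, and in particular the use of one activation threshold per node category are assumptions made for this example and are not taken from the article.

```python
def hopfield_label_nodes(weights, labels, category, thresholds, max_sweeps=100):
    """Asynchronous Hopfield dynamics on a weighted graph (illustrative sketch).

    weights:    symmetric n x n list of lists with a zero diagonal
    labels:     +1 / -1 for labeled nodes, None for unlabeled nodes
    category:   category index of each node (hypothetical parameterization)
    thresholds: one activation threshold per category (hypothetical)
    """
    n = len(weights)
    # Labeled nodes keep their label; unlabeled nodes start as negative.
    state = [lab if lab is not None else -1 for lab in labels]
    unlabeled = [i for i, lab in enumerate(labels) if lab is None]
    for _ in range(max_sweeps):
        changed = False
        for i in unlabeled:
            # Local field: weighted sum of the current neighbor states.
            field = sum(weights[i][j] * state[j] for j in range(n) if j != i)
            new_state = 1 if field - thresholds[category[i]] > 0 else -1
            if new_state != state[i]:
                state[i] = new_state
                changed = True
        if not changed:  # fixed point reached
            break
    return state


# Toy graph with 5 nodes; edges: 0-2 (1.0), 1-3 (1.0), 2-4 (0.5), 3-4 (0.2).
weights = [
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.5],
    [0.0, 1.0, 0.0, 0.0, 0.2],
    [0.0, 0.0, 0.5, 0.2, 0.0],
]
labels = [1, -1, None, None, None]  # node 0 positive, node 1 negative
category = [0, 0, 0, 1, 1]          # two made-up categories

same_threshold = hopfield_label_nodes(weights, labels, category, [0.0, 0.0])
stricter_cat_1 = hopfield_label_nodes(weights, labels, category, [0.0, 0.5])
print(same_threshold)  # [1, -1, 1, -1, 1]
print(stricter_cat_1)  # [1, -1, 1, -1, -1]
```

In this toy run, raising the threshold for the second category flips the prediction for node 4, which hints at how category-specific parameters could change the outcome on the same graph. With symmetric weights and a zero diagonal, asynchronous updates of this kind are known to reach a fixed point.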
Binary classification; Biological networks; Multi-category Hopfield network; Protein function prediction; Unbalanced graphs
Academic sector INF/01 - Computer Science
23 Jun 2015
Article (author)
Files in this record:
homcat_rev1.pdf: open access; pre-print, first revision of the article (manuscript submitted to the publisher); Adobe PDF, 291.13 kB
art%3A10.1007%2Fs00521-015-1965-1.pdf: restricted access; publisher's version; Adobe PDF, 900.24 kB
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/288461