Learning node labels with multi-category Hopfield networks / M. Frasca, S. Bassis, G. Valentini. - In: NEURAL COMPUTING & APPLICATIONS. - ISSN 0941-0643. - (2015 Jun 23). [Epub ahead of print] [10.1007/s00521-015-1965-1]
Learning node labels with multi-category Hopfield networks
M. Frasca; S. Bassis; G. Valentini
2015
Abstract
In several real-world node label prediction problems on graphs, in fields ranging from computational biology to World Wide Web analysis, nodes can be partitioned into categories different from the classes to be predicted, on the basis of their characteristics or their common properties. Such partitions may provide further information about node classification that classical machine learning algorithms do not take into account. We introduce a novel family of parametric Hopfield networks (m-category Hopfield networks) and a novel algorithm (Hopfield multi-category, HoMCat), designed to exploit property-based partitions of nodes into multiple categories. Moreover, the proposed model adopts a cost-sensitive learning strategy to prevent the marked decay in performance usually observed when instance labels are unbalanced, that is, when one class of labels is heavily underrepresented relative to the other. We validate the proposed model on both synthetic and real-world data, in the context of multi-species function prediction, where the classes to be predicted are the Gene Ontology terms and the categories are the different species in the multi-species protein network. We carried out an intensive experimental validation, which on the one hand compares HoMCat with several state-of-the-art graph-based algorithms, and on the other hand reveals that exploiting meaningful prior partitions of input data can substantially improve classification performance.

File | Access | Type | Size | Format |
---|---|---|---|---|
homcat_rev1.pdf (Pre-print, first revision of the article) | open access | Pre-print (manuscript submitted to the publisher) | 291.13 kB | Adobe PDF |
art%3A10.1007%2Fs00521-015-1965-1.pdf | restricted access | Publisher's version/PDF | 900.24 kB | Adobe PDF |