The graph classification problem consists in extending the labels to all nodes of a weighted graph, given a partial node labeling. In many real-world contexts, such as Gene Function Prediction, the partial labeling is unbalanced: positive labels are far fewer than negative ones. In this paper we present a new neural algorithm for predicting labels in the presence of label imbalance. This algorithm is based on a family of Hopfield networks, described by two continuous parameters and one discrete parameter, and it consists of two main steps: 1) the network parameters are learned through a cost-sensitive optimization procedure based on local search; 2) a suitable Hopfield network restricted to unlabeled nodes is considered and simulated. The equilibrium point reached induces the classification of unlabeled nodes. An experimental analysis on real-world unbalanced data in the context of genome-wide prediction of gene functions shows the effectiveness of the proposed approach.
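As a rough illustration of step 2 only, the following Python sketch simulates a Hopfield network restricted to the unlabeled nodes of a weighted graph. It is not the authors' implementation: the abstract does not specify the network's parametrization, so the function name `simulate_restricted_hopfield`, the activation values `pos` and `neg`, the threshold `theta`, and the neutral initial state are all assumptions made for this example. The cost-sensitive learning of step 1 is not shown; the parameters are simply taken as given.

```python
import numpy as np


def simulate_restricted_hopfield(W, labels, pos=1.0, neg=-1.0, theta=0.0,
                                 max_sweeps=100):
    """Extend a partial labeling with a Hopfield network on unlabeled nodes.

    W      : symmetric weight matrix with zero diagonal (assumed).
    labels : +1 (positive), -1 (negative), 0 (unlabeled) for each node.
    pos, neg, theta : hypothetical activation values and threshold; in the
                      paper such parameters would come from step 1.
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    unlabeled = np.flatnonzero(labels == 0)
    labeled = np.flatnonzero(labels != 0)

    # Labeled nodes are clamped: they only contribute a constant input.
    clamped = np.where(labels[labeled] > 0, pos, neg)
    bias = W[np.ix_(unlabeled, labeled)] @ clamped
    W_uu = W[np.ix_(unlabeled, unlabeled)]

    # Assumed neutral start; every neuron is set on the first sweep.
    state = np.zeros(unlabeled.size)

    # Asynchronous updates until no neuron changes (an equilibrium point).
    for _ in range(max_sweeps):
        changed = False
        for i in range(unlabeled.size):
            field = W_uu[i] @ state + bias[i] - theta
            new_value = pos if field > 0 else neg
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break

    # The equilibrium state induces the labels of the unlabeled nodes.
    predicted = labels.copy()
    predicted[unlabeled] = np.where(state == pos, 1, -1)
    return predicted
```

On a four-node path graph where node 0 is positive, node 3 is negative, and the middle edge is weak, each unlabeled node takes the label of its strongly connected neighbor. With unbalanced data a zero threshold and symmetric activation values would tend to favor the negative class, which is presumably why the abstract describes learning the parameters in a cost-sensitive way.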
A Neural Procedure for Gene Function Prediction / M. Frasca, A. Bertoni, A. Sion. - In: Neural Nets and Surroundings / edited by B. Apolloni, S. Bassis, A. Esposito, F.C. Morabito. - [s.l.] : Springer Berlin Heidelberg, 2013. - ISBN 978-3-642-35466-3. - pp. 179-188. (Paper presented at WIRN 2012, the 22nd Italian Workshop on Neural Networks, held in Vietri sul Mare in 2012.) [10.1007/978-3-642-35467-0_19]
A Neural Procedure for Gene Function Prediction
M. Frasca; A. Bertoni; A. Sion
2013