Supervised machine learning methods applied to automated protein-function prediction (AFP) require both positive examples (i.e., proteins known to possess a given function) and negative examples (proteins not associated with that function). Unfortunately, publicly available proteome and genome data sources such as the Gene Ontology rarely record the functions that a protein does not possess. Negative selection, that is, identifying informative negative examples, is therefore a central and challenging problem in AFP. Several heuristics have been proposed over the years to address this problem; however, despite their effectiveness, to the best of our knowledge no previous work has studied which protein features are most relevant to this task, that is, which features best help discriminate between reliable and unreliable negatives.
|Title:||Evaluating the impact of topological protein features on the negative examples selection|
|Keywords:||Biological networks; Negative example selection; Protein features; Protein function prediction|
|Scientific Disciplinary Sector:||Settore INF/01 - Informatica (Computer Science)|
|Publication date:||20 Nov 2018|
|Digital Object Identifier (DOI):||10.1186/s12859-018-2385-x|
|Appears in types:||01 - Journal article|