An expectation-maximization framework for comprehensive prediction of isoform-specific functions / G. Karlebach, L. Carmody, J.C. Sundaramurthi, E. Casiraghi, P. Hansen, J. Reese, C.J. Mungall, G. Valentini, P.N. Robinson. - In: BIOINFORMATICS. - ISSN 1367-4803. - 39:4(2023 Apr 03), pp. btad132.1-btad132.7. [10.1093/bioinformatics/btad132]
An expectation-maximization framework for comprehensive prediction of isoform-specific functions
E. Casiraghi; G. Valentini (penultimate author)
2023
Abstract
Motivation: Advances in RNA sequencing technologies have achieved unprecedented accuracy in the quantification of mRNA isoforms, but our knowledge of isoform-specific functions has lagged behind. There is a need to understand the functional consequences of differential splicing, which could be supported by the generation of accurate and comprehensive isoform-specific Gene Ontology (GO) annotations.

Results: We present Isopret (Isoform Interpretation), a method that uses expectation-maximization to infer isoform-specific functions based on the relationship between sequence similarity and functional similarity of isoforms. We predicted isoform-specific functional annotations for 85,617 isoforms of 17,900 protein-coding human genes, spanning 17,430 distinct GO terms. Comparison with a gold-standard corpus of manually annotated human isoform functions showed that isopret significantly outperforms state-of-the-art competing methods. We provide experimental evidence that functionally related isoforms predicted by isopret show a higher degree of domain sharing and expression correlation than functionally related genes. We also show that isoform sequence similarity correlates better with inferred isoform function than with gene-level function.

Availability and implementation: Source code, documentation, and resource files are freely available under a GNU3 license at https://github.com/TheJacksonLaboratory/isopretEM and https://zenodo.org/record/7594321.

Supplementary information: Supplementary data are available at Bioinformatics online.

File | Access | Type | Size | Format
---|---|---|---|---
btad132.pdf | Open access | Publisher's version/PDF | 880.33 kB | Adobe PDF