
Predicted RNA m⁵C sites across the human transcriptome (GRCh38 GENCODE v45) / E. Saitto, E.C.. - (2025). [10.5281/zenodo.16629377]

Predicted RNA m⁵C sites across the human transcriptome (GRCh38 GENCODE v45)

E. Casiraghi; G. Valentini
2025

Abstract

Transcriptome‑wide predictions of RNA 5‑methyl‑cytosine (m⁵C) sites for the human reference transcriptome (GENCODE v45, GRCh38). Predictions were generated with the Bi-GRU model described in:

Saitto, E., Casiraghi, E., Paccanaro, A. & Valentini, G. AI methods and biologically informed data curation enable accurate RNA m⁵C prediction. bioRxiv (September 2025). https://doi.org/10.1101/xxxxxxx

`m5C_predictions.tsv.gz` and `m5C_predictions.xlsx` are tables with the following columns:

  • Transcript‑level identifiers: `transcript_id`, `gene_id`, `gene_name`, `transcript_type`, `tags`.
  • `position`: zero‑based coordinate of the cytosine within the transcript sequence.
  • `Type`: predicted methyltransferase class: I (NSUN2), II (NSUN6), III (NSUN5), IV (NSUN1).
  • `probability`: probability assigned by the model.
  • `in_train_or_test_sets`: `TRUE` if the 51‑nt window centred on this cytosine was present in the training or test set; `FALSE` otherwise.

The file `gene_enrichment.tar` contains the terms enriched, across different ontologies, among genes with m⁵C sites predicted for each NSUN enzyme. Specifically, we retained the highest-scoring transcriptome-wide m⁵C sites per methyltransferase, omitting any site used during training or testing, and queried g:Profiler against GO, KEGG, Reactome and the Human Phenotype Ontology.
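The column layout and the site selection described above can be illustrated with a minimal Python sketch that uses only the standard library. It is not the authors' pipeline. It assumes that the TSV header uses exactly the column names listed above, that `in_train_or_test_sets` is stored as the text `TRUE`/`FALSE`, and that windows near transcript ends are simply clipped (the record does not say how such windows were handled). The function names and the `n_per_type` cut-off are hypothetical; the record states only that the "highest-scoring" sites were kept, without giving a number.

```python
import csv
import gzip
from collections import defaultdict

FLANK = 25  # a 51-nt window has 25 nt on each side of the central cytosine


def parse_row(row):
    """Convert the string fields of one table row to Python types."""
    parsed = dict(row)
    parsed["position"] = int(row["position"])
    parsed["probability"] = float(row["probability"])
    # Assumption: the flag is stored as the text TRUE or FALSE.
    flag = row["in_train_or_test_sets"].strip().upper()
    parsed["in_train_or_test_sets"] = flag == "TRUE"
    return parsed


def read_predictions(path):
    """Yield one typed dict per row of the gzipped, tab-separated table."""
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            yield parse_row(row)


def window_around(sequence, position, flank=FLANK):
    """Return the window centred on a zero-based position.

    Assumption: windows that run past either end of the transcript
    are clipped rather than padded.
    """
    start = max(0, position - flank)
    return sequence[start:position + flank + 1]


def top_unseen_sites(rows, n_per_type):
    """Keep the n highest-probability sites per Type, skipping train/test sites."""
    by_type = defaultdict(list)
    for row in rows:
        if not row["in_train_or_test_sets"]:
            by_type[row["Type"]].append(row)
    return {
        site_type: sorted(sites, key=lambda r: r["probability"], reverse=True)[:n_per_type]
        for site_type, sites in by_type.items()
    }
```

For example, `top_unseen_sites(read_predictions("m5C_predictions.tsv.gz"), n_per_type=1000)` would return, for each methyltransferase class, up to 1000 rows not seen during training or testing; the `gene_name` values of those rows could then be submitted to an enrichment tool. The value 1000 is purely illustrative.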
Academic discipline sector: INFO-01/A - Informatica (Computer Science)
https://zenodo.org/records/16629378
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1255276