Cleaning the Medicago microarray database to improve gene function analysis / F. Marzorati, C. Wang, G. Pavesi, L. Mizzi, P. Morandini. - In: PLANTS. - ISSN 2223-7747. - 10:6 (June 2021), article 1240. [10.3390/plants10061240]

Cleaning the Medicago microarray database to improve gene function analysis

F. Marzorati (first author; Investigation);
G. Pavesi (member of the Collaboration Group);
L. Mizzi (second-to-last author; member of the Collaboration Group);
P. Morandini (last author; Conceptualization)
2021

Abstract

Transcriptomics studies have been facilitated by the development of microarray and RNA-Seq technologies, and thousands of expression datasets are now available for many species. However, data quality can vary widely, which makes the combined analysis of different datasets difficult and unreliable. Most of the microarray data for Medicago truncatula, the barrel medic, are stored and publicly accessible in the web database Medicago truncatula Gene Expression Atlas (MtGEA). The aim of this work is to improve the quality of the MtGEA database through a general method based on logical and statistical relationships among parameters and conditions. The 716 columns initially available in the dataset were reduced to 607 by evaluating data quality through the sum of the expression levels over all the probes of the transcriptome and the Pearson correlation among hybridizations. The reduced dataset is markedly more consistent, with fewer false positives and false negatives in Pearson correlation and GO enrichment analyses among genes. The approach is of general validity, and we intend to extend the analysis to other plant microarray databases.
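
The two quality checks named in the abstract (total signal per hybridization and Pearson correlation among hybridizations) can be illustrated with a minimal sketch. This is not the authors' procedure: the article lists R programming among its keywords, whereas the sketch below uses standard-library Python, and the function name filter_hybridizations, the thresholds max_sum_deviation and min_best_r, and the rule "keep a column if its best correlation with any other column is high enough" are all assumptions made for illustration.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def filter_hybridizations(columns, max_sum_deviation=0.2, min_best_r=0.9):
    """Return the names of the columns that pass both hypothetical checks.

    columns: dict mapping a hybridization name to its expression values,
             one value per probe, in the same probe order for every column.
    Check 1: the column's total signal lies within max_sum_deviation
             (as a fraction) of the median total signal over all columns.
    Check 2: the column's highest Pearson correlation with any other
             column is at least min_best_r.
    Both thresholds are illustrative assumptions, not published values.
    """
    totals = {name: sum(values) for name, values in columns.items()}
    median_total = statistics.median(totals.values())

    kept = []
    for name, values in columns.items():
        deviation = abs(totals[name] - median_total) / median_total
        if deviation > max_sum_deviation:
            continue  # total signal is an outlier
        best_r = max(
            pearson(values, other)
            for other_name, other in columns.items()
            if other_name != name
        )
        if best_r >= min_best_r:
            kept.append(name)
    return kept


# Toy data: four hybridizations measured on five probes.
example = {
    "a": [1, 2, 3, 4, 5],
    "b": [1.1, 2.0, 2.9, 4.2, 5.0],  # close replicate of "a"
    "c": [5, 1, 4, 2, 3],            # normal total, but unlike any other column
    "d": [10, 20, 30, 40, 50],       # well correlated, but tenfold total signal
}
print(filter_hybridizations(example))  # prints ['a', 'b']
```

In the toy data, column "d" is rejected on total signal and column "c" on correlation, so only the two mutually consistent hybridizations remain. A real expression matrix would need thresholds chosen from the data themselves.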
Correlation analysis; Functional genomics; Medicago; Microarray; MtGEA; R programming; Transcriptomics
Sector BIO/04 - Plant Physiology
Sector AGR/07 - Agricultural Genetics
Jun-2021
18-Jun-2021
https://www.mdpi.com/2223-7747/10/6/1240
Article (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/890643