The impact of sequence database choice on metaproteomic results in gut microbiota studies

Tanca, A.; Palomba, A.; Fraumene, C.; Pagnozzi, D.; Manghina, V.; Deligios, M.; Muth, T.; Rapp, E.; Martens, L.; Addis, M.F.; Uzzau, S.

doi:10.1186/s40168-016-0196-8

Background: Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in life sciences. In this respect, metaproteomics, the study of the whole protein complement of a microbial community, can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially related to the choice of the proper sequence databases for protein identification. Results: Here, we present a systematic investigation of variables concerning database construction and annotation and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. In particular, the contribution of experimental metagenomic databases proved to be essential when dealing with mouse samples. Moreover, the use of a "merged" database, containing all metagenomic sequences from the population under study, was found to be generally preferable to the use of sample-matched databases. We also observed that taxonomic and functional results are strongly database-dependent, in particular when analyzing the mouse gut microbiota. As a striking example, the Firmicutes/Bacteroidetes ratio varied up to tenfold depending on the database used. Finally, assembling reads into longer contigs provided significant advantages in terms of functional annotation yields. Conclusions: This study contributes to identifying host- and database-specific biases which need to be taken into account in a metaproteomic experiment, providing meaningful insights on how to design gut microbiota studies and to perform metaproteomic data analysis. In particular, the use of multiple databases and annotation tools should be encouraged, even though this requires appropriate bioinformatic resources.
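
The two database comparisons mentioned in the abstract (peptides identified only with a given database, and the Firmicutes/Bacteroidetes ratio obtained with each database) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the database labels `public` and `metagenomic`, the peptide strings, and the counts are all hypothetical toy values chosen only to show the arithmetic.

```python
# Illustrative sketch only: all names and numbers below are hypothetical toy data,
# not values or methods taken from the study.

def unique_and_shared(peptides_by_db):
    """Return (peptides found only by each database, peptides found by every database)."""
    shared = set.intersection(*peptides_by_db.values())
    unique = {}
    for db, peptides in peptides_by_db.items():
        # Pool the peptides identified with all the other databases.
        others = set().union(
            *(p for name, p in peptides_by_db.items() if name != db)
        )
        unique[db] = peptides - others
    return unique, shared


def fb_ratio(phylum_counts):
    """Firmicutes/Bacteroidetes ratio; None if no Bacteroidetes signal is present."""
    bacteroidetes = phylum_counts.get("Bacteroidetes", 0)
    if bacteroidetes == 0:
        return None
    return phylum_counts.get("Firmicutes", 0) / bacteroidetes


# Hypothetical peptide identifications from two parallel database searches.
peptides_by_db = {
    "public": {"PEPA", "PEPB", "PEPC"},
    "metagenomic": {"PEPB", "PEPC", "PEPD", "PEPE"},
}
unique, shared = unique_and_shared(peptides_by_db)

# Hypothetical phylum-level counts obtained with each database.
counts_by_db = {
    "public": {"Firmicutes": 120, "Bacteroidetes": 60},
    "metagenomic": {"Firmicutes": 50, "Bacteroidetes": 250},
}
ratios = {db: fb_ratio(counts) for db, counts in counts_by_db.items()}
fold_difference = max(ratios.values()) / min(ratios.values())

print("unique per database:", unique)
print("shared:", shared)
print("F/B ratios:", ratios, "fold difference:", fold_difference)
```

In this toy case each search contributes peptides the other misses, which is the situation that motivates running searches in parallel, and the two ratios differ by roughly an order of magnitude purely because of the database used.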

The impact of sequence database choice on metaproteomic results in gut microbiota studies / A. Tanca, A. Palomba, C. Fraumene, D. Pagnozzi, V. Manghina, M. Deligios, T. Muth, E. Rapp, L. Martens, M.F. Addis, S. Uzzau. - In: MICROBIOME. - ISSN 2049-2618. - 4:1(2016), p. 51.

Keywords: bioinformatics; gut microbiota; mass spectrometry; metagenomics; metaproteomics; microbiology; microbiology (medical)
Scientific-disciplinary sectors of the article (display only): Sector BIO/19 - General Microbiology
Publication date: 2016
Journal in ANCE: MICROBIOME
DOI: https://dx.doi.org/10.1186/s40168-016-0196-8
Type: Article (author)
Appears in types: 01 - Journal article

Files in this record:

File	Size	Format
2016_Tanca_et_al_Microbiome.pdf (open access; Description: Main article; Type: Publisher's version/PDF)	1.68 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/492657

IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca