Identification of metabolic network models from incomplete high-throughput datasets

Berthoumieux, S.; Brilli, M.; De Jong, H.; Kahn, D.; Cinquemani, E.

doi:10.1093/bioinformatics/btr225

Motivation: High-throughput measurement techniques for metabolism and gene expression provide a wealth of information for the identification of metabolic network models. Yet, missing observations scattered over the dataset restrict the number of effectively available datapoints and make classical regression techniques inaccurate or inapplicable. Thorough exploitation of the data by identification techniques that explicitly cope with missing observations is therefore of major importance.

Results: We develop a maximum-likelihood approach for the estimation of unknown parameters of metabolic network models that relies on the integration of statistical priors to compensate for the missing data. In the context of the linlog metabolic modeling framework, we implement the identification method by an Expectation-Maximization (EM) algorithm and by a simpler direct numerical optimization method. We evaluate performance of our methods by comparison to existing approaches, and show that our EM method provides the best results over a variety of simulated scenarios. We then apply the EM algorithm to a real problem, the identification of a model for the Escherichia coli central carbon metabolism, based on challenging experimental data from the literature. This leads to promising results and allows us to highlight critical identification issues.
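The Motivation's point that scattered missing observations shrink the number of effectively available datapoints can be illustrated with a minimal sketch. It is not the authors' method: the data matrix, its values, and the helper name `complete_case_mask` are hypothetical. The sketch only counts how many rows a classical regression restricted to complete rows could still use.

```python
import numpy as np

def complete_case_mask(data):
    """Return a boolean mask of rows that contain no missing (NaN) entries."""
    return ~np.isnan(data).any(axis=1)

# Hypothetical dataset (an assumption for illustration only):
# 6 experimental conditions (rows) by 4 measured quantities (columns);
# NaN marks a missing observation.
data = np.array([
    [1.0, 2.0, 0.5, 3.1],
    [1.2, np.nan, 0.6, 2.9],
    [0.9, 2.1, np.nan, 3.0],
    [1.1, 1.9, 0.4, np.nan],
    [1.3, 2.2, 0.7, 3.3],
    [np.nan, 2.0, 0.5, 3.2],
])

mask = complete_case_mask(data)
n_missing_entries = int(np.isnan(data).sum())
n_complete_rows = int(mask.sum())

print(f"missing entries: {n_missing_entries} of {data.size}")
print(f"rows usable by complete-case regression: {n_complete_rows} of {data.shape[0]}")
```

In this toy matrix only 4 of 24 entries are missing, yet 4 of the 6 rows become unusable for a regression that requires complete rows. Approaches that handle missing observations explicitly, such as the one the abstract describes, aim to avoid this loss.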

Identification of metabolic network models from incomplete high-throughput datasets / S. Berthoumieux, M. Brilli, H. de Jong, D. Kahn, E. Cinquemani. - In: BIOINFORMATICS. - ISSN 1367-4803. - 27:13(2011), pp. i186-i195. (Paper presented at the 19th Annual International Conference on Intelligent Systems for Molecular Biology / 10th European Conference on Computational Biology, held in Vienna in 2011.)



Keywords: Escherichia-coli; linlog kinetics; missing data; identifiability analysis; likelihood; redesign

Scientific-disciplinary sectors of the article (display only): Sector BIO/19 - General Microbiology; Sector BIO/10 - Biochemistry

Publication date: 2011

Journal in ANCE: BIOINFORMATICS

DOI: https://dx.doi.org/10.1093/bioinformatics/btr225

Type: Article (author)

Appears in types: 01 - Journal article

Files in this item:

File	Access	Type	Size	Format
btr225.pdf	open access	Publisher's version/PDF	458.01 kB	Adobe PDF


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/621268


IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca