Integrating Peak Colocalization and Motif Enrichment Analysis for the Discovery of Genome-Wide Regulatory Modules and Transcription Factor Recruitment Rules

Ronzio, M.; Zambelli, F.; Dolfini, D.; Mantovani, R.; Pavesi, G.

doi:10.3389/fgene.2020.00072

Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq) has opened new avenues of research in the genome-wide characterization of regulatory DNA-protein interactions at the genetic and epigenetic level. As a consequence, it has become the de facto standard for studies on the regulation of transcription, and thousands of datasets for transcription factors and cofactors in different conditions and species are now available to the scientific community. However, while pipelines and best practices have been established for the analysis of a single experiment, there is still no consensus on the best way to perform an integrated analysis of multiple datasets from the same condition in order to identify the most relevant and widespread regulatory modules composed of different transcription factors and cofactors. We present here a computational pipeline for this task, which integrates peak summit colocalization, a novel statistical framework for evaluating its significance, and motif enrichment analysis. We show examples of its application to ENCODE data, which led to the identification of relevant regulatory modules composed of different factors, as well as the organization on DNA of the binding motifs responsible for their recruitment.
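
As a rough illustration of what peak summit colocalization involves, the sketch below counts how many summits of one factor have a summit of a second factor within a fixed distance. It is not the authors' pipeline and does not reproduce their statistical framework: the function names, the `(chromosome, position)` input layout, and the 150 bp window are assumptions chosen only for this example.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def index_summits(summits):
    """Group summit positions by chromosome and sort them for binary search."""
    by_chrom = defaultdict(list)
    for chrom, pos in summits:
        by_chrom[chrom].append(pos)
    return {chrom: sorted(positions) for chrom, positions in by_chrom.items()}


def count_colocalized(summits_a, summits_b, max_dist=150):
    """Count summits of A with at least one summit of B within max_dist bp.

    max_dist is a hypothetical window size, not a value taken from the paper.
    """
    index_b = index_summits(summits_b)
    hits = 0
    for chrom, pos in summits_a:
        positions = index_b.get(chrom, [])
        # Summits of B falling in the closed interval [pos - max_dist, pos + max_dist]
        lo = bisect_left(positions, pos - max_dist)
        hi = bisect_right(positions, pos + max_dist)
        if hi > lo:
            hits += 1
    return hits


# Toy summit lists for two hypothetical factors (made-up coordinates).
tf_a = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
tf_b = [("chr1", 1100), ("chr1", 9000), ("chr2", 1000)]

n_colocalized = count_colocalized(tf_a, tf_b, max_dist=150)
print(f"{n_colocalized} of {len(tf_a)} summits of A are colocalized with B")
```

A raw count like this says nothing about significance on its own; judging whether the overlap exceeds what chance would produce is the role of the statistical framework the abstract refers to.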

Integrating Peak Colocalization and Motif Enrichment Analysis for the Discovery of Genome-Wide Regulatory Modules and Transcription Factor Recruitment Rules / M. Ronzio, F. Zambelli, D. Dolfini, R. Mantovani, G. Pavesi. - In: FRONTIERS IN GENETICS. - ISSN 1664-8021. - 11:(2020 Feb 21), pp. 72.1-72.15. [10.3389/fgene.2020.00072]

Keywords: ChIP-seq; colocalization analysis; transcription factor (TF); transcription factor binding sites (TFBS); transcriptional regulation

Scientific-disciplinary sectors of the article (display only): Sector BIO/18 - Genetics; Sector BIO/11 - Molecular Biology

Publication date: 21 Feb 2020

Journal in ANCE: FRONTIERS IN GENETICS

DOI: https://dx.doi.org/10.3389/fgene.2020.00072

Type: Article (author)

Appears in types: 01 - Journal article

Files in this item:

File	Size	Format
fgene-11-00072.pdf (open access; type: Publisher's version/PDF; license: Creative Commons)	2.47 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/720016

IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca