Peak shape clustering reveals biological insights / M.A. Cremona, L.M. Sangalli, S. Vantini, G.I. Dellino, P.G. Pelicci, P. Secchi, L. Riva. - In: BMC BIOINFORMATICS. - ISSN 1471-2105. - 16:1(2015). [10.1186/s12859-015-0787-6]

Peak shape clustering reveals biological insights

G.I. Dellino; P.G. Pelicci
2015

Abstract

Background: ChIP-seq experiments are widely used to detect and study DNA-protein interactions, such as transcription factor binding and chromatin modifications. However, downstream analysis of ChIP-seq data is currently restricted to the evaluation of signal intensity and the detection of enriched regions (peaks) in the genome. Other features of peak shape are almost always neglected, despite the remarkable differences shown by ChIP-seq for different proteins, as well as by distinct regions in a single experiment.

Results: We hypothesize that statistically significant differences in peak shape might have a functional role and a biological meaning. Thus, we design five indices that summarize peak shapes, and we employ multivariate clustering techniques to divide peaks into groups according to both their complexity and the intensity of their coverage function. In addition, our novel analysis pipeline employs a range of statistical and bioinformatics techniques to relate the obtained peak shapes to several independent genomic datasets, including other genome-wide protein-DNA maps and gene expression experiments. To clarify the meaning of peak shape, we apply our methodology to the study of the erythroid transcription factor GATA-1 in the K562 cell line and in megakaryocytes.

Conclusions: Our study demonstrates that ChIP-seq profiles include information regarding the binding of other proteins besides the one used for precipitation. In particular, peak shape provides new insights into cooperative transcriptional regulation and is correlated with gene expression.
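The general idea in the Results, summarizing each peak's coverage function with a few shape indices and then clustering peaks on those indices, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the five indices or the clustering algorithm, so the four indices below (height, area, width at half maximum, number of local maxima), the use of k-means, the function names `shape_indices`, `standardize` and `kmeans`, and the toy coverage vectors are hypothetical and are not the authors' method or data.

```python
from math import sqrt

def shape_indices(coverage):
    """Summarize one peak's coverage vector with four illustrative indices.
    These indices are assumptions, not the five indices used in the paper."""
    height = max(coverage)
    area = sum(coverage)
    # Width at half maximum: number of positions at or above half the height.
    half_width = sum(1 for c in coverage if c >= height / 2)
    # Number of interior local maxima, a crude proxy for shape complexity.
    n_maxima = sum(
        1 for i in range(1, len(coverage) - 1)
        if coverage[i - 1] < coverage[i] >= coverage[i + 1]
    )
    return [height, area, half_width, n_maxima]

def standardize(rows):
    """Z-score each index so that no single index dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def dist2(p, q):
    """Squared Euclidean distance between two index vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: dist2(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(v) / len(members) for v in zip(*members)]
    return labels

# Toy coverage vectors (hypothetical): two sharp single-summit peaks
# and two broad, multi-summit peaks.
peaks = [
    [0, 2, 5, 9, 5, 2, 0],
    [0, 1, 4, 8, 4, 1, 0],
    [0, 6, 3, 7, 2, 6, 3, 8, 2, 5, 0],
    [0, 5, 2, 6, 3, 7, 2, 6, 3, 5, 0],
]
labels = kmeans(standardize([shape_indices(p) for p in peaks]), k=2)
```

On these toy vectors the two sharp peaks and the two multi-summit peaks fall into separate groups. Relating such groups to other genomic datasets, as the abstract describes, is outside the scope of this sketch.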
ChIP-seq; GATA-1; peak shape; transcription regulation; chromatin immunoprecipitation; cluster analysis; DNA; GATA1 transcription factor; gene knockdown techniques; humans; K562 cells; megakaryocytes; protein binding; sequence analysis, DNA; computational biology; applied mathematics; structural biology; biochemistry; molecular biology; computer science applications; computer vision and pattern recognition
Academic discipline: MED/04 - General Pathology
2015
Article (author)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/425047