In mammalian cells, transcription factors (TFs) bind only a small fraction of the available consensus sites in the genome. In particular, they prefer sites embedded in regions of computationally predicted high nucleosomal occupancy. This is compatible with non-exclusive mechanisms of nucleosome-driven TF binding and nucleosome-mediated masking of TF binding sites, suggesting that TFs, and pioneer factors in particular, must overcome a strong barrier in order to bind. Exploiting the available information for the hematopoietic master regulator Pu.1, we applied machine learning approaches and uncovered the sequence-encoded information that discriminates engaged from non-engaged genomic consensus sites. We identified a minimal set of features that predicts Pu.1 binding with 78% accuracy; these features include sequence determinants able to drive nucleosome occupancy. Consistent with this, while Pu.1 maintained nucleosome depletion at many thousands of cell type-specific enhancers in macrophages, these sites are otherwise occupied by nucleosomes in other cell types and in chromatin reconstituted in vitro. As predicted, engaged consensus sites showed higher sequence-encoded nucleosome occupancy than the myriad of non-occupied (and likely non-functional) consensus sites that occur randomly in mammalian genomes. The same sequence features selected by machine learning also explain up to 45% of the variability observed in nucleosome occupancy in cells where Pu.1 is not expressed (a performance equal to or better than that achieved by ad hoc models), suggesting that the same information contributes to nucleosome occupancy and positioning. These data reveal a basic organizational principle of mammalian enhancers whereby TF engagement at consensus sites and nucleosome occupancy are coordinately controlled by overlapping sequence features. This model also suggests that co-evolution of these features may be crucial to ensure cell type-specific enhancer activation.
The nucleosomal patterns at Pu.1-bound sites in macrophages were further characterized, uncovering distinct subtypes with different DNA sequence composition, which mirror distinctive nucleosomal configurations either in the presence or in the absence of Pu.1.
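
As a purely illustrative sketch of what "sequence-encoded information" around a consensus site can look like in practice, the snippet below turns a DNA window into a small numeric feature vector (GC content plus dinucleotide frequencies) and computes classification accuracy from predicted and observed labels. The feature choice, the function names and the toy inputs are all assumptions made for illustration; they are not the features, code or data used in the thesis.

```python
from itertools import product

# All 16 dinucleotides, in a fixed order, so every window yields the same keys.
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def gc_content(seq):
    """Fraction of G or C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def dinucleotide_frequencies(seq):
    """Relative frequency of each overlapping dinucleotide; pairs with N are skipped."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return {k: (v / total if total else 0.0) for k, v in counts.items()}


def site_features(window):
    """Hypothetical feature vector for the window surrounding one consensus site."""
    features = {"gc": gc_content(window)}
    features.update(dinucleotide_frequencies(window))
    return features


def accuracy(predicted, observed):
    """Fraction of sites whose predicted label (1 = engaged) matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("label lists must have the same length")
    if not observed:
        return 0.0
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)
```

Feature dictionaries of this kind could be passed to any standard classifier trained on engaged versus non-engaged sites; `accuracy` then corresponds to the type of figure quoted in the abstract, although the thesis may have computed it differently.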

OVERLAPPING SEQUENCE FEATURES OF MAMMALIAN ENHANCERS COORDINATELY CONTROL ENGAGEMENT OF TRANSCRIPTION FACTOR CONSENSUS SITES AND NUCLEOSOMAL OCCUPANCY / I.G. Barozzi ; supervisors: G. Natoli, S. Minucci. DIPARTIMENTO DI BIOSCIENZE, 2014 Mar 25. 25th cycle, Academic Year 2013. [10.13130/barozzi-iros-giacomo_phd2014-03-25].

OVERLAPPING SEQUENCE FEATURES OF MAMMALIAN ENHANCERS COORDINATELY CONTROL ENGAGEMENT OF TRANSCRIPTION FACTOR CONSENSUS SITES AND NUCLEOSOMAL OCCUPANCY

I.G. Barozzi
2014

25 March 2014
Scientific discipline sector BIO/10 - Biochemistry
NATOLI, GIOACCHINO
MINUCCI, SAVERIO
Doctoral Thesis
File: phd_unimi_R08898.pdf (complete doctoral thesis, open access, Adobe PDF, 6.11 MB)
Use this identifier to cite or link to this document: https://hdl.handle.net/2434/234131