Structure dictates function in many aspects of biology. In addition to the well-known DNA double helix made from A, C, T and G nucleotides, sequences rich in G have the potential to fold into four-stranded, alternative structures known as G-quadruplexes (G4s). G4s were initially considered a structural curiosity, but recent evidence suggests their involvement in key genome functions, with strong links to neurological disease and cancer. Only a small fraction of the human genome consists of sequences capable of folding into G4s, and of these even fewer actually form the structure. These G4s have been located at places in the genome known to regulate how genes are expressed (called promoters), particularly for genes that are very active. However, very little is known about which proteins might bind these G4 structures to control their formation, or how G4s might regulate genome function by recruiting critical cellular machinery and modulating its activity. The objective of this project is to identify and study these so-called Quadruplex-associated proteins (QAPs) in a normal cellular environment by employing small chemical probe molecules and antibodies to isolate QAPs bound to G4s, in conjunction with the latest quantitative proteomics and next-generation genome sequencing. The study of G4s and their native binding partners is an urgent priority to gain fundamental insights into the cellular regulation and role of G4s in genome function and to advance the understanding of G4-related disease biology.
I explored experiments to pull down QAPs using tagged small-molecule G4 ligands, but this approach proved to have limited efficiency, probably because the ligands compete with endogenous QAPs for the same binding sites. In an alternative antibody-based approach, I adapted and optimized a recently developed method for locating G4s in the cellular genome, called G4 ChIP-seq. The advanced protocol was used to map G4 landscapes in four different cancer cell lines and revealed conserved sites as well as substantial cell-type specificity. Furthermore, combining this protocol with proteomics technologies revealed the enrichment at G4 sites of various proteins involved in gene expression and the organisation of DNA in the nucleus (so-called transcription factors and chromatin-binding proteins).
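The comparison of G4 maps across cell lines can be illustrated with a minimal sketch. It assumes each cell line's G4 ChIP-seq peaks are available as simple (chromosome, start, end) intervals; the function names, the cell-line labels and the toy coordinates are hypothetical and are not taken from the project. It is not the analysis pipeline used in the study, and real analyses would normally rely on dedicated interval tools such as bedtools.

```python
from collections import defaultdict


def merge_peaks(peaks_by_line):
    """Merge overlapping peaks from all cell lines into union regions.

    peaks_by_line maps a cell-line label to a list of
    (chrom, start, end) tuples with half-open coordinates.
    Returns a list of (chrom, start, end, lines), where lines is the
    set of cell lines contributing at least one peak to the region.
    """
    # Pool every peak per chromosome, remembering its cell line.
    pooled = defaultdict(list)
    for line, peaks in peaks_by_line.items():
        for chrom, start, end in peaks:
            pooled[chrom].append((start, end, line))

    merged = []
    for chrom in sorted(pooled):
        cur_start = cur_end = None
        cur_lines = set()
        for start, end, line in sorted(pooled[chrom]):
            if cur_start is not None and start < cur_end:
                # Overlaps the open region: extend it.
                cur_end = max(cur_end, end)
                cur_lines.add(line)
            else:
                # Close the previous region (if any) and open a new one.
                if cur_start is not None:
                    merged.append((chrom, cur_start, cur_end, frozenset(cur_lines)))
                cur_start, cur_end, cur_lines = start, end, {line}
        if cur_start is not None:
            merged.append((chrom, cur_start, cur_end, frozenset(cur_lines)))
    return merged


def split_conserved_specific(merged, n_lines):
    """Return (conserved, specific) regions.

    Conserved regions are supported by all n_lines cell lines;
    specific regions are supported by exactly one.
    """
    conserved = [r for r in merged if len(r[3]) == n_lines]
    specific = [r for r in merged if len(r[3]) == 1]
    return conserved, specific


if __name__ == "__main__":
    # Toy data only: four hypothetical cell lines with made-up peaks.
    toy = {
        "line_A": [("chr1", 100, 200), ("chr1", 500, 600)],
        "line_B": [("chr1", 150, 250)],
        "line_C": [("chr1", 180, 220), ("chr2", 10, 50)],
        "line_D": [("chr1", 190, 300)],
    }
    conserved, specific = split_conserved_specific(merge_peaks(toy), len(toy))
    print(len(conserved), "conserved;", len(specific), "cell-type-specific")
```

Merging before counting means that peaks whose boundaries differ slightly between cell lines are still treated as the same site. The cost is that chains of overlapping peaks can be fused into one long region, so this simple rule is only a starting point.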
In addition, I investigated a potential link between G4 formation and DNA methylation. In mammals, addition of a methyl group to cytosine is a fundamental ‘epigenetic’ mechanism used to control gene expression and inheritance. This methylation occurs predominantly at CpG dinucleotides and is installed and maintained by enzymes called DNA methyltransferases (DNMTs), which play a critical role in gene regulation. Most CpGs tend to be highly methylated, but regions of high CpG density, so-called CpG islands (CGIs), are mostly depleted of DNA methylation. I showed that DNMT1 selectively binds to DNA G4s with considerably higher affinity than to double-stranded DNA substrates and that this interaction inactivates the enzyme in vitro. In human chromatin, DNMT1 is preferentially located at CGIs that are low in methylation. This suggests a model in which G4 structures recruit and sequester DNMT1 to shape the epigenetic landscape, such that G4s protect CGIs from DNA methylation and thereby promote gene expression.
A study of the direct interaction between transcription factors and G4 structures is ongoing. It is expected to provide new mechanistic details on how G4s directly contribute to the regulation of gene expression and to highlight the potential for chemical intervention in cancer therapy.
So far, the main findings have been disseminated in two peer-reviewed publications, in Nature Protocols (doi: 10.1038/nprot.2017.150) and in Nature Structural and Molecular Biology (doi: 10.1038/s41594-018-0131-8), as well as in multiple seminar talks. Genomic data acquired by next-generation sequencing were deposited in the Gene Expression Omnibus (GEO) repository under accessions GSE99205 and GSE107690.
In this project, I developed and further optimized methods to study DNA G4 formation and associated proteins in a normal cellular (chromatin) context. This is crucial because the local chromatin environment significantly contributes to the accessibility of G4 structures to proteins, and changes in the chromatin landscape during processes such as development, or in diseases such as cancer, can influence how G4s are formed. By mapping the positions of G4 formation in four different cancer cell lines, I uncovered substantial cell-type specificity, which suggests that G4s could be explored as potential biomarkers for cancer therapy.
I validated several QAPs that had been predicted to bind G4s based on in vitro experiments, and identified and characterized multiple new QAPs. I also have preliminary evidence that could change our understanding of how transcription factors are recruited to their gene targets. Another highlight is that I uncovered a direct interaction between G4s and DNMT1 and revealed a functional link between G4 formation and DNA methylation. This finding is a step-change for the field, providing strong evidence for a previously unknown functional link between G4s and epigenetics. In summary, my findings provide a novel perspective on the relationships between G4s, transcription factors, epigenetic markers and transcription, and will inspire further studies to unravel the underlying mechanistic details. Future studies will investigate the potential of small-molecule G4 ligands to directly modulate G4-dependent expression of cancer-related genes as a therapeutic strategy for cancer.