While the Human Genome Project revealed that an unexpectedly small fraction of the genome is dedicated to protein-coding genes, ENCODE and related projects revealed that at least 70% of the genome is actively transcribed. This led to the discovery of a new class of RNAs known as long non-coding RNAs (lncRNAs). Rapidly accumulating evidence strongly suggests that lncRNAs have important autonomous activities as RNAs. Their emerging functions are predominantly in development, differentiation and pluripotency, processes with critical links to human health. Thus, understanding the mechanisms of action of lncRNAs is a high-priority frontier in biology. Toward the long-term goal of a molecular understanding of lncRNA function, we are using a set of biochemical and structural approaches to elucidate how lncRNAs organize aspects of the nucleus to regulate and coordinate gene expression.

Nuclear Organization via hnRNP U

The molecular basis of the interaction between hnRNP U and the Xist and Firre RNAs will be investigated. These lncRNAs are essential for X-chromosome inactivation (Xist) and adipogenesis (Firre), and both act through mechanisms that require a direct interaction with hnRNP U, which localizes the lncRNA to the correct chromosome or loci. Selective RNA binding by the RGG domain of hnRNP U will be explored using a combination of "bottom-up" and "top-down" biochemical approaches together with structural approaches, with the aim of understanding the molecular details of protein-RNA recognition by this critical but poorly understood domain. The results of these studies will illuminate a number of RNA-protein interactions mediated by RGG domains that are central to human RNA metabolism.

Molecular Decoys

The second portion of the project focuses on the decoy/guide model for lncRNA function by investigating the interactions of two key transcription factors, glucocorticoid receptor (GR) and Sox2, with the lncRNAs that are proposed to modulate the specificity and regulatory activity of these proteins. According to this model, pervasive transcription at promoters and enhancers can either titrate a transcription factor away from its dsDNA-binding site or, conversely, help to recruit and localize a transcription factor to a target. To reveal the sequences and/or structures of lncRNAs that these dsDNA-binding proteins can recognize, an in vitro selection-based approach will be used, and the results will be correlated with transcriptomic studies. In parallel, traditional biochemical and structural approaches will be used to understand the molecular details of these protein-RNA interactions with known lncRNA targets. These studies will yield direct insights into how key transcription factors that have classically been regarded as solely dsDNA-binding factors also interact with RNA as a critical part of their gene regulatory mechanism.
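
The decoy half of this model can be illustrated with a simple equilibrium calculation. The sketch below is not part of the proposed experiments and covers only titration, not the guide (recruitment) case. It assumes a single DNA site present at trace concentration, a decoy RNA in large excess over the transcription factor, and independent 1:1 binding. The function names, dissociation constants and concentrations are all hypothetical values chosen for illustration, not measurements for GR or Sox2.

```python
def free_protein(p_total, rna_total, kd_rna):
    """Free transcription factor concentration (nM).

    Assumes the decoy RNA is in large excess over the protein, so free RNA
    is approximately total RNA, and that the DNA site is too dilute to
    deplete the protein pool.
    """
    return p_total / (1.0 + rna_total / kd_rna)


def dna_occupancy(p_total, rna_total, kd_dna, kd_rna):
    """Fraction of DNA sites bound by the factor in the presence of decoy RNA."""
    p_free = free_protein(p_total, rna_total, kd_rna)
    return p_free / (kd_dna + p_free)


if __name__ == "__main__":
    # Hypothetical parameters (nM): 10 nM factor, Kd(DNA) = 10 nM, Kd(RNA) = 100 nM.
    for rna in (0.0, 100.0, 1000.0):
        occ = dna_occupancy(p_total=10.0, rna_total=rna, kd_dna=10.0, kd_rna=100.0)
        print(f"decoy RNA = {rna:7.1f} nM -> DNA site occupancy = {occ:.3f}")
```

Under these assumptions, occupancy of the DNA site falls as the decoy RNA concentration rises, even when the RNA binds more weakly than the DNA site. This is the qualitative behavior the titration arm of the model predicts.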