AI Deep Learning Meets Enhancers: A Deep Learning Model Designs Enhancers for Single Cell Targeting

Introduction

In eukaryotic cells, gene expression is controlled through proximal regulatory elements in the DNA, such as promoters, which contain binding sites for transcription factors (TFs) that modulate chromatin remodeling and facilitate access for the transcriptional machinery. Regulatory regions can also lie far from the genes they control, up to millions of bases away or even on a different chromosome. These regions, called ‘enhancers’, can mediate the transcriptional activity of their associated gene(s). Interestingly, some enhancers are themselves transcribed in response to activation signals, producing long non-coding RNAs (lncRNAs) called enhancer RNAs (eRNAs). While the exact role of eRNAs is debated, recent studies suggest they induce changes in chromatin conformation that facilitate a physical looping interaction between the enhancer and the target gene promoter. This looping brings TFs, coregulators, and RNA Pol II at the enhancer together with the basal transcriptional machinery at the promoter. Disrupting either eRNA production or the looping mechanism can hinder gene transcription.

Active or poised enhancers typically reside in open chromatin regions, allowing for optimal TF and coregulator accessibility. This characteristic allows researchers to identify putative enhancers by searching for clusters of TF binding motifs or highly conserved sequences outside annotated genes. However, the presence of such motifs or conserved sequences doesn’t guarantee enhancer activity, as many TF binding motifs might not be functionally bound, or their binding might be cell-type specific.
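The motif-cluster search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two motifs (TF_A, TF_B), the 50-base window, and the two-hit threshold are hypothetical placeholders, and a real analysis would typically use position weight matrices rather than exact string matches.

```python
# Toy motif-cluster scan. The motifs below are illustrative placeholders,
# not curated transcription factor binding models.
TOY_MOTIFS = {"TF_A": "CACGTG", "TF_B": "TGACTCA"}


def find_motif_hits(sequence, motifs):
    """Return sorted (position, motif_name) pairs for exact matches."""
    sequence = sequence.upper()
    hits = []
    for name, motif in motifs.items():
        start = sequence.find(motif)
        while start != -1:
            hits.append((start, name))
            start = sequence.find(motif, start + 1)
    return sorted(hits)


def find_clusters(hits, window=50, min_hits=2):
    """Return (start, end, count) for windows holding at least min_hits hits."""
    clusters = []
    for i, (start, _) in enumerate(hits):
        in_window = [pos for pos, _ in hits[i:] if pos - start < window]
        if len(in_window) >= min_hits:
            clusters.append((start, in_window[-1], len(in_window)))
    return clusters


# Two motifs close together, then a lone motif far downstream.
sequence = "AAAACACGTGAAAATGACTCAAAAA" + "G" * 100 + "CACGTG"
hits = find_motif_hits(sequence, TOY_MOTIFS)
clusters = find_clusters(hits)
print(hits)
print(clusters)
```

Only the two neighboring hits form a cluster; the isolated downstream motif does not. As the paragraph above notes, even a flagged cluster is just a candidate, since motif presence alone does not establish enhancer activity.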

Despite these significant advances in our understanding of enhancer biology, deciphering the intricate regulatory code within enhancers and its impact on gene expression patterns remains a challenge. In a recent publication in Nature, Ibrahim et al. (2023) used deep learning to design enhancers. Their approach generates synthetic, cell-type-specific enhancers, making it possible to explore enhancer features in unprecedented detail. They also validated these synthetic enhancers in vivo, demonstrating that they can target specific cell types within the fruit fly brain.

AI Deep Learning Deciphers the Enhancer Code: Designing Single Cell Regulators for Melanoma Cancer Treatment

Cellular identity hinges not only on genetic transcription but also on the nuanced orchestration of transcriptional enhancers and the differential expression of transcription factors. These enhancer codes play a pivotal role in delineating cell types and functions. A better understanding of the enhancer codes is not only scientifically important but also holds significant promise for elucidating noncoding genome variations and devising cell-specific therapies. In a recent publication, Ibrahim et al. (2023) leveraged deep learning models and integrative genomics to understand enhancer codes, with a particular focus on their role in melanoma, a cancer type characterized by diverse cellular states.
To improve their understanding of the enhancer code, Ibrahim et al. (2023) developed Deep melanocyte-like melanoma 2 (DeepMEL2), a deep learning model trained on DNA sequences and chromatin accessibility data obtained from a range of human patient-derived cell lines. DeepMEL2 proved to be a robust tool, accurately predicting enhancer functions and revealing insights into the underlying architecture and transcription factor binding sites within enhancers. Expanding their scope, the researchers examined enhancer codes across various species, shedding light on the conservation and evolutionary dynamics of these regulatory elements. These findings advanced our understanding of enhancer turnover and highlighted the impact of nucleotide substitutions on enhancer functionality.
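To make the idea of sequence-based prediction and substitution effects more concrete, here is a minimal sketch of one-hot encoding a DNA sequence and scoring every single-nucleotide substitution (often called in silico saturation mutagenesis). It is not DeepMEL2: the `toy_score` function is a hypothetical stand-in that simply counts one made-up motif, where a real workflow would call a trained model.

```python
BASES = "ACGT"


def one_hot(sequence):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    return [[1 if base == b else 0 for b in BASES] for base in sequence.upper()]


def toy_score(sequence):
    """Hypothetical stand-in for a trained model: counts one toy motif."""
    return sequence.upper().count("CACGTG")


def saturation_mutagenesis(sequence, score_fn):
    """Score every single-nucleotide substitution relative to the original."""
    reference = score_fn(sequence)
    effects = {}
    for pos, original in enumerate(sequence):
        for base in BASES:
            if base != original:
                mutant = sequence[:pos] + base + sequence[pos + 1:]
                effects[(pos, original, base)] = score_fn(mutant) - reference
    return effects


sequence = "TTCACGTGTT"
encoded = one_hot(sequence)
effects = saturation_mutagenesis(sequence, toy_score)

# Positions where at least one substitution lowers the score.
damaging = sorted({pos for (pos, _, _), delta in effects.items() if delta < 0})
print(damaging)
```

With this toy scorer, only substitutions inside the motif lower the score, which mirrors how such scans are used to localize the nucleotides a model treats as important.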

Shifting focus to a more complex system, Ibrahim et al. (2023) turned to the Drosophila brain, using single-cell chromatin accessibility data to train DeepFlyBrain. This model provided valuable insights into the code of cell type-specific neuronal and glial enhancers, helping to decipher the regulatory diversity within neuronal populations. Finally, the researchers explored synthetic enhancer design, using their deep learning models to create functional enhancers that target specific cell types. Through directed sequence evolution, motif implantation, and generative design strategies, they demonstrated the potential to engineer enhancers tailored for diverse biological contexts, from fruit fly brains to human cancer cells. The approach suggests a generalizable method for manipulating gene expression across species, paving the way for a deeper understanding of enhancer biology and opening new avenues for targeted gene regulation.
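Two of the design strategies mentioned above, directed sequence evolution and motif implantation, can be illustrated with a toy example. This is a sketch under stated assumptions, not the authors' implementation: `toy_score` is a hypothetical scoring function that rewards partial matches to a made-up motif, standing in for a trained model's cell-type prediction, and the greedy one-mutation-per-step rule is just one simple way to run such a search.

```python
BASES = "ACGT"
TARGET_MOTIF = "CACGTG"  # hypothetical motif; a real run would query a trained model


def toy_score(sequence):
    """Stand-in for a model prediction: best partial match to TARGET_MOTIF."""
    width = len(TARGET_MOTIF)
    return max(
        sum(a == b for a, b in zip(sequence[i:i + width], TARGET_MOTIF))
        for i in range(len(sequence) - width + 1)
    )


def evolve(sequence, score_fn, max_steps=20):
    """Greedily accept the best single-nucleotide change until none helps."""
    history = [score_fn(sequence)]
    for _ in range(max_steps):
        best_seq, best_score = sequence, history[-1]
        for pos in range(len(sequence)):
            for base in BASES:
                if base == sequence[pos]:
                    continue
                mutant = sequence[:pos] + base + sequence[pos + 1:]
                score = score_fn(mutant)
                if score > best_score:
                    best_seq, best_score = mutant, score
        if best_seq == sequence:
            break  # no single substitution improves the score
        sequence = best_seq
        history.append(best_score)
    return sequence, history


def implant_motif(sequence, motif, position):
    """Overwrite part of the sequence with a motif at a chosen position."""
    return sequence[:position] + motif + sequence[position + len(motif):]


start = "A" * 12
evolved, history = evolve(start, toy_score)
implanted = implant_motif(start, TARGET_MOTIF, 3)
print(evolved, history)
print(implanted, toy_score(implanted))
```

Evolution reaches the maximum score one substitution at a time, while implantation gets there in a single edit by placing the motif directly. With a real model in place of `toy_score`, the same loop would search for sequences predicted to be active in a chosen cell type.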

Outsourcing Bioinformatics Analysis: How Bridge Informatics (BI) Can Help

At Bridge Informatics, we are passionate about empowering life science companies with the latest and most advanced technologies, including large language model (LLM)-inspired tools such as GPTs, to ensure they stay at the forefront of their fields. BI’s data scientists prioritize studying, understanding, and reporting on the latest developments so we can advise our clients confidently. Our bioinformaticians are trained bench biologists, so they understand the biological questions driving your computational analysis.

From pipeline development and software engineering to deploying your existing bioinformatic tools, BI can help you on every step of your research journey. As experts across data types from leading sequencing platforms, we can help you tackle the challenging computational tasks of storing, analyzing and interpreting genomic and transcriptomic data. Click here to schedule a free introductory call with a member of our team.


Haider M. Hassan, Data Scientist, Bridge Informatics

Haider is one of our premier data scientists. He provides bioinformatic services to clients, including high throughput sequencing, data pre-processing, analysis, and custom pipeline development. Drawing on his rich experience with a variety of high-throughput sequencing technologies, Haider analyzes transcriptional (spatial and single-cell), epigenetic, and genetic landscapes. Before joining Bridge Informatics, Haider was a Postdoctoral Associate at the London Regional Cancer Centre in Ontario, Canada. During his postdoc, he investigated the epigenetics of late-onset liver cancer using murine and human models. Haider holds a Ph.D. in biochemistry from Western University, where he studied the molecular mechanisms behind oncogenesis. Haider still lives in Ontario and enjoys spending his spare time visiting local parks. If you’re interested in reaching out, please email [email protected] or [email protected]

