Projects

We are interested in developing and applying computational methods and the latest genomics technologies to mine omics data, reveal unknown functions of the human genome, and understand their roles in neurological diseases.

Parkinson Brain Atlas: Deconstructing proximal disease mechanisms across cells, space, and progression

Genome-wide association studies (GWAS) have unequivocally linked thousands of noncoding variants to Parkinson’s disease (PD). Why have these breakthroughs not uncovered the mechanism(s) of common, genetically complex PD? We do not know how GWAS variants cause neurodegeneration, which variants are causal, and why they impair some brain cells but not others. Overwhelming evidence shows that cis-regulation of transcription is the most likely mediator of disease risk, and that it is finely tuned by cellular and dynamic context. Here we will pinpoint the causal gene(s) through which GWAS loci function in spatially barcoded, single brain cells and, dynamically, over pseudotime using single-nucleus expression quantitative trait locus (eQTL) analysis. Putative causal genes will be tested mechanistically in cell- and stage-specific analyses in vivo in Drosophila avatars and in vitro in human pluripotent stem cells. Moreover, we will identify the corresponding causal GWAS variants through allele-specific expression and CRISPR/Cas9 variant editing in single brain cells. This collaborative and highly integrated project will begin to reveal the complex human genetics of PD through a dynamic, multi-dimensional view of proximal cellular mechanisms across brain cells, brain space, and disease stage.

Funding: ASAP (PIs: Scherzer, Levin, Dong, Feany, Zhang), 2021-2024, $9M in total

As part of the ASAP CRN, our team has generated large-scale multi-modal datasets, including scRNAseq, scATACseq, spatial transcriptomics, genomics, and clinical data, for two brain regions (MTG and midbrain) of 100 human subjects (including healthy controls, prodromal cases, and PD cases). The rich datasets from this project, together with other shared omics data from the ASAP CRN, can be used to answer a wide range of biological questions related to Parkinson’s disease. We are looking for computational talent passionate about genomics and neuroscience to join us in exploring the data.

PDMAP: Systematic study of extracellular vesicles and their integrative analysis with Parkinson’s organoids

It has been 25 years since the first genetic cause of PD was identified, and yet there is still no effective treatment for the disease. One obstacle, we think, is the lack of models that assess early PD pathogenesis and therapy responses in a real neurophysiological environment. This is a significant bottleneck in our ability to make progress against the disease. Two lines of recent evidence motivate us to study PD pathogenesis in a real neurophysiological environment: (1) Human neuroimaging data and animal models both show that synaptic disruption precedes neuronal death, making the case that PD is a synaptopathy. (2) Many novel, regulatory, non-coding RNAs show linkage to PD pathogenesis. For instance, we found over 20,000 candidate enhancer RNAs (eRNAs) in dopamine neurons of human post-mortem brains (Dong et al. Nature Neuroscience, 2018). They significantly co-localized with PD risk variants. Another class of novel RNAs is circular RNAs (circRNAs), which are predominantly enriched in the brain, highly specific to the synapse, and ultra-stable (e.g., 10x longer half-life than linear RNAs). We identified >11,000 circRNAs actively expressed in dopamine neurons, many of which are significantly associated with PD pathology (Dong et al. Nature Communications, 2023). More importantly, circRNAs can form a regulatory network with lncRNAs and miRNAs, and can be packaged into extracellular vesicles (EVs) that cross the blood-brain barrier. Based on these findings, we hypothesize that regulatory RNAs, including circRNAs, eRNAs, miRNAs, and lncRNAs, can be detected in EVs and might play a role in the synaptic dysfunction of early PD pathogenesis.

In this study, we have teamed up with Dr. Luke Lee at Harvard Medical School, combining our expertise in brain organoids, PD biology, exosome analysis, single-cell omics, bioinformatics, and biomedical engineering to develop a new 3D brain organoid microphysiological analysis platform (MAP) that recapitulates the interconnectivity of dopamine neurons and enables systematic study of molecular neurodegeneration. We will (1) develop PD organoids and profile the transcriptome (including circRNAs, miRNAs, mRNAs, lncRNAs, etc.) of secreted EVs and the single-cell transcriptome of brain organoids to identify PD-associated RNAs, (2) map the pathophysiological dynamics of PD organoids in a novel, high-throughput, mini-brain-on-chip platform, and (3) integrate the temporal, multi-dimensional EV-organoid data to infer the PD-associated RNAs and their regulatory dynamics during PD pathogenesis.

Funding: NIH R01 (PIs: Lee, Dong), 2021-2026, $5M in total

Harvard PRECISION Human Pain Center - Data Core

Chronic pain affects >25 million Americans per year, with enormous impacts on both quality of life and productivity. Despite advances in our understanding of nociception in animal models, new and effective treatments for patients with chronic pain have been lacking. The poor translation between mouse and human pain targets has highlighted the limitations of animal models of pain. Recent advances in single-cell genomics and physiology directly in human tissue position pain researchers to make important new advances with improved opportunities for clinical translation. The Harvard PRECISION Human Pain Center proposes to leverage state-of-the-art single-cell technologies to characterize human nociceptor subtypes and how their gene expression patterns vary across diverse populations (Project 1), as well as in a chronic pain condition that clearly localizes to these cells: chronic phantom limb pain associated with painful neuromas (Project 2). The Projects will generate a wealth of data for the scientific community by closely integrating with 5 Cores tasked with 1) procuring high-quality human pain-related tissues following strict regulatory practices, 2) offering the latest single-cell multi-omic technologies, 3) performing advanced single-cell spatial transcriptomic analysis, 4) managing, integrating, and distributing all of the data, and 5) administering the Center and connecting it to the other PRECISION Human Pain Network Centers. As part of the NIH HEAL Initiative, the data generated by our Center will contribute to the broader PRECISION Human Pain Network and help identify and prioritize novel pain therapeutic targets for future investigation.

As one of the key components of the Center, the Data Core serves as a data hub: it builds data infrastructure and provides data management, coordination, analysis, and sharing for investigators within the Center and at other centers in the community. The projects will generate nearly 4,000 datasets spanning clinical data (pain scores, demographics), single-cell RNA-seq, single-cell ATAC-seq, SNP genotyping, spatial transcriptomics from MERFISH, and physiology for over 1,700 samples from 150 donors.

Funding: NIH U19 (Lead PI: Renthal) - Data Core (PI: Dong), 2022-2027, $13M in total

LINE1 RNA dysregulation in ALS/FTD

Dysfunction of RNA metabolism has emerged as a crucial factor in multiple neurodegenerative diseases, including frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS). The key pathologic hallmark of both diseases is the nuclear clearance and cytosolic aggregation of the RNA-binding protein (RBP) TAR DNA-binding protein 43 (TDP-43), which is found in up to 50% of FTD patients and 97% of ALS patients. In addition, TDP-43 proteinopathy occurs in 20-60% of Alzheimer’s disease (AD) cases. TDP-43 has multiple functions in mRNA processing. It is also implicated in regulating retrotransposon activation, but the molecular mechanism is not resolved.

Retrotransposons are mobile genetic elements that copy themselves by being transcribed into RNA, reverse-transcribed into DNA, and then inserted into new sites in the genome, a process known as retrotransposition. Long interspersed nuclear element-1 (LINE1) is the only currently active, autonomous family of retrotransposons in humans, and accounts for ~20% of the human genome. Only a small subset of LINE1s are thought to be mobile. The majority are inactive due to truncations, rearrangements, and point mutations. There is increasing interest in understanding the “noncoding RNA” functions of LINE1 RNA in chromatin state regulation.

In this collaborative project with Dr. Sun’s team at Johns Hopkins University, we aim to decipher the molecular mechanism of LINE1 RNA dysregulation, particularly the dysfunction of the LINE1 RNA decay pathway caused by TDP-43 loss of function. We will determine the causal relationship of LINE1 RNA elevation with chromatin accessibility, histone modification, and disruption of transcriptional gene networks. We will combine neurons differentiated from human induced pluripotent stem cells with patient-derived postmortem tissues to dissect the molecular mechanisms and associate them with TDP-43 proteinopathy. Our proposed study will provide a deeper mechanistic understanding of retroelement dysregulation and its role in functional genomics, which has been largely understudied. The findings will help us understand disease mechanisms and facilitate therapy development in neurodegenerative diseases.

Funding: NIH R01 (PIs: Sun, Dong), 2022-2027, $5M in total

Other unsponsored but interesting projects

More to life than gene expression

RNA processing beyond gene expression levels, such as splicing, polyadenylation, RNA modification, and editing, may play critical roles in brain diseases. We explore these processes in both bulk and single-cell RNA-seq data.

Intron size matters!

Human neuronal genes tend to have very long introns. Why? What are the unknown roles of long introns? We approach these questions from several angles, including evolution, transcription, and gene regulation.

RNAs transported to the synapse

Neuronal mRNAs are transported to the synapse for local translation. But what about non-coding RNAs, such as circRNAs, that are enriched at the synapse? How did they get there? What do they do there?