F49. DIFFERENTIAL DNA METHYLATION ENABLES ACCURATE PREDICTION FOR C9ORF72 PATHOGENIC REPEAT EXPANSION
European Neuropsychopharmacology (2024)
UC San Diego School of Medicine
Abstract
Background
The “GGGGCC” (G4C2) repeat expansion in the promoter region of C9orf72 is the most common genetic cause of Frontotemporal Dementia (FTD) and Amyotrophic Lateral Sclerosis (ALS). Previous studies have identified DNA hypermethylation in the C9orf72 promoter region among carriers of large repeat expansions. Given these prior findings, we conducted a genome-wide DNA methylation (DNAm) association study to identify differentially methylated probes associated with C9orf72 expansion status and applied these results to build a predictive regression model to identify repeat expansion carriers.
Methods
DNAm data were collected from 27 individuals with a pathogenic G4C2 repeat expansion (mean age = 62.2 years, SD 6.7) and 250 individuals without the expansion (mean age = 63.3 years, SD 8.4) from an FTD case/control cohort recruited in the Netherlands. DNAm was measured using the Illumina EPICv2 array, which assays the DNAm status of 937,055 CpG sites. After QC, we performed a genome-wide analysis to identify differentially methylated probes between subjects with and without the C9orf72 repeat expansion: DNAm values for each CpG probe were regressed against repeat expansion status, with age, sex, DNAm-derived cellular composition, DNAm-derived smoking score, and experimental batch as covariates. To predict C9orf72 expansion status, Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied to 23 CpG probes within the C9orf72 gene and 1 kilobase upstream and downstream. To account for prediction variability due to data splitting, we randomly split the data 100 times; each iteration involved training a model on 70% of the data and testing its accuracy on the remaining 30%.
Results
Genome-wide analysis yielded 9 CpG probes that were significantly associated with repeat expansion status, 8 of which were located near the repeat locus. These probes showed hypermethylation in repeat expansion carriers. The LASSO model identified nine CpG probes that predicted C9orf72 repeat expansion carrier status with an average accuracy of 98.6%. Restricting the model to only CpGs present on the EPICv1 array also yielded an average accuracy of 98.6%. Applying the LASSO model to CpGs on the Methyl450K DNAm array resulted in diminished accuracy (average Type I error rate of 63.4%). In another available cohort of 2,600 subjects with DNAm EPICv2 array data, we identified four C9orf72 repeat expansion carriers using the DNAm model. We are working to validate these findings by repeat-primed PCR in this group and by reviewing their phenotype data for neurological symptoms.
Discussion
Our results revealed significant methylation differences within the C9orf72 locus between repeat carriers and non-carriers. The high accuracy of this model suggests the potential of methylation as a biomarker for identifying C9orf72 expansions in clinical settings and in large DNAm cohort studies. Our study highlights the potential use of DNAm biomarkers for the detection of pathogenic repeat expansions and/or genomic variants.
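The repeated-split prediction scheme described in the Methods can be sketched as follows. This is a minimal illustration on synthetic beta values (the real EPICv2 data are not available here); scikit-learn's L1-penalized logistic regression stands in for the LASSO classifier on a binary outcome, and the probe count (23 CpGs), sample sizes (27 carriers, 250 controls), and the 100 random 70/30 splits follow the abstract. The simulated hypermethylation at 8 probes is an assumption made only to give the toy data a recoverable signal.

```python
# Sketch of the repeated 70/30-split LASSO evaluation on synthetic DNAm data.
# Assumption: L1-penalized logistic regression serves as the "LASSO" model
# for the binary expansion-carrier outcome; all data below are simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic beta values for 23 CpG probes near C9orf72; carriers are
# simulated as hypermethylated at the first 8 probes (illustrative only).
n_carriers, n_controls, n_cpgs = 27, 250, 23
X_controls = rng.normal(0.2, 0.05, size=(n_controls, n_cpgs))
X_carriers = rng.normal(0.2, 0.05, size=(n_carriers, n_cpgs))
X_carriers[:, :8] += 0.3  # hypermethylation signal at repeat-adjacent probes
X = np.vstack([X_controls, X_carriers]).clip(0, 1)
y = np.array([0] * n_controls + [1] * n_carriers)

# 100 random 70/30 train/test splits, averaging test accuracy across splits.
accuracies = []
for seed in range(100):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=seed)
    model = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    model.fit(X_tr, y_tr)
    accuracies.append(model.score(X_te, y_te))

print(f"mean accuracy over 100 splits: {np.mean(accuracies):.3f}")
```

Stratified splitting is used here because the carrier class is small (27 of 277); without it, some random test sets could contain very few carriers and make per-split accuracy estimates unstable.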