INTRODUCTION
Neuropsychiatric disorders affect approximately 30% of the population worldwide1-3. Schizophrenia (SCZ) and dementia, which are often related, are two of the most common neuropsychiatric diseases4, and epidemiological studies indicate that patients diagnosed with SCZ have a 2-fold increased risk of dementia compared with non-schizophrenic individuals5. Unfortunately, the etiology of these complex diseases remains to be fully elucidated. Genome-wide association studies (GWAS) have helped to explain approximately 12% of the phenotypic variation of these complex disorders, including SCZ and dementia6,7, which points to an apparent missing heritability8. One approach to finding this missing heritability is to investigate rare highly-damaging variants (RHdv) and novel variants (Nv), which are not routinely considered in GWAS analyses. Several research groups have pursued this approach using next-generation sequencing (NGS)9. One limitation is that RHdv and Nv are potentially population-specific10,11. The cataloguing of genetic variation in Mexican populations is still at an early stage, particularly for RHdv and Nv12. This study aimed to explore by NGS the presence of novel and damaging variants in 184 genes in 19 Mexican patients diagnosed with dementia or SCZ.
METHODS
Study population
Nineteen individuals from the Geriatric Clinic at the Psychiatric Hospital Fray Bernardino Alvarez and the Group of Medical and Family Studies Carracci in Mexico City, Mexico, were invited to participate between 2011 and 2013. Of them, seven were diagnosed with late-onset dementia of probable Alzheimer's type (DAT) and 12 with SCZ. All participants signed informed consent. The study protocol complied with the Helsinki Declaration and was approved by the Ethics and Research Committee at the National Institute of Genomic Medicine (No. IMG/DI/136/2014).
DAT patients completed a demographic questionnaire and were evaluated by a geriatric psychiatrist at the Psychiatric Hospital Fray Bernardino Alvarez. Dementia was diagnosed according to the DSM-IVR criteria, based on memory impairment plus impairment of at least one other cortical function13. All the patients had a family history of Alzheimer's disease in at least one first- or second-degree relative, and fulfilled the criteria for probable Alzheimer's diagnosis according to the National Institute of Neurological Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association14. The patients were evaluated using the following scales: mini-mental state examination (MMSE), NEUROPSI, clock-drawing test, DIPAD, and the clinical dementia rating15-20.
Patients with paranoid SCZ were recruited from the Group of Medical Studies Carracci; all patients had a family history of at least one first- or second-degree relative diagnosed with SCZ. Patients were evaluated with the diagnostic interview for genetic studies21, a structured interview that covers the disorders contained in Axis I of the DSM-IVR. In this respect, few changes have been made to the SCZ diagnosis in the latest version of the DSM22. Furthermore, when the medical record of the patient was available, we recorded a structured sequence of the response to the medications taken. We established criteria for treatment resistance, as previously published23. Positive and negative symptoms were evaluated with the SAPS and SANS scales, and cognitive function was evaluated with the MMSE24. The APOE-E4 variant is the most extensively validated genetic marker associated with cognitive decline. To account for this variant, all the included individuals (i.e., 7 DAT and 12 SCZ) were negative for the E4 allele of APOE; APOE status was determined by real-time PCR, as previously described25.
Targeted NGS
Genomic DNA was extracted from peripheral leukocytes using the Gentra Puregene commercial kit (QIAGEN, USA). We designed synthetic probes for NGS, targeting genes associated with dementia, SCZ, and several pharmacogenetic targets. The selection of genes was based on a literature search for published works reporting an effect of common or rare variants on SCZ, dementia, or drug response to different antipsychotics or antidementia drugs6,7,26-36; a list of the captured genes is reported in Supplementary Table 1. Gene capture was performed using the Haloplex target enrichment system (Agilent Technologies, USA), covering 1.51 Mb with 40,754 amplicons. Sequencing libraries were generated according to the manufacturer's protocol (version D.5, May 2013). Briefly, all DNA samples (a total of 225 ng for each sample) were digested with eight pairs of restriction enzymes; the fragmentation pattern was analyzed in a 2100 Bioanalyzer (Agilent Technologies, USA). DNA fragments were hybridized with Haloplex synthetic probes for library enrichment, and adapters were ligated by PCR. Then, library quality (fragment size and concentration) was assessed using a 2100 Bioanalyzer, as previously described37. Sequencing was performed using a NextSeq500 system (Illumina, USA), aiming for 200x depth of coverage in paired-end reads.
Variant | dbSNP | Gene | Reference MAF (A) | Mendelian (B) | Complex (C) |
---|---|---|---|---|---|
Missense variants | | | | | |
NP_004792.1: p.Pro108Ala | rs199784029 | NRXN1 | 0.0008 | Pitt-Hopkins-like syndrome-2 [55,56] | Autism spectrum disorder [60-64] and schizophrenia [65] |
NP_742054.1: p.Val33Met | rs765679790 | KCNH2 | 0.000008 | Long QT Syndrome 2 [57,58] | Schizophrenia treatment response [66,67] and lower intellectual coefficient in schizophrenia [68] |
NP_000515.2: p.Ala155Gly | rs145641566 | HTR1A | 0.0005 | Periodic fever, menstrual cycle-dependent [59] | Alcohol and nicotine dependence [69,70] and Alzheimer's disease with alcohol dependence comorbidity [69] |
NP_001748.1: p.Gly195Arg | rs146758729 | CBR1 | 0.0015 | NR | Drug toxicity [71,72] |
Non-coding variants | | | | | |
NT_187607.1: g.1782677C>T | rs28363996 | ABCC1 | 0.0003 | NR | Drug resistance [73-76] |
(A) Reference MAF: minor allele frequency reported in gnomAD or the 1000 Genomes Project.
(B) Mendelian: LoF variant reported to cause a disorder with Mendelian inheritance.
(C) Complex: common or rare variants reported to be associated with neuropsychiatric disorders. NR: not reported.
Bioinformatic analyses
First, for quality control, we used Trimmomatic to eliminate reads with a Phred quality score <25 or a length below 55 bp; indexes, adaptors, and 5 bp at both read ends were trimmed according to general practices38. We then aligned reads to the human genome using BWA39 and SMALT, with GRCh37/hg19 as reference40. InDel realignment, base recalibration, and variant calling were done following the GATK best-practices recommendations41,42. HaplotypeCaller was used for SNV detection, and copy number variants (CNV) were detected using the pipeline implemented by XHMM43. A total of 1274 variants were called with both aligners, and these were used for the following analyses. Variants were confirmed visually in the Integrative Genomics Viewer (IGV) and annotated using dbSNP version 14737,44.
Selection of rare and novel damaging variants
Variants were registered if detected in at least one SCZ or DAT patient, as heterozygous or homozygous. Variants were annotated using different databases, including dbSNP, OMIM, ClinVar, gnomAD, rebuild, and 1000 Genomes, with the Variant Effect Predictor45, which allowed prediction of the functional impact by querying different algorithms and databases (SIFT, PolyPhen-2, FATHMM, CADD, GeneSplicer, and splice region)46-53. As possible pathogenic variants, we selected loss-of-function (LoF) variants (frameshift, stop gained, splice-site acceptor, and splice-site donor) and missense variants predicted to be damaging by all three algorithms (i.e., SIFT, FATHMM, and PolyPhen-2); coding synonymous variants and non-coding variants were selected if the CADD score was higher than 25. After filtering these variants, we included all the Nv, and for previously reported ones, we only included rare variants (minor allele frequency <0.1%), using the Genome Aggregation Database (gnomAD) and the 1000 Genomes Project databases as reference for population allele frequency. ClinVar, OMIM (Online Mendelian Inheritance in Man), and our own search in PubMed were used as reference for clinical significance and disease-associated variants. A variant was considered novel (Nv) when it had not been previously reported. We used the Human Genome Variation Society (HGVS) nomenclature via the web tool Mutalyzer54, and included the dbSNP (version 147) rs identifier for the non-novel variants.
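The step of keeping only variants called with both aligners can be pictured as a set intersection. The following is a minimal sketch, not the pipeline used in the study: it assumes each call has already been reduced to a (chromosome, position, reference, alternate) key, and the `Variant` type, the `concordant_calls` helper, and the toy coordinates are all hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal key identifying a called variant."""
    chrom: str
    pos: int
    ref: str
    alt: str


def concordant_calls(calls_a: Iterable[Variant],
                     calls_b: Iterable[Variant]) -> List[Variant]:
    """Return the variants present in both call sets, in a stable order."""
    shared = set(calls_a) & set(calls_b)
    # Chromosome names are compared as strings here (an assumption of this sketch).
    return sorted(shared, key=lambda v: (v.chrom, v.pos, v.ref, v.alt))


# Toy call sets standing in for the BWA-based and SMALT-based results.
bwa_calls = [Variant("7", 1000, "C", "T"), Variant("12", 2000, "G", "A")]
smalt_calls = [Variant("12", 2000, "G", "A"), Variant("8", 3000, "T", "C")]

shared = concordant_calls(bwa_calls, smalt_calls)
print(shared)
```

Matching on the exact allele pair as well as the position is a deliberately strict choice; a real comparison would also need to normalize indel representation before intersecting.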
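The selection criteria described above can be summarized as a two-stage filter: first a predicted-damaging test that depends on the variant class, then a rarity or novelty test. The sketch below is illustrative only and assumes annotations are already available as simple fields; the `AnnotatedVariant` record, the consequence labels, and the convention that a missing allele frequency marks a novel variant are assumptions of this example, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical consequence labels for the four LoF classes named in the text.
LOF_CONSEQUENCES = {"frameshift", "stop_gained", "splice_acceptor", "splice_donor"}
CADD_THRESHOLD = 25.0   # synonymous and non-coding variants need CADD > 25
RARE_MAF = 0.001        # minor allele frequency < 0.1%


@dataclass
class AnnotatedVariant:
    consequence: str
    sift_damaging: bool = False
    polyphen_damaging: bool = False
    fathmm_damaging: bool = False
    cadd: Optional[float] = None
    maf: Optional[float] = None  # None is used here to mean "not previously reported"


def is_predicted_damaging(v: AnnotatedVariant) -> bool:
    if v.consequence in LOF_CONSEQUENCES:
        return True
    if v.consequence == "missense":
        # All three predictors must agree.
        return v.sift_damaging and v.polyphen_damaging and v.fathmm_damaging
    # Coding synonymous and non-coding variants rely on the CADD score alone.
    return v.cadd is not None and v.cadd > CADD_THRESHOLD


def is_rare_or_novel(v: AnnotatedVariant) -> bool:
    return v.maf is None or v.maf < RARE_MAF


def prioritize(variants: List[AnnotatedVariant]) -> List[AnnotatedVariant]:
    return [v for v in variants if is_predicted_damaging(v) and is_rare_or_novel(v)]


# Toy records with made-up scores and frequencies.
candidates = [
    AnnotatedVariant("frameshift", maf=0.0002),
    AnnotatedVariant("missense", sift_damaging=True, polyphen_damaging=True, maf=0.0005),
    AnnotatedVariant("missense", sift_damaging=True, polyphen_damaging=True,
                     fathmm_damaging=True),
    AnnotatedVariant("synonymous", cadd=27.1, maf=0.02),
    AnnotatedVariant("intron", cadd=26.0, maf=0.0003),
    AnnotatedVariant("intron", cadd=12.0),
]
kept = prioritize(candidates)
print([v.consequence for v in kept])
```

In this toy input the second record fails because only two of the three predictors flag it, the fourth because its frequency is above the rarity cutoff, and the last because its CADD score is too low.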
RESULTS
Summary of the total detected variants in the sample
Bioinformatic analyses detected 1274 variants in 184 genes, with an average depth of 96x (range: 55x-120x) and 91.2% coverage. Of these, 1148 were SNVs, 126 indels, and only one CNV on the RELN gene. A total of 149 variants (11.7%) were located in coding regions and 1125 (88.3%) in non-coding regions. Frequency analyses showed that more than half of all variants (735 variants) were common (minor allele frequency >5%). We also identified 86 Nv. The genes with the highest number of Nv were PTGER3 (21 Nv), SLC6A3 (5 Nv), and ADD1 (5 Nv).
Rare and novel variants in patients with DAT
In three of seven DAT patients (42.9%), we detected five damaging variants in five genes (NRXN1, HTR1A, KCNH2, CBR1, and ABCC1) (Table 1). Novel or LoF variants were not observed. Four variants were missense: NRXN1 (p.Pro108Ser), HTR1A (p.Ala155Gly), KCNH2 (p.Val33Met), and CBR1 (p.Gly195Arg), and one was intronic: ABCC1 (g.1782677C>T). LoF variation in three genes (NRXN1, KCNH2, and HTR1A) has been reported to cause syndromes with Mendelian inheritance (Pitt-Hopkins-like syndrome-2, Long QT Syndrome 2, and menstrual cycle-dependent periodic fever), while CBR1 and ABCC1 have been reported in drug response. Furthermore, common variation in NRXN1 and KCNH2 has been previously associated with neuropsychiatric disorders (SCZ, autism spectrum disorder, and drug abuse and dependence), and only common variation in HTR1A has been previously associated with DAT. One DAT patient, DAT 1, carried three of the five damaging variants, in NRXN1, KCNH2, and ABCC1. This patient had the lowest MMSE score (7), with almost all cognitive areas affected. A summary of sociodemographic and clinical characteristics of the patients carrying the variants is shown in Supplementary Table 2.
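The proportions reported in this paragraph follow directly from the counts. As a quick worked check, the snippet below recomputes them from the numbers given in the text; the `pct` helper is a hypothetical convenience function written for this illustration.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


total_variants = 1274
snvs, indels = 1148, 126
coding, noncoding = 149, 1125
common = 735

# SNVs plus indels, and coding plus non-coding, each account for all 1274 variants.
print(snvs + indels, coding + noncoding)

print(pct(coding, total_variants))     # share located in coding regions
print(pct(noncoding, total_variants))  # share located in non-coding regions
print(pct(common, total_variants))     # share with minor allele frequency >5%
```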
Variant | dbSNP | Gene | Reference MAF (A) | Mendelian (B) | Complex (C) |
---|---|---|---|---|---|
LoF | | | | | |
NP_001139.3: p.Thr3457Hisfs | rs750143580 | ANK2 | 0.00006 | Long QT syndrome 4 [77] and Cardiac Arrhythmia (Ankyrin-B-related) [78] | Bipolar disorder with binge-eating [89] |
NC_000007.13: g.103130984_103474463del | NR | RELN | NR | Lissencephaly 2 [79,80] and Familial Temporal Lobe Epilepsy 7 [81] | Schizophrenia, autism spectrum disorder [90] and Alzheimer's disease [91,92] |
NP_059488.2: p.Pro488Thrfs | rs67666821 | CYP3A4 | 0.0002 | NR | Treatment response in schizophrenia [28] |
NC_000010.10: g.92617169_92617170ins | NR | HTR7 | NR | NR | Alzheimer's disease [93] |
Missense variants | | | | | |
NP_001158010.1: p.Arg418His | rs144959108 | DISC1 | 0.0007 | NR | Schizophrenia [94,95] and Alzheimer's disease [70] |
NP_001062.1: p.Gly246Ala | NR | TYMS | NR | NR | Alzheimer's disease [96] |
NP_000758.1: p.Lys139Glu | rs12721655 | CYP2B6 | 0.0023 | NR | Nicotine dependence [97] |
Coding synonymous and non-coding variants | | | | | |
NP_001305298.1: p.Ala282= | rs77029901 | SLC6A5 | 0.0003 | Hyperekplexia 3 [82,83] | Schizophrenia [98] |
NP_005948.3: p.Thr139= | rs2066466 | MTHFR | 0.0032 | Homocystinuria due to MTHFR deficiency [84] | Neural tube defects [99,100] and schizophrenia [101,102] |
NC_000012.11: g.13769306G>A | NR | GRIN2B | NR | Autosomal dominant mental retardation [85-87] and early infantile epileptic encephalopathy [85,88] | Schizophrenia and autism spectrum disorder [103] |
NC_000008.10: g.32405771T>C | NR | NRG1 | NR | NR | Schizophrenia [61,104-107] |
NC_000011.9: g.27722838A>G | rs79141432 | BDNF | 0.0024 | NR | Schizophrenia [108] |
NC_000012.11: g.16228314T>C | NR | ABCC1 | NR | NR | Drug resistance [73-76] |
(A) Reference MAF: minor allele frequency reported in gnomAD or the 1000 Genomes Project.
(B) Mendelian: LoF variant reported to cause a disorder with Mendelian inheritance.
(C) Complex: common or rare variants reported to be associated with neuropsychiatric disorders. NR: not reported.
LoF: loss-of-function variants.
Rare and novel variants in patients with SCZ
In schizophrenic patients, we identified 13 variants in 13 genes: ANK2, CYP3A4, RELN, HTR7, DISC1, TYMS, CYP2B6, MTHFR, NRG1, SLC6A5, BDNF, GRIN2B, and ABCC1 (Table 2). Of these, four were LoF variants in ANK2, CYP3A4, RELN, and HTR7; three were missense variants in DISC1, TYMS, and CYP2B6; and six were coding synonymous or non-coding variants in MTHFR, NRG1, SLC6A5, BDNF, GRIN2B, and ABCC1. We identified six Nv, which represented almost half of all variants detected in this patient group. Ten of the 12 included individuals (83.3%) were carriers of a damaging variant. Previously, LoF variants in ANK2, RELN, SLC6A5, MTHFR, and GRIN2B have been reported to cause syndromes with Mendelian inheritance (Table 2). Interestingly, the patient carrying the variant in DISC1 had the lowest cognitive function (mini-mental state = 15), and a patient carrying the LoF variant in CYP3A4 had treatment-resistant SCZ. A summary of genetic variations and clinical and sociodemographic data of patients with SCZ is presented in Supplementary Table 3.
DISCUSSION
Here, we present a targeted next-generation sequencing analysis to explore the existence of rare and novel damaging variants in patients with SCZ or DAT. Clearly, one of the main limitations of this study is the low number of patients included. However, as an exploratory study, it yielded results that could prompt future studies with larger sample sizes. To the best of our knowledge, there are no reports using NGS to identify rare and novel gene variation for neuropsychiatric disorders in Mexican patients.
Our analyses showed that almost 10% of the targeted genes carried a rare or novel damaging variant. Genes coding for drug-metabolizing enzymes (DME) (CBR1, CYP3A4, TYMS, CYP2B6, and MTHFR) and genes involved in neurodevelopmental processes (ANK2, RELN, DISC1, NRXN1, NRG1, and BDNF) represented the two main pathways with relevant variation in these patients. Variants in ANK2, RELN, and NRXN1 have been associated with syndromes with Mendelian inheritance affecting neurodevelopmental mechanisms, which suggests that they may have a strong influence on the etiology of DAT or SCZ. The overall effect of these variants on the etiology of neuropsychiatric disorders is still under study, although some hypotheses have been proposed. For instance, a recent whole-exome sequencing (WES) and whole-genome sequencing (WGS) analysis of neuropsychiatric patients has proposed that an increase of damaging variants in these genes could decrease the age of Alzheimer's onset109, and that the age of onset of SCZ and autism-spectrum disorders could be influenced by the accumulation of de novo variants in genes involved in neurodevelopmental processes110,111.
The effect of DME on brain processes has been understudied. Nevertheless, some, including CYP1A, CYP2B, CYP2C, and CYP3A, have been functionally linked to brain development112,113. Regarding DME, two schizophrenic patients were homozygous carriers of the CYP3A4*20 (rs67666821) allele, and this variant was present in a patient with treatment-resistant SCZ23. CYP3A4*20 is an allele previously identified in the Brazilian population114, and it has been found at high allele frequency in the Spanish population (minor allele frequency = 0.012)115, but at low frequency in other European populations. This allele has been reported to affect the metabolism of clozapine, also associated with treatment-resistant SCZ116.
Regarding damaging variants in neurodevelopmental genes that could affect SCZ and cognitive ability, two patients diagnosed with SCZ were carriers of the rare DISC1 missense variant (p.Arg418His) and clearly manifested a cognitive disability. The DISC1 gene has been implicated in neurodevelopmental processes and in the development of normal cognitive function31. The product of this gene is heavily involved in brain cortex development, including symmetry and orientation of neurons117-120. Furthermore, a common variation in the DISC1 gene has been associated with Alzheimer's disease, reinforcing the notion that this gene could have a strong effect on cognitive development.
An interesting finding was that ABCC1 (ATP-binding cassette subfamily C member 1) was the only gene in which patients from both groups carried a variant. The patient diagnosed with DAT who was a carrier of the ABCC1 variant had a rapid cognitive decline, with severe manifestations of cognitive impairment. Likewise, the patient diagnosed with SCZ who was a carrier of a variant in this gene had a cognitive disability, mainly affecting memory function. ABCC1 expression has previously been implicated in the increased accumulation of amyloid-β in a mouse model of early Alzheimer's disease121. However, the effect of the observed novel and rare damaging variants on disease etiology remains a matter for future studies. The development of NGS technologies has enabled the screening of many genetic variants, revealing a large number that have not been previously reported. The substantial number of Nv found makes it impractical to functionally validate each one; for this reason, computational methods have been developed to predict the effect of a variant at the molecular level. Here, we presented a sequencing data analysis that used different algorithms to prioritize variants by their predicted damaging effect. We focused on those predicted by distinct algorithms to have a higher impact on disease etiology.
Our results may be limited by the small sample size; however, we explored genetic variation in 184 genes previously associated with neurodegenerative diseases and drug treatment. We located rare and novel damaging variants in 18 genes previously known to be involved in neuropsychiatric disorders in a Mexican population, and we discussed their potential role in these diseases. Future endeavors should focus on validating these observations.