1. Introduction
DNA methylation is one of the many epigenetic modifications associated with gene expression in higher eukaryotes, regulating DNA transcription and peptide synthesis through chromatin remodelling. The covalent modification is catalyzed by DNA methyltransferases (DNMTs) [1], enzymes that promote either de novo methylation (DNMT3A and DNMT3B) or maintenance methylation (DNMT1) following DNA replication. To date, only these three DNMTs with catalytic activity have been identified in the human genome [2]; a fourth member, DNMT3L, is a highly homologous nuclear protein that lacks the amino acid residues required for catalytic activity.
Normal function of DNMT enzymes is crucial for DNA methylation in both cellular development and mitosis. However, overexpression of DNMTs can lead to aberrant methylation patterns [3], which have been associated with many cancer types including lung, colorectal, prostate, breast, cervical and pancreatic cancer, among others [4-6]. Experimentally, it has been observed that DNMT inhibition reactivates silenced (hypermethylated) genes, particularly tumor-suppressor genes [7]. Hence, DNMTs are promising biological targets for the design of anti-cancer agents. DNMTs are also becoming relevant targets in other diseases such as diabetes, obesity [8], Alzheimer’s disease [9], and other central nervous system disorders [10].
De novo methylation is the main activity of the DNMT3A and DNMT3B enzymes [11], enabling key epigenetic modifications essential for processes such as cellular differentiation and embryonic development, transcriptional regulation, heterochromatin formation, X-chromosome inactivation, imprinting, and genome stability. Albeit very similar in structure, these two enzymes appear to have different roles in mammals [12].
In 2015, Medina-Franco et al. reviewed inhibitors of DNMTs with emphasis on inhibitors of DNMT1. In that work, the authors discussed different in silico studies used in the development of DNMT inhibitors. Examples of the techniques discussed were molecular docking, virtual screening, pharmacophore modelling, molecular dynamics, and similarity searching [13]. In 2016, Prieto-Martínez et al. discussed advances in computational approaches applied mostly to DNMT1 [14]. However, there are no reviews focused on the computational development of DNMT3A and 3B inhibitors, because the development of inhibitors of these two enzymes has been slower. In this work, we summarize recent developments of small-molecule inhibitors of DNMT3A and 3B as potential therapeutic agents or molecular probes. The review focuses on the role of computational methods that have contributed to such developments.
2. DNMTs: structure and mechanism
Mammalian DNMTs have a high level of homology, possessing a large N-terminal region of variable size and a C-terminal catalytic portion. Whereas the N-terminal region encodes regulatory functions, the C-terminal region is involved in both cofactor binding and substrate catalysis. These enzymes rely on S-adenosyl-L-methionine (AdoMet or SAM) as a methyl group donor and contain several highly conserved structural features such as the SAM binding pocket, the DNA-cytosine binding pocket, and a vicinal proline-cysteine pair important for the reaction mechanism. However, there are structural differences between DNMTs: while DNMT1 relies on its N-terminal domain [15] to recognize hemimethylated strands (hence its function in maintaining the methylation status), DNMT3A and DNMT3B do not show any preference for the methylation status of the DNA, which makes them de novo methyltransferases. The 3D coordinates of DNMTs with different domains have already been deposited in the Protein Data Bank (PDB) [16]. Based on this structural information, mostly obtained from X-ray crystallography, it is plausible to consider at least three types of inhibitors, namely: (a) molecules binding in the cofactor binding pocket (allosteric inhibition), (b) molecules binding in the DNA binding pocket (competitive inhibition), and (c) molecules interacting directly with the catalytic cysteine (irreversible inhibition). Fig. 1 shows the catalytic domains of DNMT3A and DNMT3B. For illustration purposes, this figure shows the DNA-binding domain of human and bacterial DNMT3A and 3B, respectively.
3. Modulators of DNMT3A and DNMT3B
To explore the chemical space of the current inhibitors, a molecular database of inhibitors of DNMT3A and DNMT3B was assembled following a previously published protocol [17]. Six public databases and two sources of scientific literature were explored. The public databases explored were (1) ChEMBL [18], (2) Therapeutic Target Database (TTD) [19], and (3) ChEpiMod [20] using the query text ’DNMT3A’ or ’DNMT3B’; (4) HEMD [21] with the information located in the enzyme browser, option DNA and submenu DNA (cytosine-5)-methyltransferase 3A or 3B; (5) Binding Database [22] searching in the IC50 menu, submenu DNA methyltransferase A or B; and (6) ChromoHub [23] clicking on DNMT, option ‘browse inhibitors’. To retrieve additional compounds not reported in those public databases, we also reviewed the scientific literature using Web of Science (webofscience.com) and Chemical Abstracts (scifinder.cas.org). After data curation, 269 unique structures with reported activity were identified. ChEMBL, HEMD and Chemical Abstracts were found to contain the information provided by the other chemical databases; therefore, only these three sources are shown in Table 1. The analysis of the database allowed the identification of two classes of inhibitors: (a) nucleoside and (b) non-nucleoside. Fig. 2 shows representative chemical structures of molecules that have been proposed as nucleoside and non-nucleoside inhibitors.
Chemoinformatic tools enable the management and mining of the chemical information in compound databases [24]. Computer-assisted molecular design methods have been used to calculate the distribution of molecular properties of inhibitors of DNMT1 [17,25]. A chemoinformatic analysis of modulators of DNMT3A/3B is therefore warranted. Such analyses contribute to the ongoing effort of charting the epigenetic-relevant chemical space (ERCS) [25].
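The curation step described above (pooling records from several sources and keeping unique structures) can be illustrated with a minimal sketch. It is not the published protocol [17]: the record layout, the `merge_sources` and `coverage` helpers, and the toy keys are all hypothetical, and the structure key is assumed to be an already standardized identifier (for example a canonical SMILES or an InChIKey) computed elsewhere.

```python
from collections import defaultdict


def merge_sources(sources):
    """Merge per-source compound records into unique structures.

    `sources` maps a source name to a list of (structure_key, target) pairs.
    `structure_key` is assumed to be a standardized structure identifier
    (hypothetical here); computing it is outside the scope of this sketch.
    """
    merged = defaultdict(lambda: {"sources": set(), "targets": set()})
    for source_name, records in sources.items():
        for structure_key, target in records:
            key = structure_key.strip()
            if not key:
                continue  # skip records without a usable structure
            merged[key]["sources"].add(source_name)
            merged[key]["targets"].add(target)
    return dict(merged)


def coverage(merged):
    """Count how many unique structures each source contributes to."""
    counts = defaultdict(int)
    for entry in merged.values():
        for source_name in entry["sources"]:
            counts[source_name] += 1
    return dict(counts)


# Toy, invented records; the keys are placeholders, not real compounds.
toy_sources = {
    "ChEMBL": [("KEY-A", "DNMT3A"), ("KEY-B", "DNMT3B")],
    "HEMD": [("KEY-A", "DNMT3A"), ("KEY-C", "DNMT3A")],
    "TTD": [("KEY-B", "DNMT3B"), ("", "DNMT3A")],
}
unique_structures = merge_sources(toy_sources)
per_source = coverage(unique_structures)
```

Tracking the set of sources per structure makes it straightforward to check whether a subset of databases already covers the content of the others, which is the kind of observation reported for Table 1.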
4. Role of computational methods in the development of DNMT inhibitors
Computational methods have become a cornerstone in the development of bioactive compounds [26]. The in silico development of inhibitors of DNMTs is no exception, as demonstrated by the emerging ‘Epi-informatics’ field [27]. Computer-aided approaches can be divided into two major groups depending on the experimental information used. Structure-based approaches rely on the availability of three-dimensional information of the molecular target, whereas ligand-based approaches depend mostly on the structure-activity information of small-molecule modulators. In this section, we discuss the progress of computational studies towards the development of DNMT3A/3B inhibitors. Several of the methods can be classified as structure-based, but representative ligand-based approaches have been developed as well.
4.1 Structure-based analysis of modulators of DNMT3A/3B
As of December 2016, the PDB contained 15 structures associated with DNMT enzymes of Homo sapiens: 12 for DNMT3A and three for DNMT3B (Table 2).
Abbreviations: SAH (S-Adenosyl-L-Homocysteine), EDO (Ethylene glycol), Zn (Zinc ion), BTB (Bis-Tris Buffer), SO4 (Sulfate ion), MLY (N-Dimethyl-lysine), SFG (Sinefungin), TPO (Phosphothreonine), UNX (Unknown Atom or ion), GOL (Glycerol), M3L (N-Trimethyllysine).
A number of studies have employed the structure of DNMT3A (PDB ID: 2QRV, 4U7P, 4U7T) as a starting point for the construction of a complex between DNMT3A and SAH [13,28-31]. Furthermore, several authors have reported homology models of DNMT3B built from the structure of DNMT3A, because no crystallographic structure that includes the catalytic domain of human DNMT3B has been reported [31-35]. These studies highlight the relevance of a set of amino acids such as Ser, Gly, Glu, Cys, Pro, Lys and Arg for the affinity of inhibitors for DNMT3B [30,33,36].
Erez-Rechavi et al. recently reported a clinical study coupled with a bioinformatics analysis that examined a mutation in the protein structure of DNMT3B. It was found that the structural constraints around Ala585 disfavor mutations or amino acid changes, largely because of the small size of this alanine residue. The small number of mutations in this region preserves its high degree of structural similarity, which optimizes the cofactor binding site and the catalytic efficiency [37].
In 2014, Sheng-Chao et al. reported that antroquinonol D, a ubiquinone derivative, may act as a selective DNMT1 inhibitor. In the study, a docking model of antroquinonol D with DNMT1 was superimposed on a crystallographic structure of DNMT3A and on a model structure of DNMT3B. It was observed that antroquinonol D could not fit in the binding-site cavity of DNMT3B. These results provided a rationale for the observed low inhibition of DNMT3B by antroquinonol D [35].
In a different study, Kuck et al. also superimposed the structures of DNMT1 and DNMT3B and found that the Gln89 residue in DNMT1 points towards the interior of the binding site, while the corresponding residue in DNMT3B (Asn652) is located about 4 Å outside the binding pocket. The authors concluded that Gln89 acts as a hydrogen bond donor available to bind inhibitors, whereas this hydrogen-bonding interaction is absent in the DNMT3B active site [32]. Combinations of structural domains of DNMTs have also been reported that emphasize the importance of arginine residues in the interaction of DNA with DNMT3A [38].
4.2 Molecular docking and dynamics
The study of active DNMT3A and DNMT3B inhibitors using molecular docking has helped to explore, at the molecular level, the protein-ligand interactions associated with compound affinity and, in some cases, selectivity.
Induced-fit docking (IFD) has been conducted with DNMT3A, focusing the analysis on the binding site of the cofactor SAH. The results have shown that inhibitors could occupy this binding site [13,29,31,39]. In particular, hydrogen bonds are formed in the catalytic site between ligands and Val754, Phe636, Thr641, Glu660, Val661, Val683, Arg684, Ser704, Cys706, Asn711, Leu726, Glu752, Arg883, Leu884, and Arg887 [29,39]. Some of these interactions resemble those observed between SAH and Arg684, Thr641 and Glu660.
Fahad-Aldawsari et al. performed a docking study of resveratrol analogues with DNMT3B. The authors concluded that the synthetic analogues of the natural product exhibited π-π interactions with Trp889 and Trp834 in DNMT3A and DNMT3B, respectively [30], and hydrogen-bond interactions with the amino acids of the pocket: Ser111, Gly112, Arg157, Arg193, Pro650, Gly697, Arg731, Arg733, Lys828, Gly831, Arg832 and Cys651. In particular, the hydrogen bond with the catalytic Cys651 could prevent the nucleophilic attack of the cysteine on the target cytosine. The authors discussed that these hydrogen bonds play a key role in the stabilization of the protein-ligand complex, which could suppress the function of DNMT3B [30,32,33]. Fahad-Aldawsari et al. showed that, in the presence of the cofactor, resveratrol analogues interact with the catalytic cysteine Cys706, Glu752, and Arg788 in DNMT3A, and with Cys651, Glu697 and Arg733 in DNMT3B. The binding energies calculated with docking for the resveratrol analogues were favorable for DNMT3A and DNMT3B but not for DNMT1. This result was in agreement with the experimentally observed selectivity [30].
The presence of water molecules in the binding site of the cofactor seems to play an important role. To further investigate the role of water molecules, molecular dynamics simulations have been reported, which can help to optimize the selectivity and binding affinity of hit-to-lead molecules against DNMTs. Evans et al. pointed out that the conformational entropy of the protein increases the binding affinity of DNMT for its cofactor, which is assisted by SAM. When the structure of the protein becomes rigid, the entropic contribution of the cofactor decreases [34].
Caulfield et al. reported a molecular dynamics study of the inhibitor nanaomycin A (Fig. 2) with DNMT3B. The study of nanaomycin A was performed in the presence and absence of the cofactor SAM. The energy profiles showed less stability for the complexes without SAM (no interactions with water molecules) than for those simulated in the presence of SAM. The authors also concluded that, in the SAM-DNMT3B complex, the presence of water molecules favors binding of nanaomycin A to the thiol group of Cys651. Nanaomycin A also exhibited interactions with the amino acids Arg731, Arg733, Arg832, and Cys651 [36].
4.3 Virtual screening
Virtual screening of chemical libraries, followed by experimental validation of hit compounds, is a technique commonly used in drug discovery. Virtual screening has been used to identify hit molecules that directly or indirectly have led to the identification of inhibitors of DNMT3A/3B. For instance, Kuck et al. reported a virtual screening of a collection of more than 65,000 compounds from the National Cancer Institute [32]. For that study, the authors used lead-like filters and molecular docking. One of the hit compounds was later used as the starting point of a hit optimization program reported a few years later by Kabro et al. These authors developed two new compounds, 49 and 50 (using the labeling in the original papers), analogues of the virtual screening hit NSC319745 (Fig. 2), as inhibitors of DNMT3A [40].
Maldonado-Rojas et al. performed a virtual screening of natural products in three main steps: (1) QSAR based on Linear Discriminant Analysis (LDA), (2) molecular docking, and (3) cluster analysis. Six natural products with new scaffolds were selected as virtual DNMTi hits: 9,10-dihydro-12-hydroxygambogic acid, phloridzin, 2,4-dihydroxychalcone 4-glucoside, daunorubicin, pyrromycin, and centaurein (Fig. 3). Experimental testing of the hit compounds is warranted [39].
The increasing amount of structural data available for the catalytic domain of human DNMT3A/3B encourages the continued identification of novel inhibitors using structure-based virtual screening of compound databases followed by experimental testing.
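A property-based prefilter of the kind applied before docking can be sketched as follows. This is only an illustration: the cutoff values in `LEAD_LIKE_LIMITS` are assumed for the example and are not the filters used by Kuck et al. [32], the `Compound` record and its property values are invented, and the descriptors are taken as precomputed by some external tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Precomputed properties for one library compound (hypothetical record)."""
    name: str
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int
    rotatable_bonds: int


# Illustrative upper limits (assumed for this sketch, not from the cited study).
LEAD_LIKE_LIMITS = {
    "mol_weight": 350.0,
    "logp": 3.5,
    "h_donors": 3,
    "h_acceptors": 6,
    "rotatable_bonds": 7,
}


def is_lead_like(compound, limits=LEAD_LIKE_LIMITS):
    """Return True when every property is at or below its upper limit."""
    return all(getattr(compound, prop) <= limit for prop, limit in limits.items())


def prefilter(library, limits=LEAD_LIKE_LIMITS):
    """Keep only the compounds that pass the filter, preserving library order."""
    return [c for c in library if is_lead_like(c, limits)]


# Invented property values, for demonstration only.
library = [
    Compound("cpd-1", 310.4, 2.1, 2, 5, 4),
    Compound("cpd-2", 512.7, 5.3, 4, 9, 11),
    Compound("cpd-3", 349.9, 3.5, 3, 6, 7),
]
docking_queue = prefilter(library)
```

Filtering first keeps the comparatively expensive docking step focused on compounds with properties suitable for later optimization.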
4.3.1 Ligand-based virtual screening
Ligand-based virtual screening can be used to select series of molecules similar to compounds that have shown activity against a target. To identify new inhibitors of DNMT3A, Shao et al. performed an analogue search using ligand-based virtual screening of the SPECS database. Compound 40 (using the labeling in the original papers, Fig. 2) was used as reference. Pharmacophore-based mapping and molecular docking were combined to conduct the search. Two compounds, 40_3 and 40_8 (Fig. 2), were identified as effective inhibitors [31].
In an independent study, Méndez-Lucio et al. conducted three-dimensional similarity searching using NSC14778 (Fig. 2) as reference. The authors identified olsalazine (Fig. 3) as an experimentally validated hypomethylating agent with potential to be an inhibitor of DNMT, owing to its high structural similarity to NSC14778 [33]. Enzymatic inhibition assays remain to be conducted to test whether this approved anti-inflammatory drug is selective towards DNMT3A and/or 3B [33].
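The general idea of similarity searching, ranking a library against a reference compound, can be shown with a short sketch. Note the substitution: the study above used three-dimensional similarity, whereas this example uses the simpler Tanimoto coefficient on two-dimensional fingerprints represented as sets of on-bit indices. The fingerprints, compound names, and helper functions are invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def rank_by_similarity(reference_fp, library_fps):
    """Rank library compounds by decreasing similarity to the reference.

    Ties are broken alphabetically by name so the ordering is deterministic.
    """
    scored = [(name, tanimoto(reference_fp, fp)) for name, fp in library_fps.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Invented bit sets; real fingerprints would come from a chemoinformatics toolkit.
reference = {1, 4, 7, 9, 12}
library = {
    "cand-A": {1, 4, 7, 9},
    "cand-B": {2, 3, 5},
    "cand-C": {1, 4, 12, 20},
}
ranking = rank_by_similarity(reference, library)
```

In practice the top-ranked compounds would be inspected and prioritized for experimental testing, as was done for olsalazine.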
4.4 Structure-activity relationships (SAR)
Quantitative approaches have been used to identify descriptors associated with the activity or specificity of compounds. For instance, as commented above, Maldonado-Rojas et al. performed a QSAR study based on LDA to select natural products with putative activity against DNMTs. For that study, the authors selected a total of 47 compounds and divided them into training and test sets. Six different types of molecular descriptors, calculated with the software Dragon 5.5, were included in the model: information index (SIC2), 3D-MoRSE descriptor (Mor13m), 2D autocorrelation (GATS5m), Randić molecular profile (SHP2), topological descriptor (ZM2V), and atom-centred fragment (H-047). The QSAR model had R = 0.94, Q = 9.9, F(6,25) = 16.262 and p < 0.00001 [39]. The model was later used to identify potentially active molecules among new compounds (vide supra). Overall, it is anticipated that the increasing amount of structure-activity data published for DNMT3A/3B (Table 1) can be used to continue developing quantitative models to explore SAR.
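To make the LDA-based classification concrete, the following is a minimal two-class Fisher discriminant written in plain Python. It is a sketch under strong simplifying assumptions and does not reproduce the cited model [39]: it uses two invented descriptors instead of the six Dragon descriptors, invented training data, a pooled covariance matrix, and equal class priors (the decision threshold is placed at the midpoint of the class means).

```python
def mean_vector(rows):
    """Column-wise mean of a list of 2-descriptor tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_covariance(class_a, class_b):
    """Pooled within-class 2x2 covariance matrix of the two classes."""
    dof = len(class_a) + len(class_b) - 2
    cov = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (class_a, class_b):
        mean = mean_vector(rows)
        for r in rows:
            d = [r[0] - mean[0], r[1] - mean[1]]
            for i in range(2):
                for j in range(2):
                    cov[i][j] += d[i] * d[j] / dof
    return cov


def fit_lda(actives, inactives):
    """Return (weights, threshold) of the Fisher discriminant, equal priors."""
    cov = pooled_covariance(actives, inactives)
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    if det == 0:
        raise ValueError("pooled covariance matrix is singular")
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    m_act, m_inact = mean_vector(actives), mean_vector(inactives)
    diff = [m_act[0] - m_inact[0], m_act[1] - m_inact[1]]
    weights = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
               inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    midpoint = [(m_act[0] + m_inact[0]) / 2, (m_act[1] + m_inact[1]) / 2]
    threshold = weights[0] * midpoint[0] + weights[1] * midpoint[1]
    return weights, threshold


def predict_active(x, weights, threshold):
    """Classify a compound as active when its discriminant score exceeds the threshold."""
    return weights[0] * x[0] + weights[1] * x[1] > threshold


# Invented descriptor values for a toy training set.
actives = [(2.0, 3.0), (2.5, 3.5), (3.0, 2.8), (2.2, 3.9)]
inactives = [(0.5, 1.0), (1.0, 0.6), (0.8, 1.4), (1.2, 1.1)]
weights, threshold = fit_lda(actives, inactives)
```

A fitted discriminant of this kind can then be applied to the descriptors of untested compounds, which is how such a model serves as the first step of a virtual screening workflow.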
5. Conclusions and future directions
In the past few years, there has been a significant upsurge in information correlating the structure and function(s) of DNMT3 enzymes with the structure-activity relationships that define the potency and selectivity of DNMT3A and DNMT3B inhibitors. Some of this information is publicly available in several databases, including the Protein Data Bank, small-molecule databases such as ChEMBL and Binding Database, and epigenetic-specific databases such as ChromoHub and the Human Epigenetic Enzyme and Modulator Database. Consequently, the complexity associated with the analysis of large sets of data on the wide variety of DNMT3A/3B inhibitors requires the chemoinformatic characterization of the chemical space, including the quantification of its structural diversity. This effort will continue to expand the characterization of the ERCS that has already been started [25]. Structural data on DNMT3 are a key component in the computer-assisted design of inhibitors of DNMT3A/B. The combination of computational screening with medicinal chemistry-guided optimization and experimental validation has led to the identification of novel and specific inhibitors of DNMT3. These inhibitors may be promising candidates for the therapeutic treatment of a number of diseases, including cancer, or as molecular probes. It is anticipated that the increasing amount of SAR data will speed up the development of new and more specific inhibitors. It is also expected that the increasing amount of SAR data will favor the development of qualitative and quantitative models such as activity landscape models and predictive QSAR models.