Celebrating Prof. Víctor M. Loyola Vargas's career.
In our work, we preserve the vigour and passion with which you taught us to live in the laboratory.
Introduction
The plant family Cactaceae comprises around 1400 succulent and non-succulent species distributed throughout the entire American continent [1]. In Mexico, more than 150 species of cacti are used by indigenous people, and at least 50 of them are cultivated [2]. Some species, such as Opuntia ficus-indica, Hylocereus undatus, and Acanthocereus tetragonus, stand out for their economic importance as producers of food, forage, and dyes [3-6]. Recently, there has been increasing interest in searching for metabolites in some cactus species in order to apply that knowledge in industry. The analysis of methanolic extracts of raw and cooked stems of A. tetragonus, using ultra-high-performance liquid chromatography coupled with high-resolution mass spectrometry, indicated the presence of carboxylic acids such as threonic, citric, and malic acids, phenolic acids, and glycosylated flavonoids (luteolin-O-rutinoside). The proportions of citric acid found among these compounds are noteworthy because this organic acid is in high demand in the food and beverage, pharmaceutical, and personal care industries [5]. HPLC analysis of three Opuntia species (O. ficus-indica, O. undulata, and O. stricta) detected ascorbic acid, flavonoids (quercetin, isorhamnetin, myricetin, kaempferol, and luteolin), betalains, taurine, total carotenoids, and total phenolics. Another study showed the presence of flavonoids in four cultivars of O. ficus-indica [6]. The family Cactaceae has been little explored at the genomic and transcriptomic levels despite the proven commercial importance of some of its species. Transcriptomic studies have been reported for Melocactus glaucescens, Mammillaria bombycina, and Lophophora williamsii. The objectives of these studies were to discover metabolic changes during in vitro shoot organogenesis induction [7], to study responses to abiotic stress [8], and to search for genes putatively involved in mescaline biosynthesis [9].
Bioinformatic analysis is a tool that could allow us to predict the presence of protein-coding genes, including those encoding enzymes, and to determine how these genes are functionally related in the different metabolic pathways of an organism. Using this strategy, we may be able to identify genes putatively involved in biosynthetic pathways that produce metabolites of industrial importance and, in the future, to improve the production of compounds of commercial interest through genetic engineering strategies [10,11].
Because of the local economic and ecological importance of the family Cactaceae in Mexico and the American continent, its species could be excellent models for searching for genes encoding enzymes involved in secondary metabolism. However, genes involved in secondary metabolism in cacti remain largely unexplored. Public databases such as that of the National Center for Biotechnology Information (NCBI) contain little information about genes annotated and identified in members of the family Cactaceae. Nevertheless, the genomes of Carnegiea gigantea, Cereus fernambucensis, Lophocereus schottii, Pachycereus pringlei, Pereskia humboldtii, Selenicereus undatus, and Stenocereus thurberi have been sequenced [12] and can be explored to identify genes putatively involved in the secondary metabolism of cacti. Cacti are ecologically important because of their ability to survive in extreme environments, at high temperatures, in areas with little rainfall, and on arid soils [13]. CAM metabolism and C3 or C4 carbon fixation are known to be present in members of the cactus family [1]. The seven cactus species that we analyzed in this work have recently received attention as sources of bioactive compounds [14,15]. Cereus jamacaru contains a large diversity of biological compounds, including alkaloids, steroids, triterpenes, glycosides, oils, and waxes used by the pharmaceutical industry [14]. Lophocereus schottii is used in Mexican traditional medicine to treat cancer. The polar fraction of its ethanolic extract contains flavonoids, alkaloids, terpenoids, and sterols with a proven cytotoxic effect on L5178Y cells and on splenocytes stimulated with the mitogen concanavalin A (ConA) [15]. In Central America and Mexico, the fruits of Hylocereus undatus, H. costaricensis, H. polyrhizus, and Selenicereus megalanthus (known as dragon fruit, or pitahayas in Mexico) are traded and consumed [16].
Studies aimed at identifying bioactive compounds in this genus have indicated that the crude ethanolic extract of the fruit of Selenicereus undatus helps to normalize glucose homeostasis in mice, suggesting that the active compounds may be helpful in the management of diabetes mellitus [17]. A study comparing the metabolomic profiles of Hylocereus polyrhizus and Hylocereus undatus emphasized the value of residual pitahaya peels as raw material for the pharmaceutical and food industries, owing to the presence of active compounds (L-tyrosine, L-valine, DL-norvaline, tryptophan, γ-linolenic acid, and isorhamnetin-3-O-neohesperidoside) [16]. Alkaloids are one of the most commercially successful groups of molecules because of their antioxidant, antibacterial, antiparasitic, insecticidal, anticorrosive, and antiplasmodial activities [18,19]. Regarding genomic studies of cacti aimed at identifying genes encoding enzymes related to metabolic pathways, only two reports are available. In the first, the H. undatus genome was annotated and compared with that of Carnegiea gigantea, and nearly 29,000 protein-coding genes were found to be similar between the two genomes [12]. The second identified genes coding for enzymes involved in the betacyanin biosynthetic pathway, which in the future might be used in genetic engineering to improve fruit colour in species with white (Hylocereus undatus) or yellow (Selenicereus megalanthus) pulp [9], and identified the genes involved in mescaline biosynthesis using de novo sequencing and transcriptome analysis in Lophophora williamsii.
Our review of the literature revealed that cactus genomes have not been extensively explored to identify genes involved in secondary metabolism. Therefore, in this research we annotated genes putatively encoding enzymes involved in secondary metabolism in Carnegiea gigantea, Cereus fernambucensis, Lophocereus schottii, Pachycereus pringlei, Pereskia humboldtii, Selenicereus undatus, and Stenocereus thurberi.
Experimental
Materials and methods
Phylogenetic analysis of the family Cactaceae
To infer the evolutionary relationships between members of the family Cactaceae, we analyzed 25 partial amino acid (aa) sequences of the large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (EC 4.1.1.39; RuBisco). The evolutionary history was inferred using the Minimum Evolution (ME) method [20]. Evolutionary analyses were performed in MEGA11 [21]. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown below the branches [22]. Evolutionary distances were calculated using the Poisson correction method [23] and are expressed as the number of aa substitutions per site. The ME tree was searched using the Close-Neighbor-Interchange (CNI) algorithm at a search level of 1 [24]. The neighbor-joining algorithm [25] was used to generate the initial tree. All ambiguous positions were removed for each pair of sequences (pairwise deletion option). There was a total of 460 positions in the final data set.
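As an illustration of the distance step, the Poisson correction converts the proportion p of differing aa sites between two aligned sequences into d = -ln(1 - p) substitutions per site. The Python sketch below is not the MEGA11 implementation; the function name, the set of characters treated as ambiguous, and the toy sequences are assumptions made only for this example.

```python
import math

# Assumption: gaps, missing data and unknown residues count as ambiguous.
AMBIGUOUS = {"-", "?", "X"}


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p) for two aligned aa sequences.

    Positions that are ambiguous in either sequence are skipped for this
    pair only, which mirrors the pairwise deletion option.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in AMBIGUOUS or b in AMBIGUOUS:
            continue  # pairwise deletion of this site
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions between the two sequences")
    p = differing / compared  # proportion of differing sites
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)


# Toy aligned fragments (hypothetical, not taken from the data set).
d = poisson_distance("MSPQ-ETKAS", "MSPQTETKAT")
```

In this toy pair, nine sites are compared and one differs, so p = 1/9 and d is slightly larger than p, reflecting the correction for multiple substitutions at the same site.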
Assembly and annotation of the genome of Cereus fernambucensis
The genome of Cereus fernambucensis was obtained from the NCBI database (https://www.ncbi.nlm.nih.gov/data-hub/genome/?taxon=308225). The set of sequences was generated with DRAM, and the annotations were made with the DomainAnnotation v1.0 program, which identifies protein domains from domain libraries. For this purpose, the C. fernambucensis genome was compared against all the domain libraries of the Department of Energy Systems Biology Knowledgebase (KBase). The domain libraries in KBase include: 1. COGs (Clusters of Orthologous Groups), from the NCBI Conserved Domain Database (CDD). 2. NCBI CDD models from the NCBI Conserved Domain Database version 3.16 [26]; this library contains only domains curated by NCBI (including "sd" structural motif models). 3. SMART (Simple Modular Architecture Research Tool) version 6.0 [27], from CDD. 4. PRK (Protein Clusters) version 6.0, from CDD. 5. Pfam version 31.0 hidden Markov models (HMM) [28]. 6. TIGRFAMs version 15.0 HMM [29,30], from the J. Craig Venter Institute. 7. NCBIfam version 1.1 HMM. For the first four libraries (COG, CDD, SMART, and PRK), KBase runs RPS-BLAST version 2.2.31 [31] of the BLAST+ package at NCBI [32], reporting all domain hits with an E-value of 10^-4 or better. For the three HMM libraries (Pfam, TIGRFAMs, and NCBIfam), KBase runs HMMER version 3.1b2, reporting all domain hits at least as significant as the family-specific significance threshold set by the curators of each model.
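The E-value criterion used for the RPS-BLAST libraries can be sketched as a simple filter. The Python example below assumes hits exported in the standard 12-column tabular BLAST+ layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score); the internal format used by KBase is not described here, and the protein and domain identifiers are hypothetical.

```python
from typing import Iterable, Iterator, List, NamedTuple

E_VALUE_CUTOFF = 1e-4  # threshold stated in the text: 10^-4 or better


class DomainHit(NamedTuple):
    query: str     # protein identifier
    domain: str    # domain model identifier
    evalue: float  # expectation value of the hit


def parse_hits(lines: Iterable[str]) -> Iterator[DomainHit]:
    """Parse tab-separated hits; the E-value is assumed to be column 11."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        yield DomainHit(fields[0], fields[1], float(fields[10]))


def significant_hits(lines: Iterable[str],
                     cutoff: float = E_VALUE_CUTOFF) -> List[DomainHit]:
    """Keep hits whose E-value is at or below the cutoff."""
    return [hit for hit in parse_hits(lines) if hit.evalue <= cutoff]


# Hypothetical records, for illustration only.
example = [
    "prot_0001\tcd00684\t41.2\t250\t140\t3\t10\t259\t5\t252\t3e-45\t160",
    "prot_0002\tcd03264\t28.0\t90\t60\t2\t1\t90\t12\t101\t0.02\t35.1",
]
kept = significant_hits(example)  # only the first record passes the cutoff
```

A lower E-value means a hit is less likely to arise by chance, so "or better" is implemented here as "less than or equal to" the cutoff.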
Identification of protein domains involved in the secondary metabolism of members of the family Cactaceae
From the annotation of the genome of C. fernambucensis (WGS accession JALPLW01 and GenBank GCA_024363205.1), the enzymes involved in plant secondary metabolism were identified and classified, according to their biosynthetic origin, into seven functional groups: alkaloids and nitrogenous compounds, flavonoids, terpenes, phenolic compounds, shikimate pathway, ABC-type transporters, and proteins involved in defense against plant pathogens. To infer the evolutionary relationships of different enzymes in each functional group, the genomes of C. fernambucensis, Selenicereus undatus, Stenocereus thurberi, Lophocereus schottii, Pachycereus pringlei, Pereskia humboldtii, and Carnegiea gigantea, together with other accessions of eudicots and flowering plants from the NCBI database, were compared with the following enzymes: plant PDR-type ABC transporter family protein (PEN3) from Arabidopsis thaliana, aspartic proteinase-like protein, flavanone 3-hydroxylase (F3H), hydroxycinnamoyl-CoA shikimate quinate hydroxycinnamoyl transferase (HCT), serine threonine-protein phosphatase, and sterol methyltransferase (SMT1). Dendrograms were constructed in MEGA11 with the Minimum Evolution method using the aa sequences of the structural domains of the proteins of interest. For each analysis, the optimal tree is shown, and the percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches.
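The bootstrap percentages reported next to the branches come from rebuilding the tree on resampled alignments. The Python sketch below shows only the resampling and counting steps, under stated assumptions: alignment columns are drawn with replacement, and a tree-building step (not shown) records whether a given clade was recovered in each replicate. The function names and the toy alignment are hypothetical and do not reproduce the MEGA11 code.

```python
import random
from typing import Dict, List


def resample_columns(alignment: Dict[str, str],
                     rng: random.Random) -> Dict[str, str]:
    """Build one bootstrap replicate by sampling columns with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    # The same column indices are applied to every taxon, so each
    # resampled site is an intact column of the original alignment.
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks)
            for name, seq in alignment.items()}


def support_percent(clade_found: List[bool]) -> float:
    """Percentage of replicates in which the clade of interest was recovered."""
    if not clade_found:
        raise ValueError("at least one replicate is required")
    return 100.0 * sum(clade_found) / len(clade_found)


# Toy alignment (hypothetical sequences, for illustration only).
toy = {"taxon_A": "MSPQTE", "taxon_B": "MSPQSE", "taxon_C": "MAPQTD"}
replicate = resample_columns(toy, random.Random(0))
```

In the analysis described above this would be repeated 1000 times, with a tree inferred from each replicate before the clade counts are converted to percentages.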
Tertiary structure of enzymes in members of the Cactaceae family
The tertiary structure of flavanone 3-hydroxylase (F3H) was modeled from the enzyme's nucleotide and aa sequences in SWISS-MODEL (https://swissmodel.expasy.org/) [33], considering the sequence identity (%), Global Model Quality Estimate (GMQE), and Qualitative Model Energy Analysis (QMEAN) values. Ten plant accessions (Gossypium arboreum (XP_017638014.1), Vernicia fordii (ARV78456.1), Hibiscus sabdariffa (ALB35017.1), Durio zibethinus (XP_022741107.1), Althaea officinalis (UOI87842.1), Dimocarpus longan (ABO48521.1), Arachis hypogaea (XP_025694592.1), Boehmeria nivea (QBC98316.1), Ziziphus jujuba (XP_015889639.1), and Litchi chinensis (ADO95201.1)) were included for the comparative analysis of the 3D conformation and the effect of aa substitutions in the 2-oxoglutarate iron-dependent domain. The three-dimensional model was built on the Unipro UGENE v. 45.0 platform from the consensus sequence of the regions obtained in the analysis of the genomes of accessions identified as members of the family Cactaceae, and it was compared with the reference model described in [34] for anthocyanidin synthase 1gp4.1A in Arabidopsis thaliana.
Results and discussion
Phylogenetic relationships in the family Cactaceae
Among the genera in the family Cactaceae that we analyzed in this work, the genus Cereus had the highest number of accepted and registered taxa with 187 species, 8 subspecies, 12 varieties, and 10 hybrids, followed by Selenicereus (36 species and 6 subspecies), Stenocereus (25 species and 2 subspecies), Pereskia (19 species and 4 subspecies), and, with fewer recorded species, Lophocereus, Pachycereus, and Carnegiea with 7, 4 and 2 species, respectively [35].
In the NCBI database, seven Cactaceae genomes are available: JALPLW01 (Cereus fernambucensis), JACYFF01 (Selenicereus undatus), NCQT01 (Stenocereus thurberi), NCQV01 (Lophocereus schottii), NCQS01 (Pachycereus pringlei), NCQU01 (Pereskia humboldtii), and NCQR01 (Carnegiea gigantea). To gain insight into the evolutionary relationships of these species, a phylogenetic analysis was performed using a conserved domain of the large subunit of the enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisco), which is involved in the fixation of carbon into an organic form (glucose) during the Calvin cycle and in photorespiration. RuBisco catalyzes the initial steps of photosynthetic carbon reduction and photorespiratory carbon oxidation [36,37]. Land plants have a hexadecameric type I RuBisco with an approximate molecular mass of 550 kDa, composed of eight large (L; 55 kDa) and eight small (S; 15 kDa) subunits (L8S8). In A. thaliana, four distinct small-subunit isoforms of RuBisco are known (RbcS1A, RbcS1B, RbcS2B, and RbcS3B) [38].
Our analysis of the phylogenetic relationships of 25 species in the family Cactaceae indicated the formation of two clades. The first grouped Pereskia humboldtii (NCQU01.1), Pereskia aculeata (YP010365471.1), Pachycereus pringlei (NCQS01079793.1), Pereskiopsis diguetii (AFB70629.1), Maihuenia poeppigii (AFB70626.1), and Opuntia sp. (UEK25784.1, QJT42891.1). The second included Cereus fernambucensis (JALPLW010162134.1), Selenicereus undatus (YP010023660.1), Lophocereus schottii (YP009590644.1), Carnegiea gigantea (AKR06847.1), and Stenocereus thurberi (UXN84199.1), which grouped together with species of Ferocactus (QTI91393.1, QTI91309.1), Mammillaria (QFG71193.1, QFG71129.1), and Rhipsalis (YP010015700, YP010166122.1). Within this group, we observed that Ferocactus setispinus and Ferocactus latispinus are relatively young species compared to the other accessions in this clade (Fig. 1). The genus Pereskia (Cactaceae) comprises 19 species. Recent phylogenetic work has shown that the species in this genus group into two separate clades, which are basal and paraphyletic with respect to the rest of the cacti [39]; it is therefore possible that members of Pereskia, namely Pereskia humboldtii (NCQU01.1) and Pereskia aculeata (YP010365471.1), form an independent clade of more closely related cacti. Interestingly, Pereskia appears to be more closely related to species of Opuntia (O. sulphurea and O. quimilo). In the other clade, most of the genomes analyzed in this work (Cereus fernambucensis (JALPLW010162134.1), Selenicereus undatus (YP010023660.1), Lophocereus schottii (YP009590644.1), Carnegiea gigantea (AKR06847.1), and Stenocereus thurberi (UXN84199.1)) grouped more closely (Fig. 1).
There are few investigations in which the phylogenetic relationships of cacti are based on specific regions of constitutive genes such as RuBisco. These genes are conserved at the genomic and structural levels and allow us to observe groupings associated with phylogeny in various plant genera with C3 and C4 carbon fixation [40, 41]. A study in which the phylogeny of the genus Leptocereus (Cactaceae) was reconstructed using chloroplast markers (trnL-F, trnQ-rps16, psbA-trnH, petL-psbE, and rpl16) indicated that the genus Leptocereus has a paraphyletic origin and is phylogenetically distant from some members of the genus Selenicereus (S. grandiflorus and S. calcaratus), which indicates that these taxa are not closely related [42].
One of the disadvantages of this type of molecular analysis is the scant information deposited in public databases, together with the difficulty of identifying, amplifying, and sequencing specific fragments of the genomes of species belonging to the genera of the family Cactaceae. However, given the advances in molecular techniques and sequencing platforms, it is important to direct studies of phylogenetic reconstruction in cacti toward molecular markers. In this study we show that RuBisco could be used as a molecular marker for cacti phylogeny in future work that includes a greater number of representative species of each genus in the family Cactaceae.
Enzymes involved in the biosynthesis of secondary metabolites in the genome of C. fernambucensis
The genome of C. fernambucensis (WGS accession JALPLW01) was assembled on the KBase platform (N50, 489.3 Mb), with 177,285 contigs and 298,226 CDS. The annotation scanned 298,226 protein sequences for 5,892,916 characteristic k-mers, found 19,252 of them, and predicted 373 enzyme functions for 1,773 protein sequences. This result indicates that, for this set of protein sequences, 73 % of the enzymatic functions of the primary metabolism of plants were detected (see Table 1, Supplementary Fig. 1). Among the enzymes involved in secondary metabolism, we identified the domains cd07816, smart01037, and cd10810 of the family of cytoplasmic pathogenesis-related proteins (PR-10), which is very widespread among dicotyledonous plants and includes norcoclaurine synthases (NCS), cytokinin-binding proteins (CSBPs), major latex proteins (MLPs), and ripening-related proteins. We also identified CDS of aldehyde reductase, flavonoid reductase of the extended SDR type, and related proteins (cd05193, cd08958); the related flavonoid reductases act in the NADP-dependent reduction of flavonoids, which are ketone-containing plant secondary metabolites. In addition, we identified terpene cyclases of the class 1 superfamily of isoprenoid biosynthesis enzymes (cd00868) and a diverse group of monomeric plant terpene cyclases (Tspa-Tspf) that convert the acyclic isoprenoid diphosphates geranyl diphosphate (GPP), farnesyl diphosphate (FPP), or geranylgeranyl diphosphate (GGPP) into cyclic monoterpenes, sesquiterpenes, or diterpenes, respectively (cd00684).
We also found CDS of enzymes involved in the biosynthesis of phenolic compounds, such as laccase (cd13897, cd13849), a blue multicopper oxidase (MCO) that catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances coupled to the reduction of molecular oxygen to water. CDS of enzymes of the shikimic acid pathway were also found (cd00502, COG0169, COG0703, cd01065, PF01488.19); this pathway is of great relevance in the biosynthesis of the three aromatic amino acids (phenylalanine, tyrosine, and tryptophan) and of a wide range of secondary metabolites. Interestingly, a large number of ABC transport system domains were found (cd03264, cd03235, cd03246, cd03297, cd03300, COG1127, cd03369, cd02079, cd00994, cd02076, cd01431, cd03227, cd06340, cd06349). ABC transporters are a large family of proteins involved in the transport of a wide variety of compounds, such as sugars, ions, peptides, and more complex organic molecules. This family includes transporters involved in the uptake of various metallic cations such as iron, manganese, and zinc.
Secondary metabolite route | Domain | Conserved Protein Domain Family | Description |
Alkaloids and nitrogenous compounds | cd10810 | Pathogenesis-related protein Bet v I family. | Pathogenesis-related proteins PR-10: These proteins were identified as major tree pollen allergens in birch and related species (hazel, alder), as plant food allergens expressed in high levels in fruits, vegetables and seeds (apple, celery, hazelnut), and as pathogenesis-related proteins whose expression is induced by pathogen infection, wounding, or abiotic stress. Hyp-1 (Q8H1L1), an enzyme involved in the synthesis of the bioactive naphthodianthrone hypericin in St. John's wort ( Hypericum perforatum ) also belongs to this family. Most of these proteins were found in dicotyledonous plants. In addition, related sequences were identified in monocots and conifers. Cytokinin-specific binding proteins: These legume proteins bind cytokinin plant hormones [(PUBMED:9874249)]. (S)-Norcoclaurine synthases are enzymes catalyzing the condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine, the first committed step in the biosynthesis of benzylisoquinoline alkaloids such as morphine (PUBMED:15447655). Major latex proteins and ripening-related proteins are proteins of unknown biological function that were first discovered in the latex of opium poppy ( Papaver somniferum ) and later found to be upregulated during ripening of fruits such as strawberry and cucumber [ (PUBMED:15447655)]. The occurrence of Bet v 1-related proteins is confined to seed plants with the exception of a cytokinin-binding protein from the moss Physcomitrella patens (Q9AXI3). |
cd07816 | Ligand-binding bet_v_1 domain of major pollen allergen of white birch ( Betula verrucosa ), Bet v 1, and related proteins. | This family includes the ligand binding domain of Bet v 1 (the major pollen allergen of white birch, Betula verrucosa ) and related proteins. In addition to birch Bet v 1, this family includes other plant intracellular pathogenesis-related class 10 (PR-10) proteins, norcoclaurine synthases (NCSs), cytokinin binding proteins (CSBPs), major latex proteins (MLPs), and ripening-related proteins. | |
Flavonoids | cd05193 | Aldehyde reductase, flavonoid reductase, and related proteins, extended (e) the serine-aspartate repeat protein family (SDRs). | This subgroup contains aldehyde reductase and flavonoid reductase of the extended SDR-type and related proteins. The related flavonoid reductases act in the NADP-dependent reduction of flavonoids, ketone-containing plant secondary metabolites. Extended SDRs are distinct from classical SDRs. |
cd08958 | Flavonoid reductase (FR), extended (e) SDRs. | This subgroup contains FRs of the extended SDR-type and related proteins. These FRs act in the NADP-dependent reduction of flavonoids, ketone-containing plant secondary metabolites; they have the characteristic active site triad of the SDRs (though not the upstream active site Asn) and a NADP-binding motif that is very similar to the typical extended SDR motif. | |
Terpenes | cd00684 | Plant terpene cyclases, class 1. | This CD includes a diverse group of monomeric plant terpene cyclases (Tspa-Tspf) that convert the acyclic isoprenoid diphosphates, geranyl diphosphate (GPP), farnesyl diphosphate (FPP), or geranylgeranyl diphosphate (GGPP) into cyclic monoterpenes, diterpenes, or sesquiterpenes, respectively; a few form acyclic species. |
cd00868 | Terpene cyclases, class 1. | Terpene cyclases, class 1 (C1) of the class 1 family of isoprenoid biosynthesis enzymes, which share the 'isoprenoid synthase fold' and convert linear, all-trans, isoprenoids, geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate into numerous cyclic forms of monoterpenes, diterpenes, and sesquiterpenes. Also included in this CD are the cis-trans terpene cyclases such as trichodiene synthase. | |
cd02892 | Squalene cyclase (SQCY) domain subgroup 1. | Found in class II terpene cyclases that have an alpha 6 - alpha 6 barrel fold. Squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY) are integral membrane proteins that catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. | |
cd00385 | Isoprenoid biosynthesis enzymes, class 1. | Superfamily of trans-isoprenyl diphosphate synthases (IPPS) and class I terpene cyclases which either synthesis geranyl/farnesyl diphosphates (GPP/FPP) or longer chained products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, and diterpenes. | |
cd00867 | Trans-Isoprenyl Diphosphate Synthases. | ||
cd00685 | Trans-Isoprenyl Diphosphate Synthases, head-to-tail. | These trans-isoprenyl diphosphate synthases (trans_IPPS) catalyze head-to-tail (HT) (1'-4) condensation reactions. This CD includes all-trans (E)-isoprenyl diphosphate synthases which synthesize various chain lengths (C10, C15, C20, C25, C30, C35, C40, C45, and C50) linear isoprenyl diphosphates from precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). Farnesyl diphosphate synthases produce the precursors of steroids, cholesterol, sesquiterpenes, farnsylated proteins, heme, and vitamin K12; and geranylgeranyl diphosphate and longer chain synthases produce the precursors of carotenoids, retinoids, diterpenes, geranylgeranylated chlorophylls, ubiquinone, and archaeal ether linked lipids. | |
Phenolic compounds | cd13897 | The third cupredoxin domain of the plant laccases. | Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety aromatic - notably phenolic and inorganic substances coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. |
cd13849 | The first cupredoxin domain of plant laccases. | ||
Shikimate pathway | cd00502 | Type I 3-dehydroquinase, (3-dehydroquinate dehydratase or DHQase). | Dehydroquinase is the third enzyme in the shikimate pathway, which is involved in the biosynthesis of aromatic amino acids. Type I DHQase exists as a homodimer. |
COG0169 | Shikimate 5-dehydrogenase. | Amino acid transport and metabolism. | |
COG0703 | Shikimate kinase. | ||
cd05213 | NADP-binding domain of glutamyl-tRNA reductase. | Glutamyl-tRNA reductase catalyzes the conversion of glutamyl-tRNA to glutamate-1-semialdehyde, initiating the synthesis of tetrapyrrole. Whereas tRNAs are generally associated with peptide bond formation in protein translation, here the tRNA activates glutamate in the initiation of tetrapyrrole biosynthesis. | |
cd01065 | NAD(P) binding domain of Shikimate dehydrogenase. | Shikimate dehydrogenase (DH) is an amino acid DH family member. Shikimate pathway links metabolism of carbohydrates to de novo biosynthesis of aromatic amino acids, quinones and folate. | |
PF01488.19 | Shikimate / quinate 5-dehydrogenase. | This family contains both shikimate and quinate dehydrogenases. Shikimate 5-dehydrogenase catalyses the conversion of shikimate to 5-dehydroshikimate. This reaction is part of the shikimate pathway which is involved in the biosynthesis of aromatic amino acids. Quinate 5-dehydrogenase catalyses the conversion of quinate to 5-dehydroquinate. This reaction is part of the quinate pathway where quinic acid is exploited as a source of carbon. Both the shikimate and quinate pathways share two common pathway metabolites 3-dehydroquinate and dehydroshikimate. | |
cd05312 | NAD(P) binding domain of malic enzyme (ME), subgroup 1. | Malic enzyme (ME), a member of the amino acid dehydrogenase (DH)-like domain family, catalyzes the oxidative decarboxylation of L-malate to pyruvate in the presence of cations (typically Mg++ or Mn++) with the concomitant reduction of cofactor NAD+ or NADP+. ME has been found in all organisms and plays important roles in diverse metabolic pathways such as photosynthesis and lipogenesis. This enzyme generally forms homotetramers. | |
cd01079 | NAD binding domain of methylene-tetrahydrofolate dehydrogenase. | The NAD-binding domain of methylene-tetrahydrofolate dehydrogenase (m-THF DH). M-THF is a versatile carrier of activated one-carbon units. | |
cd05212 | NAD(P) binding domain of methylene-tetrahydrofolate dehydrogenase and methylene-tetrahydrofolate dehydrogenase/cyclohydrolase. | ||
Plant transporters | cd03264 | ABC-type multidrug transport system, ATPase component. | The biological function of this family is not well characterized, but display ABC domains similar to members of ABCA subfamily. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. |
cd03235 | ATP-binding cassette domain of the metal-type transporters. | This family includes transporters involved in the uptake of various metallic cations such as iron, manganese, and zinc. The ATPases of this group of transporters are very similar to members of iron-siderophore uptake family suggesting that they share a common ancestor. | |
cd03246 | ATP-binding cassette domain of PrtD, subfamily C. | This family represents the ABC component of the protease secretion system PrtD, a 60-kDa integral membrane protein sharing 37% identity with HlyB, the ABC component of the alpha-hemolysin secretion pathway, in the C-terminal domain. They export degradative enzymes by using a type I protein secretion system and lack an N-terminal signal peptide but contain a C-terminal secretion signal. | |
cd03297 | ATP-binding cassette domain of the molybdenum transport system. | ModC is an ABC-type transporter and the ATPase component of a molybdate transport system that also includes the periplasmic binding protein ModA and the membrane protein ModB. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides and more complex organic molecules. | |
cd03300 | ATP-binding cassette domain of the polyamine transporter. | PotA is an ABC-type transporter and the ATPase component of the spermidine/putrescine-preferential uptake system consisting of PotA, -B, -C, and -D. PotA has two domains with the N-terminal domain containing the ATPase activity and the residues required for homodimerization with PotA and heterdimerization with PotB. | |
COG1127 | ABC-type transporter Mla maintaining outer membrane lipid asymmetry, ATPase component MlaF. | Cell wall/membrane/envelope biogenesis. | |
cd03369 | ATP-binding cassette domain 2 of NFT1, subfamily C. | Domain 2 of NFT1 (new full-length MRP-type transporter 1). NFT1 belongs to the MRP (multidrug resistance-associated protein) family of ABC transporters. | |
cd02079 | P-type heavy metal-transporting ATPase. | Heavy metal-transporting ATPases (Type IB ATPases) transport heavy metal ions (Cu(+), Cu(2+), Zn(2+), Cd(2+), Co(2+), etc.) across biological membranes. | |
cd00994 | Glutamine binding domain of ABC-type transporter; the type 2 periplasmic binding protein fold. | This periplasmic substrate-binding component serves as an initial receptor in the ABC transport of glutamine. GlnH belongs to the type 2 periplasmic-binding fold protein (PBP2) superfamily, whose members are involved in chemotaxis and uptake of nutrients and other small molecules from the extracellular space as a primary receptor. | |
cd02076 | Plant and fungal plasma membrane H(+)-ATPases. | This subfamily includes eukaryotic plasma membrane H(+)-ATPase which transports H(+) from the cytosol to the extracellular space, thus energizing the plasma membrane for the uptake of ions and nutrients. | |
cd01431 | ATP-dependent membrane-bound cation and aminophospholipid transporters. | The P-type ATPases, are a large family of integral membrane transporters that are of critical importance in all kingdoms of life. They generate and maintain (electro-) chemical gradients across cellular membranes, by translocating cations, heavy metals and lipids. | |
cd03227 | ATP-binding cassette domain of non-transporter proteins. | ABC-type class 2 contains systems involved in cellular processes other than transport. These families are characterized by the fact that the ABC subunit is made up of duplicated, fused ABC modules (ABC2). | |
cd06340, cd06349 | ABC transporter periplasmic binding domain, type 1 (ATPase Binding Cassette). | This subgroup includes the type 1 periplasmic ligand-binding domain of uncharacterized ABC (ATPase Binding Cassette)-type active transport systems that are predicted to be involved in transport of amino acids, peptides, or inorganic ions. | |
Defense against plant pathogens | cd05489 | TAXI-I inhibits degradation of xylan in the cell wall. | Xylanase inhibitor-I (TAXI-I) is a member of the potent TAXI-type inhibitors of fungal and bacterial family 11 xylanases. Plants developed a diverse battery of defense mechanisms in response to continual challenges by a broad spectrum of pathogenic microorganisms. Their defense arsenal includes inhibitors of cell wall-degrading enzymes, which hinder a possible invasion and colonization by antagonists. |
cd01960 | nsLTP1: Non-specific lipid-transfer protein type 1 (nsLTP1) subfamily. | Plant nsLTPs are small, soluble proteins that facilitate the transfer of fatty acids, phospholipids, glycolipids, and steroids between membranes. In addition to lipid transport and assembly, nsLTPs also play a key role in the defense of plants against pathogens. They may be involved in the formation of cutin layers on plant surfaces by transporting cutin monomers. | |
cd01751 | PLAT/LH2 domain of plant lipoxygenase related proteins. | Lipoxygenases are nonheme, nonsulfur iron dioxygenases that act on lipid substrates containing one or more (Z,Z)-1,4-pentadiene moieties. In plants, the immediate products are involved in defense mechanisms against pathogens and may be precursors of metabolic regulators. The generally proposed function of PLAT/LH2 domains is to mediate interaction with lipids or membrane-bound proteins. | |
smart01037 | Pathogenesis-related protein Bet v I family. | This family is named after Bet v 1, the major birch pollen allergen. This protein belongs to family 10 of plant pathogenesis-related proteins (PR-10), cytoplasmic proteins of 15-17 kDa that are widespread among dicotyledonous plants (PUBMED:9417891). |
This study included a search for conserved protein domain families involved in defense against plant pathogens. Within this functional group, we identified (i) xylanase inhibitor I (TAXI-I; cd05489), a member of the potent TAXI-type inhibitors of fungal and bacterial family 11 xylanases; (ii) plant nsLTP proteins (cd01960), which facilitate the transfer of fatty acids, phospholipids, glycolipids, and steroids between membranes, may participate in the formation of cutin layers on plant surfaces by transporting cutin monomers, and play a key role in the defense of plants against pathogens; and (iii) the PLAT/LH2 domain of lipoxygenase-related proteins (cd01751), whose immediate products in plants are involved in defense mechanisms against pathogens and might be precursors of metabolic regulators.
Enzymes involved in the secondary metabolism of species in the family Cactaceae in relation to species in other plant classes and families
Our analysis of the evolutionary relationships of enzymes involved in secondary metabolism in the analyzed Cactaceae genomes showed high conservation of the aspartic proteinase-like protein (Fig. 2), flavanone 3-hydroxylase (F3H) (Fig. 3), hydroxycinnamoyl-CoA shikimate quinate hydroxycinnamoyl transferase (HCT) (Fig. 4), and serine/threonine-protein phosphatase (Fig. 5) in Cereus fernambucensis (JALPLW01.1), Selenicereus undatus (JACYFF01.1), Stenocereus thurberi (NCQT01.1), Lophocereus schottii (NCQV01.1), Pachycereus pringlei (NCQS01.1), Pereskia humboldtii (NCQU01.1), and Carnegiea gigantea (NCQR01.1). In Cereus fernambucensis, we found divergence in PEN3 (Fig. 6) and the absence of sterol methyltransferase (SMT1; Fig. 7). To examine the conservation of F3H (Fig. 3) in more detail, we modeled the three-dimensional structures of the translated amino acid (aa) sequences identified in the cactus genomes and compared them with those of other plant accessions (Fig. 8). A dimer modeled from a consensus sequence of the regions obtained from the Cactaceae genomes (Fig. 8(B)) showed 88 % identity relative to the reference model 1gp4.1A, the anthocyanidin synthase described in Arabidopsis thaliana (Fig. 8(A)) [34]. Tertiary structure modeling of 10 plant accessions, namely Gossypium arboreum (XP_017638014.1), Vernicia fordii (ARV78456.1), Hibiscus sabdariffa (ALB35017.1), Durio zibethinus (XP_022741107.1), Althaea officinalis (UOI87842.1), Dimocarpus longan (ABO48521.1), Arachis hypogaea (XP_025694592.1), Boehmeria nivea (QBC98316.1), Ziziphus jujuba (XP_015889639.1), and Litchi chinensis (ADO95201.1), showed 30 % similarity relative to the reference model (Fig. 8(C) - 8(L)).
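The text does not state how the identity values relative to the reference model were calculated, so the following is only a minimal sketch of one common convention: percent identity counted over the aligned columns in which neither sequence has a gap. The function name `percent_identity` and the two toy sequences are hypothetical and are not taken from the analyzed genomes; other tools may use a different denominator (for example, the full alignment length).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment,
    so they have equal length and use '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns without gaps
    matches = 0   # columns with identical residues
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real F3H or anthocyanidin synthase data).
reference = "MKTAYIAK-QR"
candidate = "MKTAYLAKGQR"
identity = percent_identity(reference, candidate)
print(f"{identity:.1f} % identity")
```

In this toy case, 10 columns are gap-free and 9 of them match. Reporting the denominator alongside the percentage makes values such as those in Fig. 8 easier to compare across tools.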
In none of the models generated for the plant accessions was it possible to identify the binding sites for 2-oxoglutaric acid (AKG) and 2-(N-morpholino)-ethanesulfonic acid (MES). However, within the domain corresponding to the 2OG-Fe(II) oxygenase superfamily, a region of 136 aa was conserved among the analyzed plant accessions (Fig. 8(M)). The 2-oxoglutarate iron-dependent domain defines many classes of flavonoid enzymes. Enzymes with 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase domains include the C-terminal region of the prolyl 4-hydroxylase alpha subunit. The holoenzyme (EC 1.14.11.2) is a mixed-function oxygenase that catalyzes the hydroxylation of a prolyl-glycyl-containing peptide, usually in protocollagen, to a hydroxyprolyl-glycyl-containing peptide. The enzyme uses molecular oxygen with a concomitant oxidative decarboxylation of 2-oxoglutarate to succinate. The full enzyme consists of an alpha2-beta2 complex in which the alpha subunit contributes most of the active site. The family also includes lysyl hydrolases, isopenicillin synthases and AlkB.
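One simple way to locate a conserved stretch such as the one described above is to scan a multiple sequence alignment for the longest run of columns in which the residues agree. The sketch below is illustrative only and is not the procedure used in this study: the function `longest_conserved_run`, its `min_fraction` parameter, and the toy alignment are all hypothetical.

```python
from collections import Counter


def longest_conserved_run(alignment, min_fraction=1.0):
    """Return (start, length) of the longest run of conserved columns.

    A column counts as conserved when its most common symbol is not a gap
    and is present in at least `min_fraction` of the sequences.
    Assumption: all rows come from one alignment and share the same length.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have equal length")

    best_start, best_len = 0, 0
    run_start = None
    # Iterate one position past the end so a run reaching the last column is closed.
    for col in range(width + 1):
        conserved = False
        if col < width:
            column = [row[col].upper() for row in alignment]
            residue, count = Counter(column).most_common(1)[0]
            conserved = residue != "-" and count / len(column) >= min_fraction
        if conserved:
            if run_start is None:
                run_start = col
        elif run_start is not None:
            run_len = col - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    return best_start, best_len


# Hypothetical toy alignment (not real 2OG-Fe(II) oxygenase sequences).
toy_alignment = [
    "MAHWDPGSLTILQ",
    "MSHWDPGSLTVLQ",
    "MA-WDPGSLTILQ",
]
start, length = longest_conserved_run(toy_alignment)
conserved_block = toy_alignment[0][start:start + length]
print(start, length, conserved_block)
```

Lowering `min_fraction` below 1.0 tolerates occasional substitutions, which may be more realistic for a region spanning over a hundred residues across distantly related accessions.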
As shown in this study, although a large region of the 2-oxoglutarate iron-dependent domain was conserved among the analyzed plant accessions, modeling of their three-dimensional structures indicated changes in the aa sequences of the binding sites for 2-oxoglutaric acid (AKG) and 2-(N-morpholino)-ethanesulfonic acid (MES), possibly reflecting point mutations affecting these residues. These enzymes are in constant evolution, and mutations that favor enzyme specialization may become fixed depending on the secondary metabolism capabilities of each plant species [43-45]. Studies aimed at identifying specific changes in these secondary metabolism enzymes could be a near-term project that would allow these enzymes to be used in genetic engineering to improve flavonoid production in plants.
Conclusion
Obtaining bioactive compounds from plants and applying them in the food industry and medicine is a reality today. Species belonging to the family Cactaceae could be used to extract different bioactive compounds, and the results of bioinformatic analyses would allow us to advance in the search for enzymes related to secondary metabolism. However, little progress has been made in sequencing the genomes of members of the family Cactaceae. In this study, we showed that bioinformatics tools make it possible to select candidate genes that participate in secondary metabolism and that could, in the future, be used for genetic improvement of plants aimed at enhancing the yields of active compounds of interest.