Discovery of a new class of small RNAs
Our genome encodes thousands of genes responsible for a myriad of cellular functions. Regulating the expression levels of these genes is crucial for development and homeostasis1. Around 80% of the human genome is transcribed, but only 2% codes for proteins. One result of this transcription is the production of thousands of non-coding RNAs (ncRNAs)2. While the exact number of ncRNAs in the human genome is unknown, transcriptomic and bioinformatic studies suggest that there may be thousands of them2. ncRNAs are classified into two types: long non-coding RNAs (lncRNAs), which are longer than 200 nucleotides, and small non-coding RNAs (sncRNAs), which are 20-35 nucleotides long3. The most widely studied sncRNAs are microRNAs (miRNAs) and small interfering RNAs (siRNAs). In contrast, because they were discovered more recently, the PIWI-associated small RNAs (piRNAs) have not yet been studied in comparable depth4.
Both siRNAs and miRNAs associate with the Argonaute family of proteins to perform their functions and act as guides that regulate mRNA stability, protein synthesis, chromatin organization, and genome structure5,6. The Argonaute family is divided into two sub-families of proteins: Argonaute (AGO) and PIWI (P-element Induced Wimpy testis)1. The proteins of the PIWI sub-family participate predominantly in germline-specific events. However, the initial study of the Drosophila gene PIWI determined that its germline function depends on somatic cells of the gonad7, although the functions of this family and the nature of the piRNAs that serve as guides were unknown at the time.
Four independent research groups discovered the piRNAs8-11. They were initially isolated from total RNA extracted from mouse testes. The first observation in a gel stained with ethidium bromide9 or SYBR Gold8 revealed a group of small RNAs approximately 28-32 nucleotides long. The abundance of these RNAs in the testis led to speculation about their association with the PIWI subfamily of proteins, given that these proteins have well-documented, essential roles in germline development and gametogenesis in various animal models1,12,13. Three PIWI proteins (MIWI/PIWIL1, MIWI2/PIWIL4, and MILI/PIWIL2) are found in mice and have essential roles in spermatogenesis, each with a distinct expression pattern14. Initial studies showed that MILI is expressed from the mitotic stage to the pachytene phase of meiosis, whereas MIWI is expressed from the mid-pachytene stage until the formation of early spermatids. Both proteins are expressed at the mid-pachytene stage15.
Based on this knowledge, immunoprecipitation assays8 were conducted to determine whether these small RNAs depend on MIWI. The assays showed that the small RNAs bound to MIWI but not to AGO2. Meanwhile, assays with MIWI knockout mice showed that the expression of these RNAs decreased. These results suggest that the expression of this new class of RNAs depends on MIWI. For this reason, they were named PIWI-interacting RNAs (piRNAs)9. Similar studies that explored the association of MILI with this new class of small RNAs observed RNAs of 26-28 nucleotides associated with MILI7. These findings indicated two classes of piRNAs: the first, with a length of 28-32 nucleotides, associated primarily with MIWI, and the second, with a length of 26-28 nucleotides, associated with MILI.
Upon characterizing these RNAs in greater detail, researchers found that piRNAs show a preference for uracil in the first position8 and are distributed unevenly across the chromosomes. The piRNAs were found to be encoded as follows: 17.6% on chromosome 17; 11.6% on chromosome 5; 10.7% on chromosome 4; and 10.2% on chromosome 2. Only two piRNAs were found on the X chromosome, and almost none on chromosomes 1, 3, 16, 19, and Y9. Other observations showed that the vast majority of piRNAs (96%)8 are grouped in genomic loci ranging from < 1 kb to > 100 kb in size, each containing between 100 and 4500 piRNAs. These groups are known as "clusters"7.
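The two sequence features described above, the 5' uracil preference and the characteristic length range, can be summarized from a set of small RNA reads with a few lines of Python. This is only a minimal sketch: the function names and the toy sequences are hypothetical and are not taken from the cited studies.

```python
from collections import Counter


def first_base_fraction(reads, base="U"):
    """Return the fraction of reads whose 5' nucleotide is `base`.

    DNA-style reads are accepted: T is treated as U.
    """
    normalized = [r.upper().replace("T", "U") for r in reads if r]
    if not normalized:
        return 0.0
    return sum(r[0] == base for r in normalized) / len(normalized)


def length_distribution(reads):
    """Count how many reads have each length (in nucleotides)."""
    return Counter(len(r) for r in reads)


# Made-up sequences for illustration only (lengths 29, 30, 27, 29).
toy_reads = [
    "U" + "AGCU" * 7,
    "U" + "GCAU" * 7 + "A",
    "A" + "CGUA" * 6 + "GG",
    "T" + "ACGT" * 7,  # DNA alphabet; counted as starting with U
]

print(first_base_fraction(toy_reads))
print(sorted(length_distribution(toy_reads).items()))
```

On real data the same two summaries would normally be computed after adapter trimming and mapping, which this sketch does not cover.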
Processing of piRNAs
The biogenesis of piRNAs has been studied principally in Drosophila melanogaster, Caenorhabditis elegans, and Mus musculus16,17. In contrast to miRNAs and siRNAs, whose biogenesis depends on the Dicer and Drosha enzymes to convert their double-stranded precursors into small functional RNAs, piRNAs originate from single-stranded RNA precursors and do not require Dicer or Drosha. Instead, piRNAs undergo an alternative type of processing composed of two pathways: the primary processing pathway and the secondary "ping-pong cycle" (Figure 1)5,18,19.
Primary biogenesis
piRNAs derive from protein-coding genes, clusters, active transposable elements, lncRNAs, transfer RNAs, and small nucleolar RNAs20-24. The single-stranded piRNA precursors are transcribed and processed to generate intermediate piRNAs. Later, they are transported through the nuclear envelope to processing sites in the cytoplasm25. In germ cells, processing is believed to take place in multiprotein, perinuclear structures called "nuage," or the chromatoid body, and delivery of the piRNA transcripts from the clusters to the processing sites requires the DEAD-box helicase associated with U2AF65 (UAP56, also called Hel25E)26,27. In contrast, in somatic cells of ovaries and testes, piRNAs are produced in structures called Yb bodies. These structures are frequently associated with mitochondria27.
Following their export to the cytoplasm, intermediate piRNAs are processed at their 5' end by the Zucchini/MITOPLD nucleases in D. melanogaster and M. musculus28-30. Additional observations indicate that primary biogenesis in these organisms depends on other conserved factors, such as Minotaur (Mino)/GPAT2 and Gasz31,32. In addition, the helicase MOV10L1 (the mouse homolog of Drosophila Armitage) is associated with the first cleavage step of piRNA processing, and its function has been related to remodeling the secondary structures of the precursors33,34. Interestingly, all these proteins, except Armi/MOV10L1, localize to the outer mitochondrial membrane, suggesting an essential function of mitochondria in the primary processing of piRNAs32,35-39.
Intermediate piRNAs bind to the PIWI proteins, a step that requires heat shock protein 90 (Hsp90) and the co-chaperone Shutdown (Shu)40-42. The current model of piRNA biogenesis suggests that the characteristic size of mature piRNAs is a consequence of the binding of intermediate piRNAs to PIWI proteins, followed by trimming of their 3' ends by the exonuclease Nibbler43.
One report suggests that the Yb protein, which contains Tudor domains, binds directly to intermediate piRNAs via its N-terminal domain and shows homology with DEAD-box helicases44. Since germ cells do not possess Yb bodies, the function of Yb is probably carried out there by two homologs known as Brother of Yb (BoYb) and Sister of Yb (SoYb)44. Vreteno, another protein with Tudor domains, is essential for piRNA biogenesis in germ cells and follicle cells because it enables the correct localization of the PIWI proteins20,42,44,45.
In the final step, Hen1 methylates the PIWI-bound intermediate piRNAs at their 3' ends to generate mature piRNAs1,46. This modification appears to be protective, since it is found in the majority of the sncRNAs that guide Argonaute proteins to their target sequences via near-perfect complementarity to cleave the target transcript43,47,48.
Secondary biogenesis
Alternatively, mature piRNAs can act as guides for the generation of secondary piRNAs. Secondary biogenesis, first described in D. melanogaster and known as the ping-pong cycle25, constitutes an adaptive piRNA amplification pathway and initiates the degradation of target elements and transposon mRNAs through post-transcriptional silencing25. Primary piRNAs, which typically begin with uridine at their 5' end (1U) and are bound to Aubergine (AUB), show complementarity over ten nucleotides with the secondary piRNAs, which usually contain adenosine at position 10 (10A) and are bound to Argonaute 3 (AGO3)27. This complementarity drives the amplification that generates new secondary piRNAs, which occurs as a ping-pong cycle between the sequences associated with AGO3 and AUB27,49,50.
The antisense primary piRNAs from clusters associate with AUB and detect and cleave RNA transcripts to produce the 5' end of new sense piRNAs. After the sense piRNAs bind to AGO3, the resulting complex recognizes and cleaves the transcripts from clusters, thus generating more antisense piRNAs with sequences similar or identical to the original piRNA, which can bind again to AUB to complete the ping-pong cycle25,51,52. The piRNAs generated in this cycle adapt to the target through variation in their sequence53,54. This pathway leads to a target-dependent amplification of piRNAs and the expansion of diverse piRNA sequences46,55.
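The 1U/10A relationship follows directly from the ten-nucleotide complementarity between the 5' ends of the two partners: the uridine at position 1 of one piRNA pairs with the adenosine at position 10 of the other. The sketch below checks this "ping-pong signature" for a pair of sequences. It is a simplified illustration, assuming a direct comparison of sequences; the function names and example sequences are hypothetical, and in practice the overlap is usually assessed from the genomic coordinates of mapped reads.

```python
# Complement table for RNA bases.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def _normalize(seq):
    """Upper-case the sequence and convert DNA-style T to U."""
    return seq.upper().replace("T", "U")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return _normalize(rna).translate(_COMPLEMENT)[::-1]


def is_ping_pong_pair(primary, secondary, overlap=10):
    """True if the first `overlap` nt of the two piRNAs are reverse complements."""
    primary, secondary = _normalize(primary), _normalize(secondary)
    if len(primary) < overlap or len(secondary) < overlap:
        return False
    return primary[:overlap] == reverse_complement(secondary[:overlap])


def has_1u_10a(primary, secondary):
    """True if the primary piRNA starts with U and the secondary has A at position 10."""
    primary, secondary = _normalize(primary), _normalize(secondary)
    return len(secondary) >= 10 and primary[:1] == "U" and secondary[9] == "A"


# Made-up sequences for illustration only.
primary = "UGCAUCGGAU" + "ACCGGUUAGCAUCGGAU"    # antisense, 1U
secondary = "AUCCGAUGCA" + "GGCUUAACGGAUCCAGU"  # sense, 10A

print(is_ping_pong_pair(primary, secondary))
print(has_1u_10a(primary, secondary))
```

Because the first ten nucleotides are reverse complements, any pair that passes `is_ping_pong_pair` and has a 1U primary necessarily has a 10A secondary; the two checks are kept separate only to make that relationship explicit.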
A recent study showed that the ping-pong cycle can function independently of Zucchini, the nuclease used in piRNA processing56. In the absence of Zucchini, the 5' end of a piRNA is typically generated via slicing, but its corresponding 3' end is modified by Nibbler (Figure 1).
Regulation of gene expression
The interaction of piRNAs with the proteins of the PIWI sub-family forms a ribonucleoprotein known as the piRNA-induced silencing complex (piRISC), which can recognize and silence complementary sequences at the transcriptional and post-transcriptional levels57,58.
Transcriptional silencing
Various studies have analyzed the role of PIWI proteins in transcriptional silencing in D. melanogaster, and some have demonstrated that the nuclear localization of the piRISC complex is necessary for the silencing of transposable elements59. A loss of PIWI proteins decreases the H3K9me3 mark (trimethylation of lysine 9 of histone H3) and increases the binding of Pol II at transposable element promoters16,60. Together, these findings suggest a model for transcriptional silencing in which PIWI translocates to the nucleus by interacting with the transcripts, leading to a heterochromatic conformation and transcriptional repression61. Transcriptional silencing by piRISC also requires the GTSF-1/Asterix protein, which interacts directly with PIWI and is necessary for establishing the H3K9me3 chromatin mark62,63. Panoramix (Panx, also called Silencio) has also been found to bind PIWI and to help form heterochromatin through the H3K9 methyltransferase Eggless and its co-factor, Windei64,65. Recent studies66 have shown that the SUMO E3 ligase Su(var)2-10 induces local sumoylation, leading to the recruitment of the Eggless/Windei complex. These results indicate a novel SUMO pathway in piRNA-related transcriptional regulation.
The piRISC complex generally recruits Su(var)3-9, a histone methyltransferase (HMT) responsible for the methylation of lysine 9 on histone 3 (H3K9) at specific genomic targets, and heterochromatin protein 1 (HP1), which binds to methylated H3K9 to maintain and propagate epigenetic silencing; in this way, it blocks transcription30,67,68. In addition, lysine-specific demethylase 1 (Lsd1) removes the dimethylation of lysine 4 on histone 3 in the promoter regions of transposons, thus promoting their efficient suppression69. Other observations show that the Maelstrom (Mael) protein is necessary for the inhibition of Pol II, whereas its RNase activity seems dispensable for transposon silencing60,70.
In mammals, in contrast, transcriptional silencing is achieved through histone modification and DNA methylation, which is one of the primary mechanisms of piRNA-mediated silencing71-73. The piRNA/PIWI complex recruits DNA methyltransferase (DNMT) to methylate CpG sites in genes, altering transcriptional activity30. In mice, the two PIWI proteins MILI and MIWI are required for DNA methylation of transposable elements73. In the testes of mouse embryos, MIWI2 enters the nucleus through interaction with MILI to promote the establishment of methylation at CpG sites in transposon DNA71-74. Studies have also observed that the ping-pong cycle continues in MIWI2 mutants, while MILI directs DNA methylation via a mechanism that is independent of MIWI275. A recent study identified a MIWI2-associated protein (SPOCD1) that is required for piRNA-guided methylation and silencing of transposable elements76. This study provided the first mechanistic insight into mammalian piRNA-directed methylation. Despite all these findings, the cascade of events leading to transcriptional silencing in mammals is not yet understood in detail.
Post-transcriptional silencing
The cleavage capacity of the piRISC complex not only contributes to the amplification of piRNA production but can also bring about the post-transcriptional silencing of transposons77. Various studies have demonstrated that this post-transcriptional control is not unique to transposon RNAs but also regulates other RNAs, such as mRNAs, transcribed pseudogenes, and lncRNAs78-80. The post-transcriptional regulation of mRNAs requires the insertion of transposable element-related sequences into mRNA untranslated regions (UTRs), the production of piRNAs from genes with similar sequences (pseudogenes), or low-complementarity targeting of mRNAs by piRNAs produced from transposable elements or repeated sequences81.
An increasing number of noncanonical post-transcriptional mechanisms for piRNAs, besides transposon silencing, have been reported in flies, mammals, and other species25,82-90.
In fly testes, where studies led to the discovery of piRNAs, the X-linked Stellate gene is suppressed by a pseudogene on the Y chromosome called Su(Ste)91. In the absence of Su(Ste), the product of the Stellate gene accumulates to form crystalline structures in the spermatocytes that cause infertility. The Su(Ste) locus produces piRNAs that target the Stellate mRNA for degradation91. Significantly, 70% of the piRNAs associated with AUB in fly testes are Su(Ste) piRNAs83.
Relatedly, piRNAs associated with MIWI in mammals are responsible for eliminating mRNAs in mouse elongating spermatids86,92. These piRNAs form a complex with the CAF1 protein and select their target mRNAs by partial complementarity at the 3' UTR, thus promoting their deadenylation and degradation30,86.
Observations show that some PIWI proteins localize to P bodies or co-localize with components of those bodies93,94. A study conducted with ovaries of D. melanogaster demonstrated that a small fraction of AUB is found in P bodies and that transposon transcripts localize to P bodies in an AUB-dependent manner95. Thus, the components of P bodies may contribute to the degradation of the transcripts targeted by AUB. Similarly, studies showed that MIWI2 also localizes to P bodies96,97, but this localization is not seen in mice deficient in piRNA biogenesis98. This finding suggests that piRNAs are necessary for the localization of MIWI2 to P bodies. To date, these findings are still valid14 (Figure 2).
Alterations in piRNA expression and associated diseases
Numerous studies have demonstrated that alterations in piRNA expression can either promote or inhibit the development of various diseases, especially certain types of cancer, including breast, gastric, lung, prostate, colorectal, renal, and bladder cancer, and multiple myeloma (Table 1)80,99-110. However, research on the participation of piRNAs in diseases unrelated to cancer, such as respiratory ailments, is scarce.
piRNA | Type of cancer | Expression | Potential clinical utility | Reference |
---|---|---|---|---|
piR-4987 | Breast cancer | High | Diagnostic tool | Huang et al.99 |
piR-20365 | Breast cancer | High | Prognosis biomarker | Huang et al.99 |
piR-20485 | Breast cancer | High | Prognosis biomarker | Huang et al.99 |
piR-20582 | Breast cancer | High | Prognosis biomarker | Huang et al.99 |
piR-36712 | Breast cancer | Low | Prognosis biomarker/ therapeutic target | Tan et al.80 |
piR-651 | Gastric cancer | Low | Diagnostic tool | Cui et al.100 |
piR-823 | Gastric cancer | Low | Therapeutic target | Cui et al.100 |
piR-41927 | Gastric cancer | High | Diagnostic tool/ prognosis biomarker | Lin et al.101 |
piR-38581 | Gastric cancer | High | Diagnostic tool/ prognosis biomarker | Lin et al.101 |
piR-651 | Lung cancer | High | Diagnostic tool/ prognosis biomarker | Li et al.102 |
piR-34871 | Lung cancer | High | Diagnostic tool/ therapeutic target | Reeves et al.103 |
piR-52200 | Lung cancer | High | Diagnostic tool/ therapeutic target | Reeves et al.103 |
piR-35127 | Lung cancer | Low | Diagnostic tool/ therapeutic target | Reeves et al.103 |
piR-46545 | Lung cancer | Low | Diagnostic tool/ therapeutic target | Reeves et al.103 |
piR-651 | Prostate cancer | High | Therapeutic target | Öner et al.104 |
piR-823 | Prostate cancer | High | Therapeutic target | Öner et al.104 |
piR-18849 | Colorectal cancer | High | Diagnostic tool/ prognosis biomarker | Yin et al.105 |
piR-19521 | Colorectal cancer | High | Diagnostic tool/ prognosis biomarker | Yin et al.105 |
piR-17724 | Colorectal cancer | High | Diagnostic tool | Yin et al.105 |
piR-1245 | Colorectal cancer | High | Therapeutic target | Weng et al.106 |
piR-32051 | Renal cancer | High | Prognosis biomarker | Li et al.107 |
piR-39894 | Renal cancer | High | Prognosis biomarker | Li et al.107 |
piR-43607 | Renal cancer | High | Prognosis biomarker | Li et al.107 |
piR-34536 | Renal cancer | Low | Prognosis biomarker | Zhao et al.108 |
piR-51810 | Renal cancer | Low | Prognosis biomarker | Zhao et al.108 |
piR-594040 | Bladder cancer | Low | Diagnostic tool/ therapeutic target | Chu et al.109 |
piR-823 | Multiple myeloma | High | Prognosis biomarker/ therapeutic target | Yan et al.110 |
Respiratory diseases and piRNAs
Air pollution is an ongoing challenge for humans because various epidemiological studies associate exposure with adverse effects on health, especially on the pulmonary system, including pulmonary inflammation, greater susceptibility to respiratory infections, and increased risks of cancer, asthma, and chronic obstructive pulmonary disease (COPD). Numerous recent studies have associated changes in the expression of ncRNAs with the development and progression of these diseases111-122. Most studies have focused on analyzing lncRNAs and miRNAs, while only a few have examined the changes in the expression of piRNAs in these diseases. One study used bronchial smooth muscle cells from patients with asthma and healthy subjects123. Observations showed a differential expression (fold change [FC] ≥ 1.3, p < 0.05) of five piRNAs (DQ596390, DQ597484, DQ595186, DQ582264, DQ597347) that could be employed as potential markers of asthma.
Another study evaluated the expression of small RNAs in CD4 T lymphocytes by sequencing124. The findings showed that 12.3% of the sequences obtained corresponded to piRNAs. The authors validated the expression of one piRNA (DQ570728) by RT-qPCR (FC ≥ 1, p < 0.05) and northern blot and then evaluated its function by over-expressing it in CD4 T lymphocytes to test its effect on cytokines. They observed that DQ570728 significantly reduced (FC ≥ 1, p < 0.05) the expression of IL-4 and IL-5, which are involved in the development and maintenance of Th2 lymphocytes. They further analyzed the clinical importance of these results by evaluating the expression of DQ570728 and IL-4 in the serum of patients with asthma and healthy subjects. In this case, they observed that the expression of DQ570728 was significantly lower (p < 0.01) in asthma patients than in healthy individuals, while the expression of IL-4 was significantly higher (p < 0.01). The altered expression of DQ570728 correlated inversely with the expression of IL-4 (r = 0.63) in asthma patients.
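For readers less familiar with correlation analysis, an inverse relationship between two measurements corresponds to a negative Pearson coefficient. The sketch below computes the coefficient from first principles. The values are invented purely for illustration and are not data from the cited study, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Invented relative expression values (arbitrary units), one pair per subject.
toy_pirna = [1.0, 2.0, 3.0, 4.0]
toy_il4 = [8.0, 6.5, 4.0, 2.5]

r = pearson_r(toy_pirna, toy_il4)
print(f"r = {r:.2f}")  # negative: higher piRNA goes with lower IL-4
```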
Another study analyzed the effect of respiratory syncytial virus on exosome composition in the A549 cell line125. The results showed that piRNA content was higher for virus-infected cells (34.7%) than for control cells (3.9%), demonstrating that infection of A549 cells was associated with changes in the piRNA content of the exosomes.
Furthermore, a separate study used small airway epithelial cells exposed to cigarette smoke condensate to determine the composition of small RNAs in extracellular vesicles126. The authors identified a decrease in the expression (p < 0.05) of five piRNAs (piR36705, piR37183, piR59260, piR36924, piR52900) and an increase in the expression (p < 0.05) of two piRNAs (piR31985, piR50603) relative to controls.
Similarly, Sundar et al. analyzed the content of extracellular vesicles in the plasma of smokers, patients with COPD, and non-smokers127. They selected the piRNAs that were differentially expressed (p < 0.01) to compare the three study groups. They identified three piRNAs (piR004153, piR020813, piR020450) that differed between smokers and non-smokers; two piRNAs (piR012753, piR020813) between non-smokers and COPD patients; and four piRNAs (piR004153, piR020813, piR020450, piR016735) between smokers and COPD patients.
These studies demonstrate the differential expression of piRNAs in various diseases, although only one study performed a functional analysis that demonstrated the capacity of piRNAs to regulate the function of other genes. Such functional analysis is therefore essential to understand piRNA function in the development and progression of these diseases and the possibility of using piRNAs as biomarkers and therapeutic targets128,129. Unlike other ncRNAs, for example lncRNAs, piRNAs are not easily degraded and can efficiently pass through the cell membrane130. These characteristics allow piRNAs to be detected in samples that are easy to collect, such as serum, plasma, blood, and urine. One study demonstrated that piR-57125, implicated in renal cancer, is readily detected in serum and plasma samples131.
Perspectives
Because piRNAs were discovered only a decade ago, many functions of the proteins that participate in their biogenesis are still unknown. However, we know that numerous factors participate in the transcriptional and post-transcriptional regulation of transposons and mRNAs. The precise mechanisms involved are still under study, and most published reports focus on elucidating them while setting aside analyses of piRNA expression in different cell lines under distinct types of stress, for example, exposure to components of environmental contamination.
In recent years, evidence has shown that many environmental contaminants alter the epigenome by modifying the state of DNA methylation, histones, or the expression of ncRNAs. Two of the main questions that need to be answered are how different contaminants affect the expression of piRNAs and whether differential piRNA expression has any functional importance for the diseases associated with prolonged exposure. The answers to these questions will help us better understand part of the complex mechanisms through which environmental contaminants generate changes in the genome.
Finally, it is essential to emphasize the potential use of piRNAs as therapeutic targets in various diseases, whether by blocking their expression or taking advantage of their characteristics through synthetic piRNAs capable of blocking protein synthesis by binding to mRNA. These possibilities represent another opportunity with potential applications in the fields of both biomedicine and clinical medicine.