INTRODUCTION
Gastric cancer (GC) is classified into three types (intestinal, diffuse, and mixed) according to Lauren's criteria. Intestinal and diffuse GC exhibit numerous differences in pathology, epidemiology, and etiology. Diffuse GC (DGC) lacks cell adhesion and is characterized by the presence of signet-ring cells1,2. Factors that promote DGC development include the loss of E-cadherin expression and pathogenic variants in the CDH1 gene3.
The E-cadherin protein is a transmembrane glycoprotein responsible for maintaining stable contacts between epithelial cells. It is encoded by the CDH1 gene, located on chromosome 16q22.1; the gene has 16 exons and spans 98,250 bp, and the most common isoform is translated from mRNA variant 1 (4.5 kb), which encodes a protein of 882 aa (NCBI Reference Sequence: NM_004360.3)4,5. E-cadherin expression has been studied by immunohistochemistry (IHC) in DGC patients of different ethnicities, and the percentage of cases with reduced expression of this protein observed worldwide ranges from 41.7% to 100% (Table 1). In Mexico, only one study has been published, reporting that 90% of DGC cases had reduced expression of E-cadherin6.
Population | DGC cases (n) | Cases with low expression (%) | Reference |
---|---|---|---|
Mexican | 18 | 55.6 | This study |
Mexican | 69 | 90.0 | 6 |
Chinese | 75 | 65.3 | 25 |
Egyptian | 11 | 41.7 | 26 |
Egyptian | 22 | 64.0 | 27 |
English | 24 | 54.0 | 28 |
German | 115 | 54.8 | 29 |
Greek | 21 | 55.0 | 30 |
Indian | 30 | 50.0 | 31 |
Iranian | 4 | 100 | 32 |
Italian | 29 | 62.0 | 33 |
Japanese | 47 | 63.8 | 34 |
Japanese | 75 | 55.0 | 35 |
Japanese | 76 | 67.1 | 36 |
Korean | 25 | 80.0 | 37 |
Peruvian | 90 | 60.0 | 38 |
Polish | 30 | 53.3 | 39 |
Romanian | 61 | 82.4 | 40 |
Ukrainian | 33 | 65.8 | 41 |
American | 116 | 58.0 | 42 |
On the other hand, over 700 variants in the CDH1 gene have been described in DGC patients in different populations at the somatic and germline levels7-10, providing valuable information on the spectrum of variants in this gene related to GC pathology. To date, 17 variants in the CDH1 gene have been reported in Mexico, most of which are not pathogenic6,11. Although GC ranks fifth in terms of cancer mortality rates in Mexico12, it has not been sufficiently studied. This study aimed to analyze the expression of E-cadherin and identify CDH1 gene variants in diffuse gastric tumors of Mexican patients.
METHODS
We analyzed paraffin-embedded gastric tumor samples obtained by endoscopic biopsy from 18 patients with a histopathological diagnosis of DGC with signet-ring cells and/or poorly differentiated histology. Patients with early, locally advanced, or metastatic DGC were included in the study. Samples of intestinal-type or mixed-type GC were not included. Samples without sufficient quality and quantity of cells (< 60% tumor cells) for analysis were excluded. Cases were included consecutively, and samples were collected between 2012 and 2017 at the pathology service of Hospital 110 of the Mexican Social Security Institute (IMSS) in the city of Guadalajara, Jalisco, Mexico. This research was approved by the Local Ethics and Research Committees (Comité de Ética en Investigación 13058 and Comité Local de Investigación en Salud 1305) of Western Biomedical Research Center, IMSS.
The paraffin-embedded tissue samples were cut on a microtome into 5 μm-thick sections. The sections were then dewaxed in an oven at 60°C for 1 h, followed by a xylol and ethanol series and finally water. The slides underwent heat-mediated antigen retrieval using the pressure cooker method with citrate buffer at pH 6.0 for 25 min. Tissues were washed in 0.1 M phosphate-buffered saline (PBS) 3 times for 5 min each time. Slides were incubated in PolyDetector AP Blocker (cat. no. BSB 0055; Bio SB, Inc., Santa Barbara, CA, United States of America [USA]) for 5 min and washed in 0.1 M PBS buffer 3 times for 5 min each time. Antibody incubation was carried out using Novocastra™ Liquid Mouse Monoclonal Antibody anti-E-cadherin (Leica Biosystems Nussloch GmbH, Germany) overnight at 4°C. Tissues were then washed in 0.1 M PBS buffer 3 times for 5 min each time. The immunohistochemical process was continued using a Dako LSAB System-HRP system (K0675; Agilent Technologies, USA); the biotinylated secondary antibody and streptavidin-conjugated horseradish peroxidase were used according to the manufacturer's protocol. Immunodetection was visualized using diaminobenzidine tetrahydrochloride (DAB; Sigma-Aldrich; Merck Millipore, Darmstadt, Germany). Slides were counterstained with hematoxylin for 2 min (cat. no. BSB 0024; Bio SB, Inc.) and mounted with Entellan® (cat. no. 107960; Merck Millipore). The staining results were scored on a semiquantitative scale established by a pathologist based on the percentage of cells with loss of membranous staining and were classified into two categories: normal (≥ 91% of stained cells) and reduced (≤ 90% of stained cells).
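The two-category scoring rule can be written as a small function. This is a minimal sketch that assumes the score is recorded as a percentage of stained cells; the function name `classify_e_cadherin` is hypothetical and is not part of the study's workflow.

```python
def classify_e_cadherin(stained_pct: float) -> str:
    """Return 'normal' or 'reduced' for a percentage of membrane-stained cells."""
    if not 0 <= stained_pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # Cutoff taken from the scoring scale described above:
    # >= 91% stained cells is normal, anything lower is reduced.
    return "normal" if stained_pct >= 91 else "reduced"


# Illustrative percentages only (not individual patient data).
examples = [2, 90, 95, 100]
labels = [classify_e_cadherin(p) for p in examples]
print(labels)
```

A single fixed cutoff keeps the classification reproducible, although the cutoff itself remains a choice made by the pathologist.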
Tumor DNA was isolated using an Invisorb® Spin Tissue Mini Kit (Invitek Molecular GmbH, Germany). The promoter region and 16 exons of the CDH1 gene were amplified by PCR using previously described primers11,13. The resulting fragments were cleaned with ExoSAP-IT™ PCR Product Cleanup Reagent (Thermo Fisher, USA). For Sanger sequencing, a Big Dye Terminator kit v.3.1 and an ABI 310 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA) were used for capillary electrophoresis; the Chromas Lite v.2.6.6 program was used for data analysis. The proportion of cases with reduced E-cadherin expression and the frequencies of CDH1 variants were obtained by direct counting and compared using the Chi-square test; haplotype analysis was performed with Arlequin v3.5 software. Predictive analysis was performed for the variants with unknown pathogenicity or uncertain reports. Four programs were used: NNSplice and NetGene2 for intronic variants14,15, and HOPE and PolyPhen-2 for missense variants16,17.
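As an illustration of a Chi-square comparison of allele frequencies, the sketch below computes the Pearson statistic for a 2 × 2 table of allele counts. It is not the authors' software. The counts are assumptions derived from the frequencies in Table 2 for c.388-83G > A (28 G and 8 A among 36 patient chromosomes; 142 G and 0 A among MXL chromosomes, assuming the 71 individuals stated in the Results). No continuity correction is applied, and with expected counts this small an exact test may be preferable.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: patients, MXL controls. Columns: G allele, A allele (assumed counts).
chi2, p = chi_square_2x2(28, 8, 142, 0)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

With these assumed counts the p value falls below 0.001, in line with the result reported for this variant in Table 2.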
RESULTS
Overall, 18 DGC patients were analyzed, ten women (56%) and eight men (44%). The age range was 28-76 years, with an average of 59 ± 14.3 years. Reduced E-cadherin expression was observed in 56% of the DGC patients, whose values of stained cells ranged between 2% and 90% (X̄ = 54.5%). The remaining 44% showed normal expression, with a percentage of stained cells between 95% and 100% (X̄ = 98.3%).
Through CDH1 gene sequencing, 12 different SNPs were identified: c.−285C > A rs16260, c.−273G > A rs1330727101, c.48 + 6C > T rs3743674, c.388-83G > A rs140449923, c.388-48A > T,G rs766910353, c.531 + 4G > A rs1172442868, c.1321-10T > C,G rs1597897718, c.2076T > C rs1801552, c.2164 + 17dupA rs34939176, c.2253C > T rs33964119, c.2331C > G rs114265540, and c.*54C > T rs1801026 (note that new alleles were observed in our study) (Table 2). Each patient presented a minimum of three variants and a maximum of seven; c.2076T > C was the most common variant and was observed in the homozygous state in all patients.
Location | Variant | Genotypic frequency (this study) | Allelic frequency (this study) | Genotypic frequency (1KGP MXL) | Allelic frequency (1KGP MXL) | ClinVar | New alleles |
---|---|---|---|---|---|---|---|
Promoter | c.-285C>A rs16260 | CC: 0.72, CA: 0.17, AA: 0.11 | C: 0.81, A: 0.19 | CC: 0.60, CA: 0.31, AA: 0.09; p = 0.48 | C: 0.75, A: 0.25; p = 0.49 | B HDGC | NR |
Promoter | c.-273G>A rs1330727101 | GG: 0.89, GA: 0.11, AA: 0.00 | G: 0.94, A: 0.06 | NR | NR | NR^a | NR |
Intron 1 | c.48+6C>T rs3743674 | CC: 0.06, CT: 0.50, TT: 0.44 | C: 0.31, T: 0.69 | CC: 0.03, CT: 0.53, TT: 0.44; p = 0.96 | C: 0.30, T: 0.70; p = 0.92 | B HDGC | NR |
Intron 3 | c.388-83G>A rs140449923 | GG: 0.56, GA: 0.44, AA: 0.00 | G: 0.78, A: 0.22 | GG: 1.00, GA: 0.00, AA: 0.00; p < 0.001 | G: 1.00, A: 0.00; p < 0.001 | NR | NR |
Intron 3 | c.388-48A>G rs766910353 | AA: 0.94, AG: 0.06, GG: 0.00 | A: 0.97, G: 0.03 | NR | NR | NR | G |
Intron 4 | c.531+4G>A rs1172442868 | GG: 0.94, GA: 0.06, AA: 0.00 | G: 0.97, A: 0.03 | NR | NR | U | NR |
Intron 9 | c.1321-10T>G rs1597897718 | TT: 0.83, TG: 0.17, GG: 0.00 | T: 0.92, G: 0.08 | NR | NR | PB | G |
Exon 13 | c.2076T>C, p.Ala692= rs1801552 | TT: 0.00, TC: 0.00, CC: 1.00 | T: 0.00, C: 1.00 | TT: 0.11, TC: 0.42, CC: 0.47; p < 0.001 | T: 0.32, C: 0.68; p < 0.001 | B | NR |
Intron 13 | c.2164+17dupA rs34939176 | AA: 0.11, AAA: 0.89, AAAA: 0.00 | A: 0.56, AA: 0.44 | AA: 0.80, AAA: 0.20, AAAA: 0.00; p < 0.001 | A: 0.90, AA: 0.10; p < 0.001 | B | NR |
Exon 14 | c.2253C>T, p.Asn751= rs33964119 | CC: 0.39, CT: 0.50, TT: 0.11 | C: 0.64, T: 0.36 | CC: 0.80, CT: 0.20, TT: 0.00; p = 0.008 | C: 0.90, T: 0.10; p < 0.001 | B | NR |
Exon 15 | c.2331C>G,T, p.Asp777Glu rs114265540 | CC: 0.83, CG: 0.17, GG: 0.00 | C: 0.92, G: 0.08 | CC: 1.00, CG: 0.00, GG: 0.00; p = 0.03 | C: 1.00, G: 0.00; p = 0.009 | U | NR |
UTR 3' | c.*54C>T rs1801026 | CC: 0.56, CT: 0.44, TT: 0.00 | C: 0.78, T: 0.22 | CC: 0.66, CT: 0.31, TT: 0.03; p = 0.78 | C: 0.81, T: 0.19; p = 0.64 | PB/B | NR |
1KGP: 1000 Genomes Project; B: benign; NR: not reported; PB: probably benign; U: uncertain; HDGC: hereditary diffuse gastric cancer. ^a Variant previously analyzed (58); the allele “A” could modify the binding sites of the transcription factors IRF3 and SP2.
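The allelic frequencies in Table 2 follow from the genotype counts by direct counting of chromosomes. The sketch below shows this for c.-285C>A; the genotype counts (13 CC, 3 CA, 2 AA) are an assumption back-calculated from the reported genotypic frequencies and the 18 patients, and the function name is hypothetical.

```python
def allele_frequencies(hom_ref: int, het: int, hom_alt: int) -> tuple[float, float]:
    """Return (reference, alternate) allele frequencies from diploid genotype counts."""
    chromosomes = 2 * (hom_ref + het + hom_alt)
    # Each homozygote carries two copies of its allele; each heterozygote carries one of each.
    ref = (2 * hom_ref + het) / chromosomes
    alt = (2 * hom_alt + het) / chromosomes
    return ref, alt


# Assumed counts for c.-285C>A in 18 patients: CC = 13, CA = 3, AA = 2.
ref_freq, alt_freq = allele_frequencies(13, 3, 2)
print(round(ref_freq, 2), round(alt_freq, 2))  # 0.81 0.19
```

The result agrees with the C: 0.81 and A: 0.19 values listed for this variant.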
A comparative analysis was performed between the genotype frequency distribution of each of the variants and E-cadherin protein expression. Results showed that the genotypes were similarly distributed among patients with normal and reduced expression (p > 0.05); no variant was related to the reduced expression of E-cadherin.
In addition, haplotypes were established with the 12 identified variants (ordered from 5' to 3') to relate them to the expression of E-cadherin. The results showed 21 different combinations, and the most frequent was CGTGAGTCACCC (36.1%, n = 13 chromosomes). This haplotype comprises wild-type alleles at all sites, except the variants c.48 + 6C > T and c.2076T > C (third and eighth positions). The other haplotypes were observed with very low frequencies (<6%). The distribution of haplotype frequencies between patients with normal versus reduced E-cadherin expression did not show significant differences (p = 1.0).
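To make the haplotype notation concrete, the following sketch compares a 12-letter haplotype with the wild-type allele at each site, ordered from 5' to 3' as in Table 2. The one-letter encoding of the c.2164+17dupA site ("A" meaning no duplication) is an assumption made for illustration, as is the helper name `variant_sites`.

```python
# Variant sites in 5'->3' order, each with its wild-type (reference) allele.
SITES = [
    ("c.-285C>A", "C"),
    ("c.-273G>A", "G"),
    ("c.48+6C>T", "C"),
    ("c.388-83G>A", "G"),
    ("c.388-48A>G", "A"),
    ("c.531+4G>A", "G"),
    ("c.1321-10T>G", "T"),
    ("c.2076T>C", "T"),
    ("c.2164+17dupA", "A"),  # assumed encoding: "A" = no duplication
    ("c.2253C>T", "C"),
    ("c.2331C>G", "C"),
    ("c.*54C>T", "C"),
]


def variant_sites(haplotype: str) -> list[tuple[int, str]]:
    """Return (1-based position, variant name) where the haplotype differs from wild type."""
    if len(haplotype) != len(SITES):
        raise ValueError("haplotype length must match the number of sites")
    return [
        (pos, name)
        for pos, (allele, (name, wild_type)) in enumerate(zip(haplotype, SITES), start=1)
        if allele != wild_type
    ]


changed = variant_sites("CGTGAGTCACCC")
print(changed)  # [(3, 'c.48+6C>T'), (8, 'c.2076T>C')]
```

For the most frequent haplotype, only the third and eighth positions differ from wild type, as stated above.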
Genotypic and allelic frequencies were compared with the data reported in a population with Mexican ancestry residing in Los Angeles, USA (MXL, n = 71 individuals) in the 1000 Genomes Project (1KGP)18 to determine whether any of the identified variants could be associated with the risk of developing DGC. Five variants, c.388-83G > A, c.2076T > C, c.2164 + 17dupA, c.2253C > T, and c.2331C > G, showed statistically significant differences (p < 0.05) (Table 2). The odds ratios (OR) were calculated using the Cochran-Armitage trend test19 under classical Mendelian inheritance patterns (dominant, recessive, and codominant). The association or risk was evaluated with the possible genotypes (wild-type, heterozygous, or homozygous) for each variant. The results were significant for c.2164 + 17dupA (OR = 31.4, 95% CI 6.39-154.08, p = 0.00001) under the dominant model (dupA/dupA or dupA/A vs. A/A), and for c.2253C > T (OR = 6.16, 95% CI 2.00-19.01, p = 0.0008 and OR = 3.92, 95% CI 1.30-11.86, p = 0.0120) under the dominant (T/T or T/C vs. C/C) and codominant (T/C vs. T/T or C/C) models, respectively.
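For readers unfamiliar with the dominant model, the sketch below computes an odds ratio with a Wald 95% confidence interval from a 2 × 2 table of carriers versus non-carriers. It is a generic calculation, not the authors' procedure. The patient counts (16 carriers, 2 non-carriers) are inferred from the genotypic frequencies of c.2164 + 17dupA in 18 patients; the control counts (13 carriers, 51 non-carriers) are not given in the text and are assumed values chosen to be consistent with the reported frequencies of about 0.20 and 0.80.

```python
import math


def odds_ratio_ci(case_exp: int, case_unexp: int, ctrl_exp: int, ctrl_unexp: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Odds ratio with a Wald confidence interval (z = 1.96 gives about 95%)."""
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / case_exp + 1 / case_unexp + 1 / ctrl_exp + 1 / ctrl_unexp)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Dominant model: carriers (dupA/dupA or dupA/A) vs. non-carriers (A/A).
# Control counts are assumptions, see the paragraph above.
or_value, ci_low, ci_high = odds_ratio_ci(16, 2, 13, 51)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

With these assumed counts the estimate is close to the OR reported for c.2164 + 17dupA under the dominant model; the wide interval reflects the small number of non-carrier patients.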
Predictive analysis results showed that the intronic variants c.388-83G > A, C and c.388-48A > G did not affect splicing (delta value= 0), whereas for the c.531 + 4G > A variant, there was an increase in the donor site and a delta value = −0.58, which might interfere with the recognition of the splice donor site of exon 4. On the other hand, PolyPhen-2 revealed that the missense variant Asp777Glu (c.2331C > G) was probably damaging, with a value of 0.985, indicating a functional impact on the protein E-cadherin; this variant was observed in two patients with reduced E-cadherin expression and one with normal expression. The HOPE program, which analyzes the structural and functional effect of a variant on the protein, showed that the secondary structure could be destabilized and be damaging.
DISCUSSION
The most well-known alteration associated with DGC development is loss of a functional CDH1 gene, whose product is the protein E-cadherin. The CDH1 gene is commonly inactivated by somatic mutations, methylation of promoters or enhancers, or chromosomal rearrangements; consequently, E-cadherin protein shows an abnormal distribution or is reduced or absent20-24. To date, we found in the literature 19 reports worldwide that have identified reduced E-cadherin expression in DGC patients using IHC (Table 1)6,25-42. The percentage of cases with reduced expression of this protein is in the range of 41.7-100%. In this study, we found that 56% of DGC patients had reduced E-cadherin protein; we compared our results only with those reported for other populations, where they used a scale similar to ours. Our results are similar to studies performed in Indian (50.0%, p = 0.729)31, Japanese (63.8%, p = 0.539)34, Korean (80.0%, p = 0.085)37, and Romanian populations (82.4%, p = 0.087)40; however, they are statistically different from results in Iranian (100%, p = 0.018)32 and Mexican (90%) populations6. These differences are probably due to the distinct types of scales used to evaluate E-cadherin expression; for instance, the categories normal, abnormal (atypical or heterogeneous), and negative were used in the previous Mexican study6, whereas in this work, the categories normal and reduced were used. The methodology used in the studies may be the primary source of heterogeneity in the results since the type of antibody and dilution is different in each study; therefore, the sensitivity of IHC may depend on this factor. In addition, the cutoff point that determines whether the expression of E-cadherin is preserved depends on the pathologist's criterion and usually varies between 5% and 90%43. 
Visual assessment of immunohistochemistry is subjective and has a limited dynamic range because visually obtained variables are categorical, and multiple studies have demonstrated low inter- and intraobserver agreement44.
Digital pathology can help to eliminate bias when counting to assess immunohistochemistry markers and has been shown to increase significantly interobserver (between pathologists) and intraobserver (within the same pathologist) agreement45. Some algorithms for evaluating membrane-type stains such as E-cadherin and HER2 are based on color deconvolution and segmentation to distinguish specifically membrane decoration by the antibody. Algorithms with different approaches to immunohistochemistry quantification such as cell-based H-score, pixel-based H-score, and average threshold method have also been developed46. However, coordination and governance through international guidelines is also required for quantification with digital pathology methods. This involves the implementation of common worldwide algorithms for the quantification of immunohistochemistry markers, taking into account the reagents and methods used for staining and the use of robust algorithms for quantification.
Regarding CDH1 variants, it has been reported that up to 38% of sporadic DGC tumors have mutations in the CDH1 gene47. In our study, all patients had CDH1 variants, with a minimum of 3 and a maximum of 7 variants per patient. Twelve SNPs were identified, all of them previously reported in the SNP database, and none was related to reduced E-cadherin expression. This suggests the presence of other mechanisms that silence the gene in these patients, probably methylation, which has also been reported in the CDH1 gene as a cause of reduced E-cadherin expression11,48,49. The Chi-square test50 was used to compare the frequencies of CDH1 variants c.388-83G > A, c.2076T > C, c.2164 + 17dupA, c.2253C > T, and c.2331C > G with those found in the MXL population (Mexican ancestry in Los Angeles, California) reported in the 1000 Genomes database18. Results showed an association with DGC (p < 0.05). Moreover, the variants c.2164 + 17dupA and c.2253C > T were linked with increased risks for DGC; therefore, it is important to continue studying these variants in our population, as well as in other populations. In the literature, some of these variants are reported in association with DGC. For instance, the variant c.2253C > T in a Chinese population may contribute to a predisposition to GC in groups with a family history (OR = 3.19, 95% CI = 1.29-7.91)51, but its association is controversial because no such association was observed in Taiwanese and Iranian populations52,53. For other variants, an association has not been shown, such as c.2076T > C p.Ala692 = rs180155252-54 and c.2164 + 17dupA55,56.
Seven of the 12 identified variants are reported in ClinVar8 as benign or probably benign (Table 2). The rs16260 variant was previously associated with risk of DGC in our population, with an OR of 1.98 (95% CI 1.01-3.98) for heterozygotes and 6.5 (95% CI 2.1-19.6) for homozygotes57, and the others are reported as having uncertain significance (c.531 + 4G > A and c.2331C > G) or have not been reported (c.−273G > A, c.388-83G > A,C, and c.388-48A > G), all of which were included in the predictive analysis. We reported in a previous study that allele A of variant c.−273G > A could modify the binding sites of the transcription factors IRF3 and SP258. Prediction for the intronic variants c.388-83G > A,C and c.388-48A > G with the NNSplice and NetGene2 programs showed that splicing would not be affected. Analysis of variant c.531 + 4G > A located in intron 4 indicated a possible loss of the splice site, although in vitro and in vivo studies are needed. However, analysis in cell lines of a variant located at the adjacent site (c.531 + 3A > G) showed that splicing is not altered in the presence of the allele G59. Considering these results, it is probable that variant c.531 + 4G > A does not modify the splice site. On the other hand, the missense variant c.2331C > G (p.Asp777Glu) was predicted by our analysis to be damaging, as the secondary structure resulting from the change of aspartic acid to glutamic acid at residue 777 could destabilize the structure of the E-cadherin protein; moreover, the change from the medium (Asp) to the large (Glu) side chain could interfere with the interaction with p120 catenin (amino acids 736-781), whose function is to stabilize E-cadherin and prevent its entry into degrading endocytic pathways. In the absence of p120 catenin, E-cadherin interacts with the Hakai protein (binding sites at amino acids 734-798), which promotes internalization and the subsequent targeting of E-cadherin to the plasma membrane or to degradation60.
Finally, all gastric tumors of the DGC patients studied here had somatic CDH1 gene variants; however, the c.2164 + 17dupA, c.2253C > T, and c.2331C > G variants were the most relevant in relation to DGC.