Genome-wide DNA copy number analysis in pancreatic cancer using high-density single nucleotide polymorphism arrays

Oncogene. Author manuscript; available in PMC 2008 Jul 31.

PMCID: PMC2492386

EMSID: UKMS1481

T Harada

1Centre for Molecular Oncology, Cancer Research UK, Institute of Cancer, Barts and The London School of Medicine and Dentistry, Queen Mary, University of London, London, UK

C Chelala

1Centre for Molecular Oncology, Cancer Research UK, Institute of Cancer, Barts and The London School of Medicine and Dentistry, Queen Mary, University of London, London, UK

V Bhakta

1Centre for Molecular Oncology, Cancer Research UK, Institute of Cancer, Barts and The London School of Medicine and Dentistry, Queen Mary, University of London, London, UK

T Chaplin

2Centre for Medical Oncology, Cancer Research UK, Institute of Cancer, Barts and The London School of Medicine and Dentistry, Queen Mary, University of London, London, UK

K Caulee

1Centre for Molecular Oncology, Cancer Research UK, Institute of Cancer, Barts and The London School of Medicine and Dentistry, Queen Mary, University of London, London, UK

P Baril

1Centre for Molecular Oncology, Cancer Research UK, Institute of Cancer, Barts and The London School of Medicine and Dentistry, Queen Mary, University of London, London, UK

BD Young

2Centre for Medical Oncology, Cancer Research UK, Institute of Cancer, Barts and The London School of Medicine and Dentistry, Queen Mary, University of London, London, UK

NR Lemoine

1Centre for Molecular Oncology, Cancer Research UK, Institute of Cancer, Barts and The London School of Medicine and Dentistry, Queen Mary, University of London, London, UK

Correspondence: Professor NR Lemoine, Centre for Molecular Oncology, Cancer Research UK, Institute of Cancer, Barts and The London School of Medicine and Dentistry, Queen Mary, University of London, Charterhouse Square, London, EC1M 6BQ, UK. E-mail: nick.lemoine@cancer.org.uk

Supplementary Materials

Figure Legend.

Suppl Figure 1.

Suppl Table 1.

Suppl Table 2.

Suppl Table 3.

Suppl Table 4.

Suppl Table 5.

Suppl Table 6.

Abstract

To identify genomic abnormalities characteristic of pancreatic ductal adenocarcinoma (PDAC) in vivo, a panel of 27 microdissected PDAC specimens was analysed using high-density microarrays representing ∼116 000 single nucleotide polymorphism (SNP) loci. We detected frequent gains of 1q, 2, 3, 5, 7p, 8q, 11, 14q and 17q (≥78% of cases), and losses of 1p, 3p, 6, 9p, 13q, 14q, 17p and 18q (≥44%). Although the results were comparable with those from array CGH, the regions of those genetic changes were defined more accurately by SNP arrays. Integrating the Ensembl public data, we have generated ‘gene’ copy number indices that facilitate the search for novel candidates involved in pancreatic carcinogenesis. Copy numbers in a subset of the genes were validated using quantitative real-time PCR. The SKAP2/SCAP2 gene (7p15.2), which encodes a substrate of the src family kinases, was the most frequently (63%) amplified gene in our sample set and its recurrent overexpression (67%) was confirmed by reverse transcription–PCR. Furthermore, fluorescence in situ hybridization and in situ RNA hybridization analyses for this gene demonstrated a significant correlation between DNA copy number and mRNA expression level in an independent sample set (P<0.001). These findings indicate that the dysregulation of SKAP2/SCAP2, which is mostly caused by its increased gene copy number, is likely to be associated with the development of PDAC.

Keywords: pancreatic cancer, tissue microdissection, SNP array, genetic alterations, SKAP2/SCAP2

Introduction

Pancreatic ductal adenocarcinoma (PDAC) is one of the most challenging malignancies facing oncologists today. Essentially, no conventional treatment has made a significant impact on the course of the disease, so that the prognosis for patients still remains dismal with a median survival of approximately 6 months from diagnosis and an overall 5-year survival of less than 5% (Jemal et al., 2005). There is an urgent need for innovative approaches to early diagnosis and specifically targeted therapies, and this will only be made possible by a comprehensive understanding of the molecular events that make this such an aggressively malignant tumour type.

Genomic alterations can contribute to the dysregulation of expression levels of oncogenes and tumour suppressor genes in cancer cells, the accumulation of which is correlated with tumour progression (Ried et al., 1999; Bardeesy and DePinho, 2002; Li et al., 2004; Lips et al., 2007). The introduction of genotyping by single nucleotide polymorphism (SNP) arrays has allowed genome-wide, high-resolution analysis of both DNA copy number (DCN) alterations and loss of heterozygosity (LOH) events in cancer cells (Lindblad-Toh et al., 2000; Bignell et al., 2004; Huang et al., 2004; Janne et al., 2004; Zhao et al., 2004, 2005; Midorikawa et al., 2006). High-density SNP microarrays permit highly accurate mapping of those genetic changes across the entire genome, providing a promising starting point for the identification of novel candidate genes affected by such genomic abnormalities. In the present study, to identify genomic changes characteristic of PDAC cells in vivo, we analysed a panel of 27 microdissected PDAC tissue samples using the Affymetrix 100 K SNP arrays. Based on the obtained DCN data, we found a novel candidate gene, SCAP2 (SKAP2 is the latest official gene symbol), which is located at the minimal overlapping region of 7p15.2 gain. The general applicability of this observation was prospectively validated in an independent sample set using fluorescence in situ hybridization (FISH) and in situ RNA hybridization (ISH).

Results

Genome-wide analysis of DCN alterations in PDAC cells in vivo

Genome-wide copy number analysis was performed in a total of 27 microdissected PDAC samples (Supplementary Table 1). The average genotype call rate was 96.3±3.4% and 97.0±2.3% in the HindIII and XbaI 50 K SNP arrays, respectively. All the raw data are available in the Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) with the accession number GSE7130.

The Copy Number Analyzer for Affymetrix GeneChip (CNAG) analysis of all 27 PDAC samples identified chromosomal regions of both DCN alterations and LOH throughout the genome. Genetic abnormalities of all chromosomes are summarized in Figure 1 (full details, including published copy number variation (CNV) data, are shown in Supplementary Figure 1). The most frequent genetic gain was detected on 8q (26 out of 27 cases; 96%). Gains of 1q, 2, 3, 5, 7p, 11, 14q and 17q were also identified at a high frequency (≥78% of cases). The most recurrent genetic loss was observed on 9p in 21 out of 27 cases (78%), followed by 18q, 6, 1p, 13q, 17p, 3p and 14q (≥44% of cases). Overall, the patterns of genetic changes identified by SNP arrays were similar to our previous results from both metaphase and array-based comparative genomic hybridization (array CGH) (Harada et al., 2002a, b, 2007). However, with their increased resolution, SNP arrays delineated the physical boundaries of chromosomal breakpoints in PDAC more precisely (Figure 2). As a result, we identified homozygous deletions at 9p21.3 (45 kb) and high-level amplifications in three regions of 8q: 8q24.13–q24.21 (2.2 Mb), 8q24.22 (177 kb) and 8q24.23–q24.3 (2.7 Mb) (≥19% of cases). These were detected as minimal regions of frequent genetic alteration and were therefore considered to be the epicentres of those changes (Supplementary Figure 1).

Figure 1

Overview of genomic changes across all chromosomes in a total of 27 microdissected PDAC tissues, determined by the Affymetrix 100 K SNP arrays (see Supplementary Figure 1 for full details). ‘Genetic gains’ are shown as green bars and ‘losses’ as red bars according to genomic position (Build 35). Thick bars depict ‘high-level amplifications’ and ‘homozygous deletions’. Blue bars indicate ‘LOH regions’ and thick grey bars indicate ‘UPD regions’.

Figure 2

(a) Comparison of the results between CGH and SNP arrays. Upper: chromosome 18 in PC16, determined by array CGH (see Harada et al., 2007). The blue line depicts the smoothed DCN values. Lower: the same sample analysed by SNP arrays, showing more distinct physical boundaries that enabled small DCN alterations to be identified (asterisks). (b) ‘High-level amplification’ detected by SNP arrays (chromosome 11 in PC33). The amplicon size was approximately 1.35 Mb.

DCN alterations in individual genes across the entire genome

By integrating the Ensembl public data with our DCN data, we sought to identify ‘gene’ copy numbers throughout the entire genome of PDAC. The genes located in regions of frequent genetic change are listed in Table 1, and the complete list of copy numbers for all genes is provided in Supplementary Table 2. These gene copy number indices enabled comparison of our results with previously published data. SCAP2 (SKAP2, 7p15.2) was identified as the most frequently amplified gene (63% of cases) in the 27 PDACs; its amplification has not previously been described in any type of cancer. We also detected increased copy numbers of the MYC (8q24.21, 48%), NCOA3/AIB1 (20q13.12, 44%), KRAS (12p12.1, 44%), ERBB2 (17q12, 41%) and EGFR (7p11.2, 33%) genes. On the other hand, two tumour suppressor genes, CDKN2A and CDKN2B, were included in the 9p21.3 locus, which was deleted at the highest frequency (63% of cases). Genetic losses were also detected in genes such as DCC (18q21.1, 48%), SMAD4 (18q21.1, 33%), MAP2K4 (17p12, 30%), TP53 (17p13.1, 26%) and RUNX3 (1p36.11, 22%). Thus, recurrent genetic changes that were previously characterized in PDAC were commonly observed in our analysis (Bardeesy and DePinho, 2002; Li et al., 2004).

Table 1

Frequent gene copy numbers identified by SNP arrays

(A) Increased gene copy numbers
Cytoband  Start (bp)  End (bp)  Gene symbol  Ensembl ID  Total^a  DCN = 3 or 4^b  DCN ≥ 5^c
7p15.2 26479930 26677581 SCAP2 ENSG00000005020 18 17 1
8q24.23 139332265 139449408 NULL ENSG00000169438 18 12 6
1q21.1 142189379 142565232 PDE4DIP ENSG00000178104 17 17 0
1q21.1 142433716 142433772 NULL ENSG00000190646 17 17 0
1q21.1 142433716 142433860 U2 ENSG00000201685 17 17 0
1q21.1 142478353 142480409 Q9H762_HUMAN ENSG00000168681 17 17 0
1q21.1 142585472 142606033 SEC22L1 ENSG00000155878 17 17 0
1q21.1 142628069 142628613 NULL ENSG00000177144 17 17 0
1q21.1 142638784 142638882 U6 ENSG00000201789 17 17 0
1q23.3 159333913 159481893 DDR2 ENSG00000162733 17 17 0
1q23.3 160210932 160211020 NULL ENSG00000193661 17 17 0
1q25.1 172023592 172107588 TNR ENSG00000116147 17 12 5
1q25.2 173163964 173543625 PAPPA2 ENSG00000116183 17 17 0
1q25.2 173317061 173317576 NULL ENSG00000172760 17 17 0
1q25.2 173561861 173865681 ASTN ENSG00000152092 17 17 0
1q25.2 173730156 173730238 hsa-mir-488 ENSG00000202609 17 17 0
1q25.2 173872188 173983209 FAM5B ENSG00000198797 17 17 0
1q25.3 179724210 179846384 LAMC1 ENSG00000135862 17 17 0
1q25.3 179887056 179945696 LAMC2 ENSG00000058085 17 16 1
1q25.3 179949035 180119394 NMNAT2 ENSG00000157064 17 16 1
1q25.3 180173291 180254982 SMG7_HUMAN ENSG00000116698 17 17 0
3q22.1 132736261 133241863 CPNE4 ENSG00000196353 17 16 1
7p15.2 25762765 25762863 mmu-mir-148a ENSG00000199085 17 17 0
7p15.2 25762779 25762846 NULL ENSG00000192920 17 17 0
8q21.11 76482826 76641614 HNF4G ENSG00000164749 17 14 3
8q24.12 120469769 120469890 NULL ENSG00000192356 17 14 3
8q24.12 120469769 120469890 snoACA32 ENSG00000199918 17 14 3
2q35 216050687 216126302 FN1 ENSG00000115414 13 12 1
8q24.21 128817686 128822853 MYC ENSG00000136997 13 9 4
20q13.12 45645347 45715724 NCOA3 ENSG00000124151 12 12 0
12p12.1 25249447 25295121 KRAS ENSG00000133703 12 11 1
17q12 35109920 35138436 ERBB2 ENSG00000141736 11 11 0
7p11.2 54860934 55049239 EGFR ENSG00000146648 9 9 0
8q12.1 61592113 61696183 RAB2 ENSG00000104388 9 6 3
(B) Decreased gene copy numbers
Cytoband  Start (bp)  End (bp)  Gene symbol  Ensembl ID  Total^d  DCN = 1^e  DCN = 0^f
9p21.3 21957751 21984490 CDKN2A ENSG00000147889 17 12 5
9p21.3 21992909 21999312 CDKN2B ENSG00000147883 17 12 5
9p21.3 22002115 22002528 NULL ENSG00000187088 17 12 5
9p21.3 21957137 21957738 NSGX_HUMAN ENSG00000173515 16 11 5
9p21.3 21792635 21921198 MTAP ENSG00000099810 14 10 4
9p21.3 22436840 22442472 DMRTA1 ENSG00000176399 14 11 3
18q21.33 58005521 58125333 KIAA1468 ENSG00000134444 14 14 0
18q21.33 58143566 58204482 TNFRSF11A ENSG00000141655 14 14 0
18q21.33 58149025 58149121 NULL ENSG00000191838 14 14 0
18q21.33 58149025 58149125 Y ENSG00000199867 14 14 0
18q21.33 58262420 58262484 NULL ENSG00000193871 14 14 0
18q21.33 58468521 58468807 NULL ENSG00000192667 14 14 0
18q21.33 58534939 58798645 PHLPP ENSG00000081913 14 14 0
18q21.33 58549439 58549545 NULL ENSG00000193542 14 14 0
18q21.33 58549439 58549545 U6 ENSG00000199195 14 14 0
18q21.33 58656620 58656928 Q9H380_HUMAN ENSG00000171825 14 14 0
18q21.33 58941559 59137489 BCL2 ENSG00000171791 14 14 0
18q21.33 59148813 59185465 FVT1 ENSG00000119537 14 14 0
18q21.33 59207407 59240673 VPS4B ENSG00000119541 14 14 0
18q21.33 59315202 59315613 NULL ENSG00000176042 14 14 0
18q22.3 69563150 69563445 NULL ENSG00000191507 14 14 0
18q22.3 69891581 69966020 FBXO15 ENSG00000141665 14 14 0
18q22.3 69966766 69977000 TI21L_HUMAN ENSG00000075336 14 14 0
18q22.3 70045721 70046017 NULL ENSG00000192188 14 14 0
18q22.3 70071512 70110155 CYB5 ENSG00000166347 14 14 0
18q21.1 48121156 49311021 DCC ENSG00000187323 13 12 1
18q21.1 46810611 46860139 SMAD4 ENSG00000141646 9 7 2
17p12 11864929 11987872 MAP2K4 ENSG00000065559 8 7 1
17p13.1 7512464 7531642 TP53 ENSG00000141510 7 7 0
1p36.11 24971364 25036918 RUNX3 ENSG00000020633 6 6 0
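
The percentages quoted in the text for individual genes follow from dividing the ‘Total’ column of Table 1 by the 27 cases analysed. The following is a minimal sketch of that arithmetic, using counts copied from Table 1; the helper name `frequency_percent` is illustrative, and rounding to the nearest whole percent is an assumption about how the published figures were derived.

```python
# Number of cases with an altered copy number, copied from the "Total" column of Table 1.
N_CASES = 27

altered_cases = {
    "MYC": 13,
    "NCOA3": 12,
    "KRAS": 12,
    "ERBB2": 11,
    "EGFR": 9,
    "CDKN2A": 17,
    "DCC": 13,
    "SMAD4": 9,
    "MAP2K4": 8,
    "TP53": 7,
    "RUNX3": 6,
}


def frequency_percent(count: int, n_cases: int = N_CASES) -> int:
    """Return the share of cases as a whole-number percentage (assumed rounding)."""
    return round(100 * count / n_cases)


frequencies = {gene: frequency_percent(n) for gene, n in altered_cases.items()}

for gene, pct in frequencies.items():
    print(f"{gene}: {altered_cases[gene]}/{N_CASES} = {pct}%")
```

For these genes the computed values agree with the percentages given in the Results (for example, 13/27 for MYC and 17/27 for CDKN2A).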

Verification of gene copy numbers by quantitative real-time PCR (q-PCR)

To validate gene copy numbers identified by SNP arrays, we performed q-PCR for a subset of candidate genes (FN1, SCAP2, RAB2 and CDKN2A) using genomic DNA from 19 microdissected PDACs (Supplementary Table 3). Owing to the very limited amounts of microdissected tumour DNA, eight of the 27 samples were not available for q-PCR analysis. We chose the HAND1 gene (5q33.2) as the reference gene in this assay because no DCN changes were detected at its locus in the SNP array analysis of these 19 cases. In general, the inferred DCNs were concordant between SNP arrays and q-PCR (Figure 3). Although the absolute DCN values differed between the two analyses in some samples, a strong correlation was observed between the two data sets, with a Spearman's correlation coefficient of r=0.72. These results demonstrate the overall validity of the DCN status determined by genotyping-based microarrays.
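
For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks rather than raw values, which suits a comparison where absolute DCN values differ between platforms but their ordering is expected to agree. The sketch below is illustrative only: the function names are ours, and the sample values are hypothetical because the per-sample data (Supplementary Table 3) are not reproduced here.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j carry ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical copy numbers for one gene in six samples (NOT the study data).
snp_array_dcn = [2, 3, 3, 1, 4, 2]
qpcr_dcn = [2.1, 2.8, 3.4, 1.2, 3.9, 1.8]
print(round(spearman_r(snp_array_dcn, qpcr_dcn), 2))
```

Because SNP-array DCN values are integers, ties are common, which is why the ranking helper assigns average ranks.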

Figure 3

Comparison of DCN alterations identified by SNP arrays (blue columns) and q-PCR (violet columns) analyses. A significant correlation (r=0.72) was observed between the two data sets. All the data are shown in Supplementary Table 3.

Genome-wide detection of LOH and uniparental disomy (UPD) regions

A total of 579 LOH regions (2–70 regions per case, 0.78–174 Mb in size) were detected in 27 PDAC tissues (Figure 1). The frequent LOH regions were observed in various chromosome arms; 4q (63% of cases), 18q (63%), 9p (56%), 6p (56%), 6q (56%), 8p (56%), 2q (52%), 1p (48%), 5q (48%), 7q (48%) and 3p (44%) (Supplementary Figure 1). These results were consistent with the previous reports (Iacobuzio-Donahue et al., 2004; Calhoun et al., 2006). Combining these with the DCN data, we identified a total of 223 UPD regions (1–23 region(s) per case, 1–64 Mb in size) in 27 PDAC cases and therefore, 39% of LOH regions were considered to be UPD. Remarkably, common UPD regions (4 out of 27 cases; 15%) were preferentially identified in only three chromosome loci: 4q22.3–q23 (2.8 Mb), 4q31.21–q31.23 (2.0 Mb) and 18q21.1 (1.3 Mb) (Supplementary Table 4).

Screening for genes within the minimal amplicon at 7p15.2 region by reverse transcription–PCR

SNP arrays enabled us to narrow down the minimal overlapping regions of DCN alterations, which are likely to contain critical oncogenes or tumour suppressor genes. In this study, we identified frequent gains on 7p in the 27 PDAC cases (78%). Although 7p copy number gain has been detected as a large region, our array results revealed a minimal region at 7p15.2, with ≥3 copies found in 59–63% of cases (Figure 4A). According to the Ensembl database, this region is approximately 1 Mb in size and includes five known genes (NFE2L3, HNRPA2B1, CBX3, SNX10 and SCAP2). This region did not overlap known CNV regions and was therefore thought to be an ‘acquired’ alteration (Supplementary Figure 1) (Redon et al., 2006). We postulated that some of these genes may be incidentally coamplified along with the target gene(s) owing to ongoing genomic instability. To screen for the gene(s) whose DCN alteration(s) could lead to significant change(s) in transcript level, reverse transcription–PCR (RT–PCR) was performed for the five candidate genes within this amplicon (Figure 4B). The transcripts of four genes (NFE2L3, HNRPA2B1, CBX3 and SNX10) were expressed in normal pancreas tissues and at various levels in PDAC cases. In contrast, SCAP2 was the only gene that was not expressed in normal tissues, and it was upregulated in eight out of 12 PDAC cases (67%). These results raised the possibility that the genetic gain of this region could be driven solely by the SCAP2 gene.

Figure 4

(A) The minimal common region (7p15.2) of 7p gain, defined by DCN analysis of 27 PDAC cases. The green bars below the chromosome define the region of copy number gain in each case analysed in this study. The approximately 1 Mb region that was most frequently gained in our sample set is outlined by black dots. The genes contained in the minimal region at 7p15.2 are listed below. SNP markers spotted on Affymetrix 100 K arrays are shown at the bottom. (B) Transcript levels of the five candidate genes within the minimal region in normal pancreas and PDAC tissues, determined by RT–PCR. Samples were run in the following order: lanes 1–2, normal pancreas; lanes 3–14, PDACs; lane 15, negative control. Among the five genes, only the SCAP2 transcript was undetectable in normal pancreas tissues, whereas it was upregulated in eight out of 12 PDAC tissues (67%). (C) The FISH results in two representative cases. (a) No copy number change of SCAP2 in PC58 and (b) a genetic gain in PC63. Original magnification: × 1000. (D) SCAP2 mRNA expression in normal epithelial cells and PDAC cells, determined by ISH. (a) No signals (score 0) were detected in ductal (arrow) and islet cells (arrowheads) of normal pancreas, whereas very subtle, patchy signals were observed in acinar cells. (b) ISH conducted with a sense SCAP2 riboprobe, used as a negative control. Arrow indicates ductal cells of normal pancreas. (c) Strong signal (score 2) in moderately differentiated PDAC cells (PC53). (d) Higher level of expression (score 1) in metastatic PDAC cells (PC66) compared to normal hepatic cells (asterisk). Arrows indicate two micrometastatic lesions. Original magnification: (a and b), × 400; (c), × 100; (d), × 40.

A strong correlation between DCN and mRNA expression level of SCAP2

To validate the high frequency of SCAP2 gain, interphase FISH analysis was applied to an independent sample set consisting of 92 PDAC cases (see Materials and methods). However, only 36 cases could be used for the analysis because the remaining 56 specimens were not interpretable, probably owing to formalin overfixation of the tissues (Table 2). The DCN of the SCAP2 gene was increased in 19 out of 36 PDAC tissues (53%), a frequency similar to that obtained from SNP arrays (Figure 4C). Next, to evaluate further the frequency of SCAP2 overexpression, ISH was carried out using all PDAC tissues except for one damaged tissue (Figure 4D). In normal pancreas and liver tissues, no SCAP2 expression was detected in ductal, islet and hepatic cells, whereas very subtle, patchy signals were observed in acinar cells. In contrast, the SCAP2 transcript was upregulated (score 1–2) in 62 out of 91 PDAC cases (68%), which was in good accordance with the RT–PCR results (Supplementary Table 5). Moreover, SCAP2 overexpression was observed consistently from early-stage (I–II) to late-stage (III–IVb) tumours, indicating that this gene may be involved in the development of PDAC. Finally, to assess directly whether an increased gene copy number of SCAP2 is associated with overexpression of its transcript, we compared the FISH and ISH results from the 36 PDAC tissues that were used for both analyses (Table 2). The SCAP2 transcript was upregulated in all 19 cases with genetic gain of this gene. In seven samples (No. 39, 47, 50, 57, 61, 64 and 84 in Table 2), the transcript level was increased although no DCN change of SCAP2 was observed. Notably, we found a significant correlation between gene copy number and expression level of this gene (P<0.001, Fisher's exact test).
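
The counts stated above define a 2×2 table: 19 cases with gain, all overexpressing SCAP2, and 17 cases without gain, of which 7 overexpress and 10 do not. As a rough check of the reported significance, the sketch below implements a two-sided Fisher's exact test from the hypergeometric distribution using only the Python standard library. The function name is ours, and the two-sided convention used (summing all tables no more probable than the observed one) is an assumption, since the text does not state which convention was applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Rows: SCAP2 gain / no gain by FISH. Columns: ISH score 1-2 / ISH score 0.
p_value = fisher_exact_two_sided(19, 0, 7, 10)
print(f"P = {p_value:.2e}")
```

Under these assumptions the test gives a P value of the order of 10^-4, consistent with the reported P<0.001.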

Table 2

FISH and ISH results in an independent sample set of 36 PDAC tissues

No.  Sample  Age  Sex  Histology^a  Stage^b  FISH^c  ISH^d
3 PA801/C3 30 F Mod I 1.17 1
7 PA801/D5 49 M Mod I 0.91 0
14 PA801/E6 65 M Mod I 0.98 0
15 PA801/G7 30 M Mod I 1.43 1
18 PA801/F6 40 F Poor I 1.41 2
19 PC56 57 F Mod II 1.07 0
28 PC41 69 F Well III 0.93 0
29 PC57 62 F Well III 1.34 1
30 PC40 58 M Mod III 1.30 2
31 PC44 59 M Mod III 1.52 2
32 PC58 59 M Mod III 0.96 0
36 PC42 51 M Poor III 1.18 1
39 PC46 53 M Mod IVa 1.13 1
40 PC48 57 F Mod IVb 0.94 0
41 PC49 65 F Mod IVb 1.24 1
42 PC51 61 M Mod IVb 1.26 1
43 PC63 51 M Mod IVb 1.37 2
44 PC65 69 M Mod IVb 1.00 0
47 PC53 57 F Mod IVb 1.14 2
48 PC54 61 M Mod IVb 1.50 1
49 PC66 69 M Mod IVb 1.22 1
50 PC52 74 M Poor IVb 0.97 1
51 PC59 67 M Poor IVb 1.04 0
52 PC61 76 M Poor IVb 1.20 2
53 PC67 55 F Poor IVb 1.28 2
55 PC55 74 M Poor IVb 1.50 1
56 PC60 67 M Poor IVb 1.11 0
57 PC62 76 M Poor IVb 1.00 1
58 PC68 55 F Poor IVb 1.41 2
61 PA801/B8 65 M Well NA 1.09 1
63 PA801/A2 73 F Mod NA 0.97 0
64 PA801/A4 53 F Mod NA 0.96 1
74 PA801/C2 57 M Mod NA 1.21 1
81 PA801/C8 68 F Poor NA 1.20 2
84 PA801/F8 50 M Poor NA 0.90 1
87 PA801/G1 62 M Poor NA 1.33 2

Discussion

It is well known that PDAC tissues are especially heterogeneous, with neoplastic cells constituting only a small proportion of the tumour mass. Previous studies have shown that the quality of the data from SNP arrays is highly dependent on tumour purity: up to 20% contamination with non-neoplastic cells should be acceptable for the detection of genomic abnormalities, whereas more than 30% contamination resulted in a significant reduction in the sensitivity of the analysis (Lindblad-Toh et al., 2000; Huang et al., 2004; Zhao et al., 2004). Tissue microdissection enables us to collect purified populations of cancer cells by trimming out dense stromal components (Harada et al., 2002a, 2007; Andersen et al., 2007). In the present study, we applied only microdissected samples to Affymetrix 100 K SNP arrays, which was crucial for identifying, with high sensitivity, genetic changes that reflect the intrinsic characteristics of PDAC cells in vivo. In addition, by integrating the Ensembl public data with our results, we identified copy number changes in individual genes across the entire genome (Supplementary Table 2). The ‘gene’ copy number indices we provide here facilitate the search for novel candidate genes that may be useful as diagnostic and prognostic markers. Copy numbers in a subset of the genes were then validated by q-PCR to confirm the reliability of our DCN data from SNP arrays.
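
One way to see why contamination with non-neoplastic cells reduces sensitivity is a simple linear mixture model, in which the measured copy number is the purity-weighted average of the tumour copy number and the normal diploid value of 2. This model is our illustrative assumption and is not a method described in the study; the function name is hypothetical.

```python
def observed_copy_number(tumour_cn: float, purity: float, normal_cn: float = 2.0) -> float:
    """Average copy number measured in a mixture of tumour and normal cells.

    Assumes a linear mixture: purity is the fraction of tumour cells (0..1).
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    return purity * tumour_cn + (1.0 - purity) * normal_cn


# A single-copy gain (3 copies in tumour cells) measured at decreasing tumour purity.
for purity in (1.0, 0.9, 0.8, 0.7, 0.5):
    value = observed_copy_number(3, purity)
    print(f"purity {purity:.0%}: observed DCN = {value:.2f}")
```

In this model a single-copy gain moves progressively closer to the diploid baseline as purity falls, so it becomes harder to distinguish from noise, which is in line with the rationale for microdissection given above.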

When matched normal DNA is analysed in parallel with tumour DNA, SNP arrays can be a powerful tool for the genome-wide detection of LOH/UPD regions because consanguinity can be completely excluded as a cause of homozygosity (Fitzgibbon et al., 2005; Raghavan et al., 2005; Andersen et al., 2007). However, the use of microdissected tumour samples is thought to be advantageous for more accurate determination of LOH regions when constitutional DNA is not available (Yamamoto et al., 2007). We found that approximately 40% of LOH regions were considered to be UPD in PDAC tissues, whereas a previous study of 26 PDAC-derived cell lines detected UPD in almost 50% of LOH regions (Calhoun et al., 2006). Accordingly, this copy-neutral event is likely to occur frequently in PDAC cells in vivo as well as in vitro. Interestingly, we found that common UPD regions were preferentially observed in only three chromosome loci, indicating that these UPD events may be nonrandom. Our previous study has shown that UPD in acute myeloid leukaemia can result in homozygosity for gene mutations (Fitzgibbon et al., 2005). Therefore, we speculate that common UPD regions in PDAC may harbour genes that are targets for somatic mutations (Supplementary Table 4).

Chromosome arm 7p has long been suspected to include critical oncogenes in PDAC (Griffin et al., 1995; Harada et al., 2002a; Karhu et al., 2006). Our SNP array analysis has revealed the minimal common region (7p15.2) of 7p gain, which was most frequently (approximately 60%) identified in 27 PDAC cases (Figure 4A). Among five known genes included in this region, SCAP2 was the only gene to show the possibility of a causal relationship between gene dosage and mRNA expression level. Therefore, we hypothesized that SCAP2 may potentially be the ‘driver’ gene in a frequently observed gain of this 1 Mb region.

To evaluate this hypothesis, we performed FISH and ISH analyses on identical tissue specimens, which allowed direct comparison between the DNA and mRNA status of SCAP2 in vivo. Our FISH results showed a genetic gain of SCAP2 in 53% of another population of 36 PDAC cases, a frequency similar to that obtained from SNP arrays (63%). These findings demonstrate that an increased DCN of SCAP2 could be a prevalent genetic change in the majority of PDAC cases. More importantly, there was a significant correlation (P<0.001) between genetic gain and overexpression of the SCAP2 gene, suggesting that the transcript level of this gene may be regulated by its DNA copy number. High-level amplification of SCAP2 was not frequent in our sample set, but it is known that even a low-level copy number change can have a significant effect on gene expression in cancer (Chin et al., 2006; Saramaki et al., 2006).

SCAP2 was first identified as a cancer-related gene by Buchholz et al., who showed that this gene is overexpressed in pancreatic intraepithelial neoplasia (PanIN) lesions (Buchholz et al., 2005). The present study has shown that SCAP2 is upregulated in PDAC tissues at a high frequency (67 and 68% in RT–PCR and ISH, respectively). Furthermore, we found that a genetic gain of SCAP2 was frequently observed in PDAC tissues using both CGH and SNP arrays as well as FISH (Harada et al., 2007). Taken together, the dysregulation of SCAP2, which is mostly caused by its increased gene copy number, is likely to be associated with the development of PDAC. SCAP2 was originally described as a substrate for the src family kinases that is able to regulate the phosphorylation of the presynaptic protein α-Synuclein, a process considered to be crucial in the pathogenesis of several neurodegenerative diseases including Parkinson's disease (Takahashi et al., 2003). Although SCAP2 structurally seems to be an adaptor protein with an SH3 domain providing a binding site for the focal adhesion kinase RAFTK, the biological role of this gene is still unknown. However, based on the ability of SCAP2 to interact with RAFTK, it is likely to modulate the spreading and motility of PDAC cells (McLean et al., 2005). Interestingly, a recent study has demonstrated that overexpression of α-Synuclein in osteosarcoma cells results in a prolonged G1 phase of the cell cycle and initiates tumour differentiation (Fujita et al., 2006). Therefore, we speculate that SCAP2 could also control the growth and differentiation of PDAC cells. This hypothesis is supported by the observation that SCAP2 is overexpressed in PanIN lesions as well as in PDACs. Based on these observations, we propose that SCAP2 could be a potential marker gene for early diagnosis and a possible target for therapeutic intervention in PDAC.

Materials and methods

Tumour and reference samples

A total of 27 fresh-frozen PDAC tissue specimens were obtained surgically or at autopsy from Yamaguchi University School of Medicine, Japan (Supplementary Table 1). Tissue microdissection was manually performed to collect tumour cells at more than 90% purity in all cases, and DNA was extracted as described previously (Harada et al., 2002a). Reference DNA was obtained from lymphocytes of a total of 13 healthy volunteers (6 males and 7 females) whose ethnicity is anonymous. These tumour and reference genomic DNAs were used for both SNP array and q-PCR analyses.

Another series of 12 fresh-frozen PDAC and two normal pancreas tissues were obtained from the Human Biomaterials Resource Centre, Department of Histopathology, Charing Cross Hospital, London, UK, and were subjected to RT–PCR. The clinicopathological information was not available for these anonymous samples. In addition, an independent sample set of 92 formalin-fixed, paraffin-embedded PDAC tissue sections (4–5 μm thickness) were prepared for FISH and ISH analyses: 25 samples from Yamaguchi University and 67 samples from a tissue microarray (US Biomax Inc., Rockville, MD, USA; http://www.biomax.us/tissue-arrays/Pancreas/PA801). All the clinical samples were obtained with ethical approval of the host institutions. This study was approved by a Research Ethical Committee (REC) with the REC Project Number 05/80408/6S.

Affymetrix 100 K SNP arrays and data analysis

We used the GeneChip Human Mapping 100 K Set (Affymetrix, Santa Clara, CA, USA), which is composed of two 50 K arrays (Hind240 and Xba240). DNA digestion, labelling and hybridization were performed according to the manufacturer's instructions as described elsewhere (Zhao et al., 2005). The raw images were analysed using the GCOS (Ver1.4) and GTYPE (Ver4.1) software (Affymetrix).

To assess DCN alterations and LOH, we used the CNAG (Ver2.0) software (http://www.genome.umin.jp/) (Nannya et al., 2005). We chose ‘automatic analysis’ mode in which the software performs pair-wise tests using all the references. ‘Genetic gains’ (DCN≥3) and ‘losses’ (DCN≤1) were defined according to working criteria of CNAG. In addition, ‘high-level amplifications’ and ‘homozygous deletions’ were determined to be DCN≥5 and DCN=0, respectively. In order to avoid false-positive changes due to random noise in signal intensity at each SNP, we set a minimum physical length of at least five consecutive SNPs for putative genetic alterations. The LOH data from two 50 K chips were analysed separately and both results were then merged to determine LOH regions. Chromosome X was not analysed to avoid gender-related complications (Andersen et al., 2007). UPD regions were defined as LOH regions without DCN alterations.
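
The calling rules in this paragraph can be summarized as a small decision procedure. The sketch below is not CNAG itself and does not reproduce its algorithm; it only encodes the thresholds stated above (gain DCN≥3, loss DCN≤1, high-level amplification DCN≥5, homozygous deletion DCN=0, at least five consecutive SNPs, UPD as LOH without DCN alteration). The `Segment` structure, its field names and the example coordinates are hypothetical.

```python
from dataclasses import dataclass

MIN_CONSECUTIVE_SNPS = 5  # shorter runs are not accepted as putative alterations


@dataclass
class Segment:
    chrom: str
    start: int    # bp
    end: int      # bp
    dcn: int      # inferred DNA copy number
    n_snps: int   # number of consecutive SNPs supporting the segment
    loh: bool     # True if the segment lies within an LOH region


def classify_dcn(seg: Segment) -> str:
    """Label a segment using the DCN thresholds stated in the text."""
    if seg.dcn == 2 or seg.n_snps < MIN_CONSECUTIVE_SNPS:
        return "no change"
    if seg.dcn >= 5:
        return "high-level amplification"
    if seg.dcn >= 3:
        return "gain"
    if seg.dcn == 0:
        return "homozygous deletion"
    return "loss"  # the only remaining value is dcn == 1


def is_upd(seg: Segment) -> bool:
    """UPD: an LOH region whose copy number is unchanged (DCN = 2)."""
    return seg.loh and seg.dcn == 2


# Hypothetical segments for illustration only (coordinates are not study data).
segments = [
    Segment("7", 26_000_000, 27_000_000, dcn=3, n_snps=12, loh=False),
    Segment("9", 21_950_000, 21_995_000, dcn=0, n_snps=6, loh=False),
    Segment("4", 98_000_000, 100_800_000, dcn=2, n_snps=40, loh=True),
    Segment("8", 134_000_000, 134_050_000, dcn=5, n_snps=3, loh=False),
]

for seg in segments:
    print(seg.chrom, classify_dcn(seg), "UPD" if is_upd(seg) else "")
```

The last example segment is reported as unchanged despite a DCN of 5 because it spans fewer than five SNPs, which mirrors the noise filter described above.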

Gene annotation and data integration

The physical positions of all SNPs (_n_=116 204) on the arrays were mapped according to the Human Genome Sequence (Build 35). We developed our own visualization software to merge all genetic aberrations with the gene annotation from the public Ensembl database, Ver.37 (the most recent version based on Build 35) (http://www.ensembl.org). To take structural variation in the human genome into account, our software integrated public CNV data (http://projects.tcag.ca/variation/) into our analysis (Iafrate et al., 2004; Redon et al., 2006).
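The integration step amounts to intersecting genomic intervals. The sketch below is not the visualization software described here; it only illustrates, under stated assumptions, how an aberrant region could be matched to overlapping genes and flagged when it coincides with a known CNV. The `Region` type, the `annotate` function and all coordinates and names in the example are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # first base, inclusive
    end: int    # last base, inclusive
    name: str


def overlaps(a, b):
    """True if two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def annotate(aberrations, genes, cnvs):
    """List, for each aberration, the overlapping genes and whether
    the aberration overlaps any known CNV."""
    annotated = []
    for aberration in aberrations:
        annotated.append({
            "aberration": aberration.name,
            "genes": [g.name for g in genes if overlaps(aberration, g)],
            "overlaps_known_cnv": any(overlaps(aberration, c) for c in cnvs),
        })
    return annotated


# Made-up coordinates and placeholder names, for illustration only.
aberrations = [Region("7", 100, 500, "gain_1"), Region("9", 50, 80, "loss_1")]
genes = [Region("7", 450, 600, "GENE_A"),
         Region("7", 700, 800, "GENE_B"),
         Region("9", 10, 49, "GENE_C")]
cnvs = [Region("9", 80, 120, "cnv_1")]

result = annotate(aberrations, genes, cnvs)
```

The nested loops are adequate for a sketch; with genome-scale annotation, a sorted or tree-based interval index would be the usual choice.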

DCN alterations determined by q-PCR

Relative gene copy numbers were determined by q-PCR using the ABI PRISM 7500 Sequence Detection System (Applied Biosystems, Cheshire, UK) and a TaqMan Universal PCR Master Mix kit (Applied Biosystems), as described previously, with minor modifications (Zhao et al., 2004, 2005). The PCR conditions, primers and TaqMan MGB probe sequences for five genes (FN1, SCAP2, RAB2, CDKN2A and HAND1) are available in Supplementary Table 6. PCR reactions were performed in triplicate for each primer set and the data were analysed using the Sequence Detection Software Ver1.3 (Applied Biosystems). Each tumour DNA was quantified by comparing the target locus to the reference locus of the HAND1 gene (5q33.2), for which a DCN of 2 was confirmed by SNP arrays in all 19 samples analysed. The standard curve method was used to calculate the target gene copy number in each tumour DNA sample, normalized to the reference gene (HAND1) and calibrated to normal DNA.
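The arithmetic of the standard curve method can be sketched as follows. This is a generic illustration, not the Sequence Detection Software calculation. It assumes a linear standard curve of Ct against log10 of the input quantity for each locus, and it assumes that the calibrator (normal DNA) carries two copies of the target, so that the calibrated ratio is multiplied by 2. All function names are hypothetical.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain the quantity for a measured Ct."""
    return 10 ** ((ct - intercept) / slope)


def mean_ct(triplicate_cts):
    """Average the Ct values of replicate reactions."""
    return sum(triplicate_cts) / len(triplicate_cts)


def copy_number(tumour_target, tumour_reference,
                normal_target, normal_reference, normal_copies=2):
    """Target quantity normalized to the reference locus in the tumour,
    calibrated to the same ratio in normal DNA and scaled to the assumed
    copy number of the normal sample."""
    tumour_ratio = tumour_target / tumour_reference
    normal_ratio = normal_target / normal_reference
    return normal_copies * tumour_ratio / normal_ratio
```

In use, one curve would be fitted per locus from a dilution series; the mean Ct of each triplicate is converted to a quantity with `quantity_from_ct`, and the four quantities (target and HAND1, in tumour and in normal DNA) are passed to `copy_number`.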

Reverse transcription–PCR

RT–PCR was performed using 12 PDAC and two normal pancreas tissues, as described previously (Harada et al., 2007). cDNAs were synthesized from 1 μg of total RNA using an oligo (dT) primer. Reverse transcription was followed by 30 PCR cycles to amplify a cDNA fragment of each of five genes (NFE2L3, HNRPA2B1, CBX3, SNX10 and SCAP2). 18S rRNA was used as an endogenous control. All the primer sets used are shown in Supplementary Table 6. Amplified products were separated on 1% agarose gels and visualized with ethidium bromide.

Two-colour FISH for SCAP2

Two-colour FISH was performed as described previously (Harada et al., 2007). The target probe was prepared from the BAC clone RP11-232C20 (BACPAC Resources, Oakland, CA, USA), which includes the SCAP2 gene, and its DNA was labelled with Cy3-dCTP. The centromeric probe for chromosome 7 (CEP7), labelled with SpectrumGreen, was purchased from Vysis (Downers Grove, IL, USA). The thresholds for gain and loss were defined as ratios (BAC/centromeric probe signals) of ≥1.16 and ≤0.87, respectively, corresponding to ±2 standard deviations (see Harada et al., 2007 for more details).
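The gain and loss thresholds translate into a simple classification rule, sketched below. The thresholds are those stated above. How the ratio is computed from the scored nuclei is an assumption made for illustration (total BAC signals divided by total CEP7 signals), and the function names are hypothetical; the scoring procedure itself is described in Harada et al. (2007).

```python
GAIN_THRESHOLD = 1.16  # BAC/CEP7 ratio at or above this value: gain
LOSS_THRESHOLD = 0.87  # BAC/CEP7 ratio at or below this value: loss


def fish_ratio(bac_signals, cep_signals):
    """Ratio of total BAC (target) signals to total centromeric signals
    over the scored nuclei (assumed definition, see lead-in)."""
    total_cep = sum(cep_signals)
    if total_cep == 0:
        raise ValueError("no centromeric signals were scored")
    return sum(bac_signals) / total_cep


def classify_fish(ratio):
    """Apply the gain and loss thresholds to a BAC/CEP7 ratio."""
    if ratio >= GAIN_THRESHOLD:
        return "gain"
    if ratio <= LOSS_THRESHOLD:
        return "loss"
    return "no change"


# Hypothetical signal counts per nucleus, for illustration only.
ratio = fish_ratio([3, 3, 2, 4], [2, 2, 2, 2])
status = classify_fish(ratio)
```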

ISH for SCAP2

The SCAP2 probe was amplified by PCR from OriGene clone TC117697 (OriGene Technologies Inc., Rockville, MD, USA), which contains the full-length cDNA of SCAP2. The primers used to amplify the SCAP2 product were the same as those used in RT–PCR (Supplementary Table 6). The PCR product was cloned into the pCR4-TOPO vector using the TOPO cloning kit (Invitrogen Ltd, Paisley, UK) to create pCR4-_SCAP2_-ISH. Positive clones were verified by sequence analysis. Riboprobes were synthesized from 1 μg of linearized pCR4-_SCAP2_-ISH plasmid DNA and labelled with digoxigenin using a DIG RNA labelling kit (Roche Diagnostics, Mannheim, Germany). T3 and T7 polymerases were used to synthesize anti-sense and sense probes, respectively. Riboprobes for SCAP2 (30 ng per case) were hybridized to PDAC tissues using the Ventana Discovery System (Ventana Medical Systems, Tucson, AZ, USA). SCAP2 mRNA expression was scored on a 0–2 scale (0=no signal, 1=weak intensity, 2=strong intensity).

Supplementary Material

Figure Legend

Suppl Figure 1

Suppl Table 1

Suppl Table 2

Suppl Table 3

Suppl Table 4

Suppl Table 5

Suppl Table 6

Acknowledgements

This work was supported by Cancer Research UK (C355/A6254). We thank Professor Kiwamu Okita (Department of Gastroenterology and Hepatology, Yamaguchi University School of Medicine) for providing clinical samples of PDAC.

Abbreviations

array CGH, array-based comparative genomic hybridization
DCN, DNA copy number
FISH, fluorescence in situ hybridization
ISH, in situ RNA hybridization
LOH, loss of heterozygosity
PDAC, pancreatic ductal adenocarcinoma
q-PCR, quantitative real-time PCR
RT–PCR, reverse transcription–PCR
SNP, single nucleotide polymorphism
UPD, uniparental disomy

References