Cellular and genetic diversity in the progression of in situ human breast carcinomas to an invasive phenotype

Combined measurement of phenotypic and genetic diversity at single-cell resolution. We previously characterized stem cell–like CD44+ and more-differentiated CD24+ breast cancer cells from multiple tumors and determined that even within the same tumor, the 2 cell populations have distinct molecular and functional properties (10). These discrete characteristics are in part determined by epigenetic programs that might change during tumor progression (10, 12, 13). We also found evidence for genetic divergence between CD44+ and CD24+ breast cancer cells in one short-term primary culture derived from a pleural effusion sample (10).

To further investigate intratumor genetic and phenotypic heterogeneity in relation to stem cell–like and more-differentiated cell characteristics during progression from in situ to invasive breast carcinoma, we performed combined immunofluorescence staining and FISH (iFISH) (14) analyses of 15 invasive breast tumors of different subtypes containing both in situ and invasive components in the same section (Supplemental Table 1; supplemental material available online with this article; doi:10.1172/JCI40724DS1). In iFISH, immunofluorescence staining and FISH are used to define variability for phenotypic traits and copy number alterations, respectively. Six of the tumors were HER2+, 4 luminal A, and 5 basal-like, as defined by immunohistochemical analyses of estrogen and progesterone receptors (ER and PR), HER2, CK5/6, and EGFR (15). Stem cell–like and more-differentiated breast cancer cells were categorized based on positivity for the CD44 and CD24 cell surface markers, respectively.

First, we analyzed the tumors for the expression of CD24 and CD44 by immunohistochemistry to ensure that both cell populations were well represented on the slides to be used for iFISH. We observed high variability for the expression of these 2 markers both among and within tumors. Consistent with our prior studies (16), CD24+ breast cancer cells were infrequently detected in basal-like tumors, whereas the frequency of CD44+ cells was highest in basal-like and lowest in HER2+ cases. Thus, in basal-like tumors, we categorized breast cancer cells as CD44– and CD44+ populations. The frequency of CD24+ and CD44+ tumor cells was also highly variable in different regions of the same tumor (Supplemental Figure 1). To quantitatively assess this variability, we measured topological diversity of the tumor subtypes and histologies by determining the frequency of CD44+ and CD24+ cells in 4 independent quadrants of each tumor. Despite the high variability in the frequencies of these 2 cell types within some tumors (Supplemental Table 2 and Supplemental Figure 2), no significant differences were detected when the ratio of CD24+ to CD44+ cell frequencies was compared across all 15 cases (P = 0.91).

Next, we performed interphase iFISH analyses using 3 BAC probes localized to different chromosomal regions for each tumor. BAC clones were selected for each tumor subtype to correspond to commonly gained regions based on our previous SNP array studies (17) (Supplemental Table 3). Variability for chromosome 8q24 copy number was evaluated in all tumors, since this locus is often altered in all breast tumor subtypes. Basal-like tumors were also analyzed for chromosome 12p13 and 10p13, luminal A tumors for 11q13 and 16p13, and HER2+ tumors for 17q21 (ERBB2) and 1q32. Each pair of probes (BAC and corresponding centromeric probe) was evaluated individually using serial sections in each tumor, and the ratio of BAC to centromeric probe signals was determined and used for further calculations. Visual inspection of the iFISH images demonstrated variable copy numbers in different areas of some tumors (Figures 1 and 2, Supplemental Figure 3). In a luminal A tumor (tumor 10), clear evidence of clonal evolution during the in situ–to–invasive breast carcinoma transition was detected, as a CD24+CD44+ subclone with high 11q13 copy number gain in ductal carcinoma in situ (DCIS) became the dominant clone in the invasive areas (Figure 2). Interestingly, this 11q13 BAC includes the CCND1 gene encoding cyclin D1; thus, amplification of this locus might explain the significantly higher rate of cellular proliferation (defined by the percentage of cells positive for the Ki67 marker) seen in the invasive (30% Ki67+ cells) compared with the in situ (2% Ki67+ cells) areas of this tumor (Supplemental Table 1).

Figure 1

Cellular and genetic diversity in breast cancer defined by iFISH analysis. A representative example (tumor 2) of HER2+ invasive ductal breast carcinoma with adjacent DCIS displaying a high degree of diversity for the expression of HER2, CD44, and CD24 and for copy number gain of ERBB2 and 8q24 based on immunohistochemical staining and iFISH, respectively. CD24 showed membrano-cytoplasmic expression in invasive tumor cells but apical membranous expression in DCIS. In iFISH, blue corresponds to CD24 or CD44 immunofluorescence; ERBB2 and 8q24-specific probes are red; and centromeric probes (chromosomes 17 and 8 for ERBB2 and 8q24, respectively) are green. Faint green and yellow are background autofluorescence. Scale bars: 10 μm; original magnification, ×400 (immunohistochemistry) and ×600 (iFISH).

Figure 2

Clonal evolution during in situ to invasive breast carcinoma progression detected by iFISH. iFISH analyses using 11q13/CCND1 (red) and chromosome 11 centromeric probe (green) in a luminal A subtype breast cancer (tumor 10). In the invasive areas, both CD44+ and CD24+ tumor cells (blue) display high-level amplification, whereas in adjacent DCIS, this is restricted to a subset of CD24+CD44+ tumor cells (dotted line), with the majority of the tumor demonstrating normal copy number for this locus. iFISH analysis of adjacent sections using 8q24 (red) and chromosome 8 centromeric probe (green) demonstrates normal (2n) copy numbers for 8q24 in both DCIS and invasive areas. Faint green and yellow are background autofluorescence. Yellow spots and lines are autofluorescent collagen fibers. Scale bars: 10 μm; original magnification, ×600.

Tumors are composed of populations of cancer cells with distinct properties. To obtain a quantitative measure of genetic heterogeneity in distinct tumor cell populations, we recorded copy number data for both BAC and centromeric probes in 100 individual CD24+ or CD44+ tumor cells in both invasive and in situ areas (a total of 400 individual cancer cells/tumor were evaluated) (Supplemental Table 4). Because we used 4-μm sections for iFISH (cutting nuclei approximately in half), it is possible that some chromosomal regions may not be well represented in the section and thus would not be detected by FISH. However, this sampling bias is expected to be the same in all cell populations analyzed. To ensure that the observed tumor cell diversity was not due to technical variability stemming from FISH procedures, we also determined BAC and centromeric probe counts in 100 normal stromal cells adjacent to tumors on the same slide for each probe as control (Supplemental Table 5 and Supplemental Figure 4).
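The per-cell quantity used throughout the analysis is the ratio of BAC to centromeric probe signals. A minimal sketch of that calculation follows; the function name, the input layout, and the rule of skipping cells with no centromeric signal are assumptions made for illustration and are not taken from the study.

```python
def copy_number_ratios(cells):
    """Return BAC/centromere signal ratios for a list of cells.

    Each cell is a (bac_count, centromere_count) pair.
    Cells with zero centromeric signals are skipped because the ratio
    is undefined (an illustrative assumption, not the authors' rule).
    """
    ratios = []
    for bac_count, centromere_count in cells:
        if centromere_count == 0:
            continue
        ratios.append(bac_count / centromere_count)
    return ratios


# Hypothetical counts for five cells: (BAC signals, centromeric signals)
example_cells = [(2, 2), (4, 2), (6, 2), (3, 0), (2, 1)]
example_ratios = copy_number_ratios(example_cells)
```

In this toy input the fourth cell is dropped, which mirrors the point that signals lost to sectioning cannot contribute to the measured ratio.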

Copy number ratios of the 8q24 BAC and chromosome 8 centromeric probes, depicted using box plots, demonstrated substantial variability across cell and tumor types (Figure 3A). Importantly, CD24+ and CD44+ tumor cells displayed discordant copy number ratios both within the same histology and between the in situ and invasive areas of some tumors. To further explore the distribution of copy number ratios within each cell type and tumor, we used histograms and kernel density estimates (18) (Figure 3B). The latter is a nonparametric way of estimating the probability density function of a random variable; it allowed us to estimate the density function of the population from data obtained for 100 cells of each cell type, with minimal assumptions. Visualization of cancer cell population diversity using these approaches further highlighted the pronounced genetic heterogeneity within and between populations of CD24+ and CD44+ cells in the same histology as well as between in situ and invasive components. Similar observations were made using all other BAC probes (Supplemental Figures 5–10). Thus, despite the uniform expression of CD24 or CD44 in a subset of tumor cells, these 2 cell populations are genetically highly heterogeneous and, as a consequence, are likely to display variability for biological and functional traits, including tumor-initiating potential and response to therapeutic agents.
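To make the idea of a kernel density estimate concrete, the following sketch builds one from a list of copy number ratios. The Gaussian kernel, the normal-reference bandwidth rule, and the sample values are assumptions chosen for illustration; the text does not state which kernel or bandwidth was used.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian kernel density estimate.

    If no bandwidth is given, the normal-reference rule of thumb
    (1.06 * sd * n^(-1/5)) is used; this choice is an assumption.
    """
    n = len(samples)
    if bandwidth is None:
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump centred on each observed ratio.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical copy number ratios for a handful of cells
ratios = [1.0, 1.0, 1.5, 2.0, 2.0, 3.0]
density = gaussian_kde(ratios)
```

The returned `density` function can be evaluated on a grid of ratio values and drawn over a histogram of the same data, which is the kind of overlay described for Figure 3B.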

Figure 3

Diversity for 8q24 copy number gain in breast tumors. In basal-like tumors, CD24+ corresponds to CD44– cells, because no CD24 expression is seen in these cases. (A) Box plots depicting the distribution of 8q24 copy number gain defined as the ratio of signal observed for the 8q24-specific and centromeric probes in 100 individual cells in each of the 4 indicated tumor cell subpopulations. The filled circle represents the mean; boxes show the 25th to 75th percentiles; the horizontal lines inside the boxes represent the median; whiskers extend to the 10th and 90th percentiles; and outlying black circles are individual data points outside the 10th and 90th percentiles. Differences are seen between cell populations and also progression stages both for median copy number gain and for range of distribution. (B) Histograms and kernel density estimates depicting the distribution of cells with the indicated copy number ratio. Differences are seen between DCIS and invasive areas of the same tumor (e.g., tumor 4 [T4]) and also between CD24+ and CD44+ cells within the same compartment (e.g., T13A).

Numerical indices of tumor cell diversity. To express the observed genetic diversity as a numerical value that can potentially be a clinically useful biomarker predicting the risk of progression or response to treatment, we applied diversity measures from the ecology and evolution sciences (19) to our copy number data. These diversity measures estimate the number and distribution of species in a certain geographical area or environmental niche. In our context, a species is a cancer cell population defined by a unique value of the iFISH measurement specifying the ratio of gene-specific BAC and centromeric probes. Hence, a region of a tumor containing cancer cells with 3 different copy number ratios is interpreted to contain 3 distinct “species.” We used the Shannon index as a measure of diversity

$$H = -\sum_{i} p_i \ln p_i \qquad \text{(Equation 1)}$$

where $p_i$ is the frequency of species $i$ in the tumor sample. This index is borrowed from information theory, where it specifies the information content of a message, and can be used to summarize the diversity of a population by a single number. An alternative measure of diversity, Simpson’s index, was also used. For discussion of the relative advantages of Shannon index versus Simpson’s index, see Methods.
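As an illustration of how these indices turn a list of per-cell copy number ratios into a single number, here is a minimal sketch. Each distinct ratio is treated as one species, as described above. The natural logarithm and the 1 − Σp² form of Simpson's index are assumptions, since several conventions exist and the exact ones are given only in Methods.

```python
import math
from collections import Counter


def shannon_index(values):
    """Shannon index H = -sum(p_i * ln p_i) over distinct values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def simpson_index(values):
    """Simpson's index in the 1 - sum(p_i^2) form (an assumed convention)."""
    n = len(values)
    counts = Counter(values)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical ratios for four cells: three species with
# frequencies 0.5, 0.25, and 0.25
ratios = [1.0, 1.0, 2.0, 3.0]
h = shannon_index(ratios)
d = simpson_index(ratios)
```

A population in which every cell has the same ratio gives an index of zero under both definitions, and the values rise as more distinct ratios appear and as their frequencies become more even.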

We calculated the Shannon diversity index for 8q24 copy number gain in both CD24+ and CD44+ cell populations in the invasive and in situ areas of each tumor. A scatter plot of these Shannon indices suggested 2 distinct diversity groups (Figure 4A). We exploited this pattern by identifying 2 clusters for each probe and then testing whether these clusters are distinct by using the parametric bootstrap method (20) (Table 1). In the HER2+ and luminal A tumor subtypes, the diversity of 8q24 copy number as measured by the Shannon index fell into 2 significantly distinct groups; cell populations in one group had a lower diversity index than those in the other group. Whereas the 2 groups were equally large for luminal A tumors, the group with lower diversity contained fewer samples in HER2+ tumors. Interestingly, the Shannon index of the low-diversity group of luminal A tumors was essentially the same as that of normal cells (Figure 4A and Supplemental Figure 4). Basal-like tumors formed a single group with diversity measures similar to the group with high diversity in the other 2 subtypes (Figure 4A). To ensure that the observed tumor cell diversity was not due to technical issues associated with iFISH, we also defined the diversity indices of normal stromal cells adjacent to tumors on the same slide and found low and non-variable diversity for each of the chromosomal regions analyzed (Supplemental Figure 4). These data suggest that the Shannon index might be used as a clinically useful biomarker that further refines breast tumor subtypes according to their diversity.

Figure 4

Diversity for 8q24 copy number gain in breast tumors defined by Shannon index and Whittaker plots. In basal-like tumors, CD24+ corresponds to CD44– cells, because no CD24 expression is seen in these cases. (A) The Shannon index, H, indicating diversity within tumor cell subpopulations and tumors. For each tumor, 100 different cells for each of the 4 different types (IDC CD24+, IDC CD44+, DCIS CD24+, and DCIS CD44+) were analyzed, and their Shannon indices are depicted in dark blue, dark red, light blue, and light red, respectively. Higher score indicates higher diversity. Basal-like tumors are all uniformly highly diverse for 8q24, whereas a subset of HER2+ and luminal A tumors show a lower degree of diversity. (B) Whittaker plots (rank-abundance plots) depicting the abundance of unique cancer cells.

Table 1

Cluster analysis of diversity indices

In most tumors, the 4 distinct cell populations (i.e., CD24+ and CD44+ cells in DCIS and invasive areas) had similar diversity scores for each of the 3 BAC probes analyzed; however, in some cases, CD24+ and CD44+ cells displayed divergent scores in the same histology or between DCIS and invasive regions (Figure 4A and Supplemental Figures 5–10). Interestingly, in all but 1 tumor with deviating scores, the invasive areas (both CD24+ and CD44+ cells) showed higher diversity, potentially due to the larger number of tumor cells in invasive compared with in situ tumors and their exposure to more varied environmental conditions (e.g., interaction with various stromal cells, which cannot occur in DCIS because the stroma and tumor epithelial cells are physically separated from each other by the myoepithelial cell layer and the basement membrane).

To further define the abundance of unique cancer cells in the tumor samples, we used rank-abundance plots (also called Whittaker plots) (21) as graphical measures of diversity (Figure 4B and Supplemental Figures 5–10). In these graphs, species are plotted in sequence from the most to least abundant on the horizontal axis, and their frequencies are indicated on the vertical axis; hence, a steep slope corresponds to a population dominated by a few abundant species. These data again indicated that luminal A tumors are composed of a few dominant cancer cell populations, whereas basal-like and HER2+ cases more frequently contain a wider array of less abundant tumor cell types.
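The data behind a rank-abundance plot can be sketched as follows: distinct copy number ratios are counted, converted to frequencies, and sorted from most to least abundant. The function name and the sample values are hypothetical.

```python
from collections import Counter


def rank_abundance(values):
    """Return (rank, frequency) pairs, most abundant species first."""
    n = len(values)
    counts = Counter(values)
    frequencies = sorted((c / n for c in counts.values()), reverse=True)
    return list(enumerate(frequencies, start=1))


# Hypothetical ratios for ten cells dominated by a single species
ratios = [1.0] * 6 + [2.0] * 3 + [1.5]
curve = rank_abundance(ratios)
```

Plotting frequency against rank for such a list gives the steep curve expected of a population dominated by a few abundant species, whereas many species at similar frequencies give a shallow curve.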

Associations between diversity indices and histopathologic characteristics of tumors. To further investigate differences in diversity among breast tumor subtypes, the distribution of the Shannon index for 8q24 copy number gain was explored by a heatmap (Figure 5A) and pairwise scatter plots (Supplemental Figure 11). The heatmap suggested that luminal A and basal-like tumors were mostly characterized by higher diversity in the invasive and in situ components, respectively, whereas the HER2+ subtype was not uniquely characterized by either category. CD44+ cells were more diverse within invasive relative to in situ areas of the tumors, whereas CD24+ cells showed higher diversity in in situ compared with invasive components. These observations were confirmed by the dendrogram depicting the hierarchical clustering of the tumors. The dendrogram displaying the clustering of the diversity of distinct cell populations and invasive and DCIS areas across tumors revealed a strong cluster of the areas (i.e., invasive and DCIS) and weaker subclusters of the cell populations (i.e., CD24+ and CD44+ cells). Pairwise scatter plots did not reveal any further associations that were not obvious from the heatmap and dendrograms (Supplemental Figure 11).

Figure 5

Diversity for different chromosomal probes in breast tumor subtypes and their association with histopathologic features. (A) Hierarchical clustering of tumor samples based on the Shannon index for the 8q24 probe. Heatmap and dendrograms displaying relatedness of cell types and tumor samples based on their Shannon indices. Red and yellow correspond to high and low diversity, respectively, whereas white represents median levels. Tumor names are colored according to subtype: red, basal-like; pink, HER2+; and blue, luminal A. The color key indicates the correlation between diversity and colors. (B) Differences in diversity for different chromosomal regions in the same tumor. Histograms of copy number ratios in 4 distinct cell types for 3 different chromosomal probes are depicted in 2 individual tumors.

Next, we used a hierarchical model with the copy number ratio as the outcome; tumor subtype, histology, and cell type as the covariates; and the tumor as a random effect (see Methods and Supplemental Table 6). This model allowed us to determine the joint effects of the covariates on the distribution of the copy number ratio. We identified significant differences between CD44– and CD44+ cells in the DCIS portion of basal-like tumors (P = 0.001) for 8q24 copy number gain and between CD24+ and CD44+ cells in the invasive ductal carcinoma (IDC) portion of HER2+ tumors (P = 0.002) for 1q32 copy number gain. These differences might indicate the divergent evolution of the 2 distinct cell populations at different stages of tumor progression.

We found that tumor cell populations differed not only in copy number gain and diversity with regard to a single probe across CD24+ and CD44+ cell populations in invasive and in situ areas and tumor subtypes (Figures 3 and 4) but also with regard to different probes in a single tumor (Figure 5B). The relative presence and diversity of copy number gains for different chromosomes may be used for mapping the evolutionary history of tumors. Interestingly, in a HER2+ tumor, the abundance of tumor cells with 8q24 copy number gain was lower in invasive compared with DCIS areas, whereas the opposite was observed for 17q21/ERBB2 gain (Figure 5B).

To determine whether the diversity of each tumor with regard to 8q24 copy number gain correlates with histopathologic features of the tumors (e.g., tumor grade, nuclear pleomorphism, extent of intra- and peritumoral DCIS, necrosis, and proliferation rate), we analyzed associations using a rank-sum test and found that several variables were highly associated (Supplemental Table 7). The most significant correlations (Table 2) were detected between DCIS CD24+ cell diversity as measured by the Shannon index and the extent of intra- and peritumoral DCIS, DCIS necrosis, and extensive intraductal component (EIC). These findings imply that larger tumor cell population size and hypoxia might increase intratumor genetic diversity. Once the P values were corrected for multiple testing, however, none of the associations remained statistically significant, owing to the small sample size. Hence, the associations can only be interpreted as suggestive evidence, and confirmation in a larger sample set is needed.
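The effect of correcting for multiple testing can be illustrated with two common adjustments, Bonferroni and Benjamini-Hochberg. Which correction was applied is not stated in this section, so both methods and the P values below are assumptions used only to show how nominally significant results can lose significance.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each P value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical unadjusted P values from four association tests
raw_p = [0.01, 0.04, 0.03, 0.20]
bonf_p = bonferroni(raw_p)
bh_p = benjamini_hochberg(raw_p)
```

With only a few samples per group, the unadjusted P values of a rank-sum test cannot become very small, so adjustments of this kind readily push them above conventional thresholds.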

Table 2

Associations between diversity and histopathologic variables