Aligned Collagen Is a Prognostic Signature for Survival in Human Breast Carcinoma

Abstract

Evidence for the potent influence of stromal organization and function on invasion and metastasis of breast tumors is ever growing. We have performed a rigorous examination of the relationship of a tumor-associated collagen signature-3 (TACS-3) to the long-term survival rate of human patients. TACS-3 is characterized by bundles of straightened and aligned collagen fibers that are oriented perpendicular to the tumor boundary. An evaluation of TACS-3 was performed in biopsied tissue sections from 196 patients by second harmonic generation imaging of the backscattered signal generated by collagen. Univariate analysis of a Cox proportional hazard model demonstrated that the presence of TACS-3 was associated with poor disease-specific and disease-free survival, resulting in hazard ratios between 3.0 and 3.9. Furthermore, TACS-3 was confirmed to be an independent prognostic indicator regardless of tumor grade and size, estrogen or progesterone receptor status, human epidermal growth factor receptor-2 status, node status, and tumor subtype. Interestingly, TACS-3 was positively correlated to expression of stromal syndecan-1, a receptor for several extracellular matrix proteins including collagens. Because of the strong statistical evidence for poor survival in patients with TACS, and because the assessment can be performed in routine histopathological samples imaged via second harmonic generation or using picrosirius, we propose that quantifying collagen alignment is a viable, novel paradigm for the prediction of human breast cancer survival.


See related Commentary on page 966

Despite many advances in the diagnosis and staging of human breast carcinomas, there continue to be patients for whom outcome is not easily predicted with current biomarkers. Thus, there has been a quest to discover new biomarkers, particularly those that are readily analyzed because these can potentially enhance pathological assessment. Recently, there have been several exciting, new methodologies developed and applied in the field of light microscopy that have the potential to make significant contributions along those lines.1,2 For example, our group and others have observed that standard, unstained histopathology slides processed from mouse mammary tumors contain preserved endogenous fluorescent molecules3,4 that could prospectively serve as biomarkers for tumor progression. However, to date these approaches have not been implemented in clinical studies of human patients.

Increased mammographic density is one of the greatest risk factors for the development of breast cancer,5 representing a two- to sixfold increase in tumor susceptibility among women with dense breasts. The increased density is due largely to an elevated collagen concentration,6 and is commonly identified in the mammogram as a general increase in X-ray absorbance throughout the entire breast. This precondition is distinct from the events subsequent to breast tumor formation, where there is an associated stromal response termed the desmoplastic reaction that is characterized by amplified collagen matrix deposition and stromal cell recruitment and activation, thereby promoting tumor progression.7,8 Because both increased cell numbers and increased collagen are sources of contrast within the mammogram, they are difficult to distinguish, and traditional, clinically proven methods such as radiography and ultrasound imaging do not have the resolution to distinguish the tumor from collagen at the cellular level. This is particularly significant when considering that invasion of cells away from a tumor occurs through the collagen-rich stroma.9–12 Indeed, there are several events that occur at the tumor-stroma boundary that are crucial for tumor progression, including the breakdown of the basement membrane surrounding the mammary epithelium, the deposition and reorganization of the stromal matrix, the recruitment of additional stromal cells, and the invasion of tumor cells into the stroma.13–15 Therefore, techniques that identify and characterize features of the epithelial-stromal interaction at the single cell level are of great diagnostic potential.

In mice, a procession of changes with respect to collagen has been observed and classified as markers of mammary carcinoma progression, termed tumor-associated collagen signatures (TACS).12,16 In mouse models that recapitulate the histological progression of human breast cancer,17,18 mammary tumors exhibit a localized increase in the deposition of collagen near the tumor lesion (termed TACS-1) that occurs very early in tumor formation. As tumors increase in size, a straightening of collagen fibers that are aligned parallel to the tumor boundary is noted (TACS-2).12 Remodeling of the stroma progresses to the final stage, which is the reorientation of collagen such that multiple collagen fibers are bundled and aligned perpendicular to the tumor boundary (termed TACS-3).12 The result of collagen fiber alignment is significant, as our group has shown that regions containing TACS-3 correspond to sites of focal invasion into the stroma,12,19 and we and others have observed that tumor cells preferentially invade along straightened, aligned collagen fibers, which can promote intravasation.12,20–22

Because of recent technological advances, discriminate detection of collagen can now be achieved through the use of second harmonic generation (SHG) imaging, where two photons of incident light interact with the noncentrosymmetric structure of collagen fibers such that the resulting photons are half the wavelength of the incident photons.23 This nonlinear coherent process is nonfluorescent and will specifically image collagen, where the backscattered SHG signal is easily separated from any fluorescence that may be occurring through the use of narrow bandpass filters centered at one-half the laser wavelength. Thus, no labeling or staining is required, and the resulting image can be used in lieu of stains for collagen, such as Masson's trichrome or picrosirius red stain.12 This emerging technology generates contrast in an image based solely on the presence of collagen, and is influenced by the properties of collagen itself such as the degree of cross linking, fiber thickness, and alignment of overlapping fibers (ordered versus disordered).24–26 Furthermore, SHG imaging has been used to characterize diagnostic features of uterine, skin, bone, and ovarian cancers, and in studies of cardiomyopathies, using human tissue and/or animal models.27–30 Through the use of SHG, we set out to determine whether TACS-3 is present in clinical histopathology samples from patients diagnosed with various stages of breast cancer. We find that TACS-3 is robustly imaged in human tissue sections and is correlated with patient survival, suggesting that TACS is a potential biomarker for predicting tumor progression in patients.

Materials and Methods

Human Breast Carcinoma Tissue Microarray

A tissue microarray, which contains tumor tissue cores from 207 breast cancer patients, was used for the analysis. After approval from the institutional review board, archival breast carcinoma samples from 353 patients were screened for suitability for this study.31 Surgery on all patients had been performed by one surgeon between 1981 and 1995. There remained 207 cases in the study after tumors smaller than 5 mm and low-quality paraffin blocks were excluded. Some samples were not of suitable quality to analyze due to sectioning artifacts or damage, and thus our data are from 196 samples within the tissue microarray. The histological grade was determined according to the Elston-Ellis modification of the Bloom-Richardson method.32 Paraffin-embedded samples were sectioned at 4-μm thickness. The tissue array was assembled with a manual tissue arrayer (MTA-1; Beecher Instruments, Sun Prairie, WI) equipped with a 1.0-mm punch needle. This tissue microarray had been previously characterized with respect to patient age, tumor size, histological subtype, tumor grade, lymph node status, estrogen receptor (ER), progesterone receptor expression, human epidermal growth factor receptor-2 (HER-2) overexpression, and Ki-67 proliferation index33,34 (see Supplemental Table S1 at http://ajp.amjpathol.org). The median follow-up time was 6.2 years, with a range between 1 month and 18.6 years.

Multiphoton SHG Microscopy

For all imaging, a custom multiphoton workstation at the University of Wisconsin Laboratory for Optical and Computational Instrumentation (LOCI) was used.2,35 The tissue array slides were imaged using a TE300 inverted microscope (Nikon, Tokyo, Japan) equipped with a CFI Plan Fluor ×10 (N.A. = 0.3; Nikon) objective lens by using a mode-locked Ti:Sapphire laser (Tsunami; Spectra Physics, Mountain View, CA) pumped by a 5-W, solid-state laser (Millenia; Spectra Physics) to generate pulse widths of approximately 120 femtoseconds at a repetition rate of 76 MHz. With the excitation wavelength tuned to 890 nm, a 445 nm ± 0.5 nm narrow bandpass emission filter (Thin Film Imaging, Greenfield, MA) was used to detect the SHG signal of collagen in the backscattered mode using an H7422P GaAsP photon counting PMT (Hamamatsu Photonics, Hamamatsu City, Japan). SHG signal is generated when two photons of incident light interact with the noncentrosymmetric structure of collagen fibers, such that the resulting photons are half the wavelength of the incident photons. Autofluorescent images were acquired without filtering. Images of 1024 × 1024 pixels were acquired using WiscScan (LOCI, University of Wisconsin, Madison, WI) (http://www.loci.wisc.edu/software/wiscscan/) under identical conditions and laser power. Although we imaged exclusively unstained slides here, a similar SHG image could be generated from H&E-stained slides, as the fluorescence from eosin cannot pass through our narrow bandpass emission filter based on previous data.

SHG Intensity Analysis

SHG images were opened in ImageJ (version 1.39o, National Institutes of Health, Bethesda, MD), where the 8-bit image data (0 to 255 gray levels) were thresholded between values of 10 and 255 to separate SHG signal from shot noise and the dark current of the detector during subsequent analysis. First, a circular region of interest (ROI) 100 pixels in radius (pixel size, 1.29 μm/pixel) was drawn and positioned in the image. The coordinates of the center of the circle and its dimensions were saved as an ROI file in ImageJ. This positioning was repeated so that 14 nonoverlapping ROIs were generated and saved. This array of ROIs was used to assess the entire SHG image on an ROI basis, consistently and objectively, for each sample in the array. Because each sample could be of a slightly different size, the ROIs did not always extend out to the edge of the tissue slice, and there were also small gaps between the ROIs. Using the multi-measure plugin, the 14 ROI files were opened and visualized as an overlay on the image; the plugin then takes separate standard measurements from each of the thresholded ROIs, and the data from each numbered ROI are displayed in the results dialog box. These data were analyzed using SigmaPlot (SPSS Science Inc., Chicago, IL) for two quantifiers of SHG signal: integrated density (the summation of the pixel intensity values for all of the pixels in the ROI, which is essentially a measure of the total brightness), and the area fraction (the percentage of the ROI area that is above the SHG threshold and is essentially a measure of SHG prevalence).
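The two quantifiers can be illustrated with a minimal pure-Python sketch. It is not the ImageJ multi-measure plugin; the function names `circular_roi_pixels` and `roi_measurements` are hypothetical, and it assumes that pixels below the threshold are excluded from the integrated density, as they would be for a measurement limited to the thresholded region.

```python
def circular_roi_pixels(center_row, center_col, radius):
    """Return the list of (row, col) pixel coordinates inside a circular ROI."""
    pixels = []
    for r in range(center_row - radius, center_row + radius + 1):
        for c in range(center_col - radius, center_col + radius + 1):
            if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2:
                pixels.append((r, c))
    return pixels


def roi_measurements(image, roi_pixels, lower=10, upper=255):
    """Integrated density and area fraction (%) for one thresholded ROI.

    image: 2-D list of 8-bit gray levels (0 to 255).
    Assumption: pixels outside [lower, upper] count as background and
    contribute neither to the sum nor to the above-threshold area.
    """
    # Keep only ROI pixels that fall inside the image bounds.
    inside = [(r, c) for r, c in roi_pixels
              if 0 <= r < len(image) and 0 <= c < len(image[0])]
    if not inside:
        raise ValueError("ROI does not overlap the image")
    signal = [image[r][c] for r, c in inside if lower <= image[r][c] <= upper]
    integrated_density = sum(signal)
    area_fraction = 100.0 * len(signal) / len(inside)
    return integrated_density, area_fraction


# Toy 5 x 5 image with a radius-1 ROI (5 pixels) centered at (2, 2).
toy = [[0] * 5 for _ in range(5)]
toy[2][2] = 200   # bright collagen pixel
toy[1][2] = 50    # dim collagen pixel
toy[2][1] = 5     # below threshold, treated as noise
roi = circular_roi_pixels(2, 2, 1)
density, fraction = roi_measurements(toy, roi)
print(density, fraction)
```

In the study itself the ROI radius was 100 pixels, which at 1.29 μm/pixel corresponds to a radius of 129 μm.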

TACS Analysis

The SHG image was overlaid with the array of 14 ROIs and shared amongst a panel of three independent reviewers who were blind to the patient outcome. Each panelist then rated each of the 14 ROIs for TACS-3 presence for a total of 42 ratings in each image. The criteria for a “yes” or “no” TACS-3 rating were taken from established definitions of collagen organization12 where bundles of straightened, aligned collagen fibers oriented perpendicular to the tumor boundary were defined as being TACS-3 positive. Therefore, the primary aspect that was assessed by the panelists was the orientation of the collagen and not the intensity of SHG signal. To confirm that the collagen fibers visualized in the SHG image were truly abutting tumor epithelium, the corresponding digital camera image of the H&E-stained serial section of the array was simultaneously viewed. Thus, areas containing fibroblasts, lymph nodes, muscle, fat, or blank space could be excluded from being false positives. Scores of TACS-3 prevalence were constructed as described in the Results section and figure legends. Agreement between panelists was determined by statistical analysis of the kappa scores (Table 1) and indicated moderate agreement between panelists and no difference based on tumor subtype.

Table 1.

Kappa Score Analysis by Tumor Subtype

Diagnosis Kappa 95% lower CI 95% upper CI
Ductal 0.50510 0.44212 0.56808
Lobular 0.49367 0.32302 0.66432
Tubular 0.50187 0.33599 0.66775

Statistical Analysis

Patient characteristics were summarized using standard descriptive statistics. Continuous variables were summarized in terms of medians and ranges. Frequencies and percentages were used to summarize categorical variables. The Fleiss' kappa statistic was computed to evaluate the inter-rater reliability of agreement between the three panelists. The 95% confidence intervals of the kappa indices were computed using the normal approximation method. A kappa statistic below 0.4 is considered fair to poor, and values between 0.4 and 0.6 are considered moderate.
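As an illustration of the inter-rater statistic, the following is a minimal sketch of Fleiss' kappa for a table of rating counts. The function name `fleiss_kappa` and the toy data are hypothetical and are not taken from the study; the confidence interval calculation is omitted.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] = number of raters who put subject i in category j.
    Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement: mean over subjects of pairwise rater agreement.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects

    # Chance agreement from the overall category proportions.
    total = n_subjects * n_raters
    p_e = sum(
        (sum(row[j] for row in counts) / total) ** 2
        for j in range(n_categories)
    )
    return (p_bar - p_e) / (1 - p_e)


# Toy data: four ROIs, three panelists, columns are ["no" votes, "yes" votes].
toy_counts = [[2, 1], [1, 2], [3, 0], [0, 3]]
kappa = fleiss_kappa(toy_counts)
print(round(kappa, 3))
```

For the toy table the observed agreement is 2/3 and the chance agreement is 1/2, so kappa is 1/3, which would fall in the fair-to-poor range described above.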

Disease-specific survival (DSS) was defined as the time from diagnosis to either death caused by breast cancer or last follow-up evaluation. Data from patients who died of causes other than breast cancer were censored in the survival analyses. Disease-free survival (DFS) was defined as the time from date of diagnosis until date of first recurrence. All other events were censored. The Kaplan–Meier method was used to analyze DSS and DFS. Comparisons of DSS and DFS between TACS-3–negative and TACS-3–positive patients were performed using the log-rank test. The associations between TACS-3 scores and various clinicopathologic markers were assessed via Spearman's rank correlation coefficients or polyserial correlation coefficients.
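The Kaplan–Meier estimate with censoring can be sketched in a few lines. This is an illustrative pure-Python version with hypothetical names and toy data, not the SAS or R routines used for the analysis.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient.
    events: 1 if the event (e.g. breast cancer death) was observed,
            0 if the patient was censored at that time.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= leaving
    return curve


# Toy data: follow-up in years; patients 2 and 4 are censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 0])
print(curve)
```

Censored patients reduce the number at risk without lowering the survival estimate, which is how deaths from other causes are handled in the DSS analysis.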

Multivariate Cox proportional hazard analysis was performed to determine the prognostic significance of TACS-3 scores and other markers for predicting DSS and DFS. Grade, size, age, estrogen receptor status, progesterone receptor status, HER-2 status, node status, Ki-67 H-score, syndecan-1 H-score, carcinoma cell syndecan-1 H-score, carcinoma cell syndecan-4 H-score, and TACS-3 scores were included as covariates in the saturated model. A backward selection procedure with a P value cut-off of <0.05 was used to determine parsimonious models. The likelihood ratio test was used to compare various models. The proportional hazard assumption was verified using plots of the log(−log) survival curves and Schoenfeld residuals. All statistical tests were two sided, and P values <0.05 were considered significant.

The classification and regression tree (CART) methodology for survival data was used to construct a decision tree for predicting DSS.36–38 The CART algorithm calculates optimal threshold values for continuous variables to categorize subjects into a low- or high-risk group. The CART approach toward classifying cases is based on recursive partitioning of the data and is particularly well suited for identifying complex interactions among variables that are predictive of unusually high or low survival risk. The CART algorithm selects the best predictor variables using recursive splitting. It starts with the best possible predictor from the data set and successively splits the data into categories predicted to observe the event or not. The 10-year DSS rate was used as the outcome variable in the decision tree analysis. CART attempts to maximize the purity of each split, striving to accurately categorize cases into the appropriate outcome grouping. Subsequent partitioning of the data follows this same method, using other predictor variables to guide the classification accuracy or purity of the final tree. As a splitting method, the exponential scaling method was used. The splitting process stopped when a minimum of 5 patients per group was reached or when there was no further decrease in prediction error. Cross-validation studies were performed to compare the predictive power levels of various decision trees. The results of the decision tree with the highest predictive power were presented. Statistical data analyses were performed using SAS statistical software (version 9.2; SAS Institute Inc, Cary, NC) and R software version 2.9.2 (R Foundation for Statistical Computing, Vienna, Austria).

Results

To assess collagen organization from breast cancer patients, we made use of a tissue array, created from archived samples from 207 patients, which has substantial follow-up data. This tissue microarray has previously been characterized by immunohistochemistry for levels of syndecan-1, syndecan-4, Ki-67, E-cadherin, estrogen receptor, progesterone receptor, HER-2, and neutrophil gelatinase-associated lipocalin.33,34 The clinical profile of the patients included in this array was previously described33 (see Supplemental Table S1 at http://ajp.amjpathol.org). Of the original 207 patient samples, 196 samples were of sufficient quality for analysis, the remainder being damaged in processing and sectioning. As can be seen in a subsampling of representative H&E samples from the array (Figure 1, A and B), there was a wide range in the histopathological findings. In these 1-mm-diameter core sections, fat cells were nearly absent, with varying amounts of stromal fibroblasts, infiltrating lymphocytes, and macrophages. Examination of the relationship between tumor cells and the extracellular matrix in these samples identified a broad array of architectures ranging from well-defined borders surrounding epithelial clusters in the tissue section (Figure 1A) to samples where stromal extracellular matrix was observed to intercalate between individual epithelial cells and interact with regions displaying poor tumor margins (Figure 1B).

Figure 1.

Second harmonic generation imaging of human breast cancer biopsies. A and B: Two representative examples of H&E-stained slices that demonstrated varying histopathological findings and tumor grade. C and D: The corresponding second harmonic generation (SHG) images illustrate the complexity and variability of collagen localization in these samples. E–H: Insets in A and B show that endogenous fluorescence intensity from cells and stroma was preserved in histopathology slides where the corresponding SHG image can either be overlaid with the fluorescence image (E, F) or examined by itself (G, H) to visualize the relationship of collagen to cells. Scale bars (A–D) = 200 μm, and (E–H) = 50 μm.

The relative concentration of collagen and the orientation of fibers with respect to epithelial cells was assessed using SHG imaging.23,39 As can be seen in Figure 1, A and C, some samples were observed by H&E to contain a great deal of collagen in the stroma, yet the SHG image was very dim, indicating that the loose, fibrillar collagen structures were not strong generators of SHG signal in fixed-tissue sections. In contrast, other samples (Figure 1, B and D) generated strong SHG signal even from small, thin strings of collagen fibers. This difference represents the unique aspect of SHG compared to more classic collagen staining approaches, in that SHG is dependent on both the amount and structure of collagen fibrils.24–26,40–42 Our data fit very well with the concept that highly ordered collagen structures are excellent generators of second harmonic signals, whereas disorganized structures produce less signal.25,26,41

Total fluorescence intensity data acquired from multiphoton excitation of endogenous fluorescence in unstained tissue sections4 were digitally overlaid with the SHG signal from collagen (Figure 1, E and F) to facilitate visualization of the spatial interaction between collagen and autofluorescent carcinoma cells. Alternatively, the SHG image was analyzed, independent of autofluorescent signals (Figure 1, G and H). The tumor/stromal boundary had a variety of architectures (Figure 1, E–H). For example, the left panels of Figure 1 are from a tumor that had relatively well-defined borders between carcinoma cells and stromal collagen and contain collagen fibers that run parallel to the epithelium, similar to what is observed around normal ducts in nondiseased tissue.12 In contrast, images on the right side of Figure 1 demonstrate epithelial cells and collagen that are highly mixed, along with stromal cells, such as fibroblasts, where bundles of collagen fibers that are relatively straight and aligned perpendicular to the tumor boundary (ie, TACS-3) were observed (Figure 1H).

Consistent with our initial report of TACS-3 in mouse mammary carcinomas,12 TACS-3 was not observed around the entire perimeter of each tumor region, but rather locally. Concordant with this observation, localized sites of invasion are often noted in histopathology. Because invasion and TACS-3 are co-localized in mouse mammary tumors, we predicted that the abundance of regions with local TACS-3 may correlate to invasive potential in human patients. Therefore, the frequency of TACS-3 was quantified as outlined in Figure 2. Data were analyzed from 196 patients for a statistical pool of 2744 ROI TACS-3 determinations. A typical ROI that was rated as “yes” for TACS-3 is shown in Figure 2B (ROI 9), where in the upper ROI, straightened bundles of collagen fibers terminate at a boundary with epithelial cells (as determined in the H&E image of the same region). A single ROI could potentially contain numerous TACS-3 events, but was simply rated as a single “yes” vote. Three different statistical scores were developed to determine how much TACS-3 presence, if any at all, leads to poor patient survival. Therefore, the three scores varied in their stringency such that score 3 represented any TACS-3 in the entire sample, whereas scores 1 and 2 account for the amount of TACS-3 present. Because we examined one single, thin tissue section per tumor, we are likely underestimating the amount of TACS-3 present. However, sampling a small portion of the entire tumor is practically relevant, as sampling in a clinical setting is usually also a small fraction of the whole tumor.

Figure 2.

Region of interest (ROI) analysis of tumor-associated collagen signature-3 (TACS-3) presence. A: The second harmonic generation (SHG) image of each entire tissue sample was overlaid with 14 ROIs, which were positioned and numbered reproducibly. Image intensity statistics were measured from each individual ROI independently. B: For TACS-3 analysis, each ROI was rated by each of three panelists for TACS-3 presence at ×3 zoom. Typical examples of ROIs that would be rated “yes” are ROIs 7, 9, and 12 (A). In ROI 9, bundles of straightened, aligned collagen fibers in the SHG image were observed to terminate at a region of negative space, which was confirmed by examination of the H&E image to be epithelial cells. The definitions for the TACS-3 evaluation scores are:
Score 1: For each ROI, TACS-3 was classified as “present” if at least two panelists rated “yes.” The percentage of ROIs (across all 14 measurements) rated as “present” was computed for each patient. Therefore, this score is a percentage measure of the number of TACS-3 events in a slice. A receiver operating characteristic (ROC) curve analysis was performed to determine “optimal” threshold values for classifying patients as either TACS-3 positive or TACS-3 negative. Specifically, the threshold value was determined by maximizing the positive likelihood ratio [sensitivity/(1 – specificity)] for predicting the 5-year disease-specific survival status. The threshold used here for score 1 was 0.1.
Score 2: For each ROI, panelists' ratings of “no” were scored as 0 and “yes” were scored as 1. An average of the total number of “yes” ratings per ROI was computed across all 14 ROIs. Because there were three panelists, score 2 ranged in value between 0 and 3 for each patient. As for score 1, an ROC analysis was performed to determine a threshold for splitting data as TACS-3 positive or TACS-3 negative. This threshold was determined to be 0.5.
Score 3: For each ROI, TACS-3 was classified as “present” if at least two panelists rated “yes.” The patient sample was classified as “positive” for TACS-3 if at least one ROI was rated as “present” (ie, at least two panelists rated “yes”). Therefore, this is the least stringent of the three definitions where we determine whether any ROI with TACS-3 is present in the entire sample.
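The three score definitions can be expressed as a short sketch. The function name `tacs3_scores`, the data layout, and the toy ratings are hypothetical. The legend gives the thresholds (0.1 and 0.5) but not whether a value exactly at the threshold counts as positive, so the `>=` comparison is an assumption.

```python
def tacs3_scores(ratings, score1_cutoff=0.1, score2_cutoff=0.5):
    """Compute the three TACS-3 scores for one patient.

    ratings: list with one entry per ROI; each entry is a list of the
             panelists' votes (1 = "yes", 0 = "no").
    Returns a dict of raw score values and positive/negative calls.
    """
    n_rois = len(ratings)
    # An ROI is "present" when at least two panelists voted "yes".
    present = [sum(votes) >= 2 for votes in ratings]

    # Score 1: fraction of ROIs rated "present".
    score1 = sum(present) / n_rois
    # Score 2: mean number of "yes" votes per ROI (0 to 3 for three panelists).
    score2 = sum(sum(votes) for votes in ratings) / n_rois
    # Score 3: any ROI rated "present".
    score3 = any(present)

    return {
        "score1": score1,
        "score2": score2,
        "score1_positive": score1 >= score1_cutoff,  # assumption: >= cutoff
        "score2_positive": score2 >= score2_cutoff,  # assumption: >= cutoff
        "score3_positive": score3,
    }


# Toy patient: 14 ROIs, three panelists; only three ROIs received "yes" votes.
toy_ratings = [[0, 0, 0] for _ in range(14)]
toy_ratings[6] = [1, 1, 1]
toy_ratings[8] = [1, 1, 0]
toy_ratings[11] = [1, 0, 0]
result = tacs3_scores(toy_ratings)
print(result)
```

With 14 ROIs, score 1 can only take values in steps of 1/14, so a threshold of 0.1 is equivalent to requiring at least two "present" ROIs.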

A Cox proportional hazard model was used to evaluate whether the presence of TACS-3 predicts survival. As can be seen in the univariate analysis displayed in Table 2, each of the three TACS-3 scores was associated with decreased disease-specific survival (DSS) and disease-free survival (DFS). The hazard ratio was slightly higher for the assessment of DSS (DSS = 3.34) versus DFS (DFS = 3.04), although both scores 1 and 2 carried a hazard ratio greater than 3.0. Furthermore, as the stringency of the evaluation of TACS presence is increased (as from score 3 to 2 to 1), the significance improves. Therefore, score 1, which is contingent on both a high prevalence of TACS-3 and good agreement between panelists, was the best predictor of survival based on the fact that it had the lowest P values of the three scores. Kaplan-Meier curves of the univariate analysis demonstrate the survival of patients evaluated as TACS-3 positive versus those rated as negative (Figure 3). In the Kaplan-Meier analysis, score 2 presented the most predictive value in the first 5 years of follow-up, whereas all three scores were predictive of survival with time. This is significant, as it suggests that the presence of even a small region of TACS-3, as assessed in score 3, carries a risk of long-term relapse.

Table 2.

Univariate Analysis

Outcome E N Score Hazard ratio Lower CL Upper CL P value P value (adjusted for int. dens.) P value (adjusted for % area)
DFS 58 195 Score 1 3.04 1.19 7.76 0.0200 0.0217 0.0167
DFS 58 195 Score 2 3.18 1.11 9.17 0.0320 0.0340 0.0256
DFS 58 195 Score 3 1.83 1.01 3.29 0.0447 0.0365 0.0286
DSS 52 191 Score 1 3.34 1.24 9.02 0.0172 0.0176 0.0166
DSS 52 191 Score 2 3.46 1.12 10.72 0.0315 0.0321 0.0301
DSS 52 191 Score 3 1.89 1.01 3.55 0.0479 0.0453 0.0439
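The hazard ratios, confidence limits, and P values in Table 2 are linked through the standard error of the log hazard ratio. The sketch below recovers an approximate P value from a hazard ratio and its 95% confidence limits. It assumes the tabulated values are Wald statistics, which the text does not state, and the function name `wald_p_from_ci` is hypothetical.

```python
import math


def wald_p_from_ci(hazard_ratio, lower, upper, z_crit=1.959964):
    """Approximate two-sided Wald P value from a hazard ratio and its 95% CI.

    Assumption: the interval is log(HR) +/- z_crit * SE on the log scale.
    """
    log_hr = math.log(hazard_ratio)
    # Half the CI width on the log scale equals z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Score 1 rows of Table 2: (hazard ratio, lower CL, upper CL).
p_dfs = wald_p_from_ci(3.04, 1.19, 7.76)
p_dss = wald_p_from_ci(3.34, 1.24, 9.02)
print(round(p_dfs, 3), round(p_dss, 3))
```

Under that assumption, the score 1 rows give values close to the tabulated 0.0200 and 0.0172, with small differences expected from rounding of the published limits.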

Figure 3.

Tumor-associated collagen signature-3 (TACS-3) presence is correlated with poor survival. A–C: The Kaplan-Meier curves for all three TACS-3 scores demonstrated that both the disease-free survival and the disease-specific survival rates of patients rated as TACS-3 positive were significantly worse with time than those patients rated as TACS-3 negative. For score 1, the number of patients rated as TACS-3 negative (black lines) was n = 98, TACS-3 positive (red lines) n = 97. Score 2 TACS-3 negative n = 168, TACS-3 positive n = 27; score 3 TACS-3 negative n = 72, TACS-3 positive n = 123.

The analysis of TACS evaluation scores was augmented to include pixel intensity data measured from the SHG image (Table 2). This was performed using the same ROI partitioning, where images were thresholded to remove background noise. Two measurements were made. The first was the integrated density, which sums the pixel values for the entire ROI and is therefore a measure of the total SHG intensity in that ROI. The second was the percentage area, the fraction of pixels within the ROI that were above the threshold, which is thus a measure of how prevalent collagen was in that ROI. When either one of these measurements was used in conjunction with the TACS evaluation scores in a multivariate Cox proportional hazard model adjusted for intensity, there was a slight decrease in the P values for many of the TACS scores for both DSS and DFS (Table 2). Thus, the addition of the consideration of collagen concentration, and not just orientation, improves the predictive ability of SHG imaging. However, the improvement was minor, and because the scores were already highly predictive, and because univariate analysis of the intensity data alone without consideration of TACS was not significant, it was concluded that these data are not requisite for analysis.

The precise mechanism by which collagen fibers are realigned is unknown, but could potentially be regulated by processes already known to be predictive of patient outcome such as HER-2 status. To assess whether TACS evaluation scores are independent biomarkers for DSS and DFS, we performed multivariate Cox proportional hazard analysis. In the original model, the following variables were included: tumor grade, tumor size, patient age, estrogen receptor (ER) status, progesterone receptor status, HER-2 status, node status, Ki-67 H-score, Syndecan-1 H-score, carcinoma cell Syndecan-1 H-score, and carcinoma cell Syndecan-4 H-score. Each variable was iteratively analyzed with each of the TACS-3 scores (score 1, 2, or 3). Syndecan-1 and -4 were included as these proteoglycans bind to collagen, help organize collagen matrices, and were previously demonstrated to have predictive value for human breast carcinoma outcome.33 Predictive variables were selected using a backward selection procedure. Table 3 summarizes the results: the variables that remained significant predictors of survival after stepwise regression are listed in the “Variable” column. The results indicate that TACS-3 is an independent prognostic marker for both disease-specific and disease-free survival, as are progesterone receptor, ER, node status, and tumor size. Other variables not listed in this column were not independent prognostic markers in this study. Interestingly, although TACS-3 is an independent marker that can by itself predict survival, there was a positive correlation between TACS-3 and the stromal Syndecan-1 H-score of 0.3 (0.32 for correlation between Sdc-1 and score 1, 0.34 for score 2, and 0.28 for score 3, P < 0.0001 for all). TACS-3 was not significantly correlated to any of the other markers used in this analysis.

Table 3.

Multivariate Analysis

Outcome E N Score Variable Hazard ratio HR lower CL HR upper CL P value
DFS 58 195 Score 1 Score 1 4.79 1.89 12.13 0.0009
DFS 58 195 Score 1 Grade 1.56 1.03 2.36 0.0355
DFS 58 195 Score 1 Size 1.15 1.02 1.30 0.0224
DFS 58 195 Score 1 PR_pos 0.51 0.28 0.92 0.0254
DFS 58 195 Score 1 Nodes 1.06 1.01 1.10 0.0085
DFS 58 195 Score 2 Score 2 5.52 1.93 15.79 0.0014
DFS 58 195 Score 2 Grade 1.56 1.03 2.36 0.0370
DFS 58 195 Score 2 Size 1.15 1.02 1.30 0.0187
DFS 58 195 Score 2 PR_pos 0.50 0.27 0.91 0.0226
DFS 58 195 Score 2 Nodes 1.06 1.01 1.10 0.0095
DFS 58 195 Score 3 Score 3 2.43 1.33 4.45 0.0038
DFS 58 195 Score 3 Grade 2.12 1.45 3.12 0.0001
DFS 58 195 Score 3 Size 1.22 1.11 1.35 0.0001
DSS 52 191 Score 1 Score 1 5.63 2.18 14.54 0.0004
DSS 52 191 Score 1 ER_pos 0.22 0.12 0.39 <0.0001
DSS 52 191 Score 1 Size 1.20 1.07 1.34 0.0021
DSS 52 191 Score 1 Nodes 1.05 1.01 1.10 0.0099
DSS 52 191 Score 2 Score 2 6.40 2.13 19.21 0.0009
DSS 52 191 Score 2 ER_pos 0.22 0.12 0.39 <0.0001
DSS 52 191 Score 2 Size 1.20 1.07 1.34 0.0020
DSS 52 191 Score 2 Nodes 1.05 1.01 1.10 0.0111
DSS 52 191 Score 3 Score 3 5.63 2.18 14.54 0.0004
DSS 52 191 Score 3 ER_pos 0.22 0.12 0.39 <0.0001
DSS 52 191 Score 3 Size 1.20 1.07 1.34 0.0021
DSS 52 191 Score 3 Nodes 1.05 1.01 1.10 0.0099

To determine whether certain types of tumors were more amenable to TACS analysis, we partitioned our analysis to assess the fidelity of TACS-3 scoring based on breast tumor subtype (Table 4). We found that scores 1 and 2 are good predictors of both disease-specific and disease-free survival when tumors arise from either ductal or lobular locations; thus the presence of TACS-3 was not dependent on the location of the tumor.

Table 4.

Univariate Analysis by Tumor Subtype

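| Diagnosis | Outcome | E | N | Score | Hazard ratio | HR lower CL | HR upper CL | P value |
|---|---|---|---|---|---|---|---|---|
| Ductal | DFS | 51 | 161 | Score 1 | 3.34 | 1.17 | 9.49 | 0.0238 |
| Ductal | DFS | 51 | 161 | Score 2 | 3.55 | 1.09 | 11.60 | 0.0356 |
| Ductal | DFS | 51 | 161 | Score 3 | 1.84 | 0.98 | 3.46 | 0.0570 |
| Ductal | DSS | 45 | 158 | Score 1 | 3.45 | 1.15 | 10.39 | 0.0277 |
| Ductal | DSS | 45 | 158 | Score 2 | 3.79 | 1.08 | 13.31 | 0.0378 |
| Ductal | DSS | 45 | 158 | Score 3 | 1.99 | 1.01 | 3.94 | 0.0047 |
| Lobular | DFS | 7 | 20 | Score 1 | 13.23 | 1.06 | 165.40 | 0.0450 |
| Lobular | DFS | 7 | 20 | Score 2 | 13.37 | 0.71 | 250.90 | 0.0830 |
| Lobular | DFS | 7 | 20 | Score 3 | 1.96 | 0.36 | 10.72 | 0.4379 |
| Lobular | DSS | 6 | 19 | Score 1 | 16.20 | 1.34 | 195.20 | 0.0285 |
| Lobular | DSS | 6 | 19 | Score 2 | 16.60 | 0.91 | 306.40 | 0.0584 |
| Lobular | DSS | 6 | 19 | Score 3 | 3.86 | 0.43 | 34.71 | 0.2288 |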

These findings suggest there is great potential for the examination of collagen organization as a new diagnostic procedure because of its predictive capacity. Therefore, we used the classification and regression tree (CART) methodology for predicting 10-year disease-specific survival (Figure 4). Decision tree construction starts with the best possible predictor from the data set (in this case tumor size) and successively splits the data into categories predicted to observe the event or not. Subsequent partitioning of the data follows this same method, where we used the other independent predictors (estrogen receptor status and TACS-3 scores) to guide the classification accuracy or purity of the final tree. Decision trees were constructed for all three TACS-3 scores, and the decision tree for score 1 provided the best predictive decision tree in terms of sensitivity and specificity. From this tree we can predict that patients with small tumor size (<1.35 cm) or patients with estrogen receptor–positive tumors that are larger (≥1.35 cm) in size are more likely to survive 10 years or longer as long as their TACS-3 score 1 values are low (<0.04). Specifically, the predicted probability that a patient with a tumor size of <1.35 cm survives for at least 10 years is 89% (95% CI: 71% to 100%), regardless of estrogen-receptor status or TACS-3 score. The probability that a patient survives for at least 10 years with an estrogen receptor–positive tumor of size ≥1.35 cm and with TACS-3 score 1 <0.04 is 78% (95% CI: 60% to 96%); however, if the TACS-3 score 1 is greater than 0.04 then survival drops to 46% (95% CI: 31% to 67%).
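
The splits described above can be written out as a small lookup, shown here as an illustrative sketch rather than the fitted CART model itself. The function and constant names are hypothetical. Two points are assumptions: a score of exactly 0.04 is assigned to the high-score branch, and the function returns `None` for estrogen receptor–negative tumors of size ≥1.35 cm, because no predicted probability is reported in the text for that group.

```python
from typing import Optional

SIZE_CUTOFF_CM = 1.35        # tumor size split reported in the text
TACS3_SCORE1_CUTOFF = 0.04   # TACS-3 score 1 split reported in the text


def predicted_10yr_dss(tumor_size_cm: float,
                       er_positive: bool,
                       tacs3_score1: float) -> Optional[float]:
    """Return the predicted probability of 10-year disease-specific
    survival for the marker profiles described in the text, or None
    when the text reports no probability for that profile."""
    # Small tumors: 89%, regardless of ER status or TACS-3 score.
    if tumor_size_cm < SIZE_CUTOFF_CM:
        return 0.89
    # Larger ER-positive tumors are split on TACS-3 score 1.
    if er_positive:
        if tacs3_score1 < TACS3_SCORE1_CUTOFF:
            return 0.78
        # Assumption: a score of exactly 0.04 falls in this branch.
        return 0.46
    # Larger ER-negative tumors: no probability given in the text.
    return None


print(predicted_10yr_dss(1.0, False, 0.10))
print(predicted_10yr_dss(2.0, True, 0.01))
print(predicted_10yr_dss(2.0, True, 0.10))
print(predicted_10yr_dss(2.0, False, 0.10))
```

The returned values are the point estimates only; the 95% confidence intervals quoted in the text are wide and should accompany any interpretation.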

Figure 4.

Classification and regression tree analysis of tumor-associated collagen signature-3 (TACS-3) score in predicting breast cancer survival. Using the classification and regression tree method for partitioning patient survival based on three independent predictor variables (previously derived from the multivariate analysis), the following decision tree was created. Tumor size and estrogen receptor (ER) status were better predictors of survival and were the first criteria used for splitting the data, where a final result of “yes” or “no” refers to whether or not the patient survived to 10 years. The predicted 10-year disease-specific survival probabilities for the various marker profiles are shown in parentheses with 95% confidence intervals. The value for the TACS-3 score 1 was shown to be able to split patient survival based on the threshold value of 0.04.

Discussion

Even though a handful of biomarkers for breast cancer that inform outcome and treatment have been identified, such as the presence or absence of ER, progesterone receptor, and HER-2, there remain patients who present with none of these markers (ie, triple-negative breast cancers), as well as patients whose disease progresses with no clear indication that distinguishes them from patients whose disease will not progress. Thus, the discovery of new biomarkers is of obvious benefit to further refine the diagnostic process. In mouse models of mammary tumor progression, we have defined changes in collagen deposition and arrangement, termed TACS, that manifest early in tumor development and accompany progression in a predictable manner. Here we demonstrate that a tumor-associated collagen signature that facilitates invasion in mouse models, TACS-3, has the ability to predict disease recurrence and survival in human patients.

Because collagen is readily imaged in standard histopathology slides, the use of TACS-3 as a biomarker has potential for broad application in pathology as nothing has to change with the current “gold standard” histopathology process. Collagen features are stable and robust, persisting even when tissues have not been fixed in a timely manner, and these features are maintained under various tissue-processing approaches, including frozen samples in embedding medium, frozen samples that have not been embedded, and formalin-fixed paraffin-embedded tissue (authors' published12,43 and unpublished observations). In addition to the second harmonic imaging approach presented here, picrosirius red staining allows imaging of collagen structure when imaged with polarized light, particularly when circularly polarized light is used to account for fibers in all orientations. We previously validated the collagen signal with our SHG imaging of unstained slides by comparing slides stained with picrosirius red, in which one can also see the TACS-3 alignment described here, with similar results.12 The results presented here are pertinent to the collagen alignment, regardless of how the collagen is imaged as long as fibers can be discerned. Thus, picrosirius staining of collagen and TACS-3 determination could readily be used in clinical pathology. An advantage of SHG is that it can be used on live, unfixed, unstained tissue, allowing this approach to be expanded beyond pathology slides to detect TACS-3 within freshly biopsied tissue.44 Moreover, SHG imaging can be performed on unstained or H&E-stained slides, and thus requires no additional staining procedures such as picrosirius. 
This study also underscores the potential for the use of non-linear optical imaging techniques such as second harmonic and fluorescence lifetime imaging microscopy in the early detection of cancers.3,12,27,43,45–47 In addition to mammary carcinoma, collagen has been investigated in human ovarian cancer patients, where the malignant tissue was associated with denser collagen and fibers that were more highly ordered,27,47 suggesting that methods to image and characterize collagen changes could have broad relevance to several cancers.

Our finding that even a small region of TACS-3, which was reflected in score 3, has the ability to predict outcome has a few important implications. First, this finding demonstrates that the approach is robust: one need not observe all of the tumor/stromal boundary, because the observation of even a single region of TACS-3 has predictive value. This is particularly relevant when one considers that oftentimes tumor/stromal boundaries are very ill defined at this advanced tumor stage. Second, it suggests that the TACS-3 signature is a fairly common occurrence within those carcinomas that are TACS-3 positive, such that even with this subsampling we were able to find good correlation to outcome. This is reassuring, as pathology is currently limited to observing only a subsample of the whole. Our data suggest that TACS-3 has predictive value for all patients. While we do not know how each patient was treated, we assume that all patients were treated with standard of care appropriate for their diagnosis at that point in time. When considering other markers in current clinical use, the presence of more TACS-3 is most useful in segregating those patients who are ER positive and have large tumors. However, because we did not have enough ER-negative patients to achieve good statistical splitting in the CART analysis, we do not yet know whether TACS-3 further segregates ER-negative patients. Our results suggest that patients positive for TACS-3 should be followed more aggressively, or may benefit from additional therapy.

Perhaps the largest risk factor for breast carcinoma is mammographic density, accounting for almost 1/3 of breast carcinomas,5,48 yet the underlying mechanisms and cell biology to account for this risk have not been fully established. An increased deposition of collagen is one of the leading causes of increased tumor density and rigidity observed in mammograms.6,49 Currently, we do not know how the TACS signature and collagen alignment relate to mammographic density. Such studies are underway, in which stereotactic biopsy based on local regions of high and low mammographic density will be analyzed. However, mammography does not provide adequate resolution to understand mechanisms of disease progression, and mammographic density by itself has been a poor predictor of distant recurrence.50 Therefore, biopsy and smaller scale analysis typically follow, but to date histopathology analysis has not included an examination of the link between collagen and patient survival.

The discovery of this potential biomarker for survival is exciting not only because of its statistical relevance but also because aligned collagen fibers are functionally linked to invasion. We find that cells preferentially invade along perpendicularly aligned collagen fibers compared to randomly organized collagen.51 Moreover, Wang et al52 elegantly presented images of tumor cells crawling along straightened collagen fibers in live mammary carcinoma, and demonstrated that cells prefer to migrate along fibers by a margin of two to one.52 Furthermore, it has been shown that invasion is severely limited in an unaligned matrix.51 In addition to effects on cell migration, an aligned matrix is likely to be stiffer, and there is solid evidence that matrix stiffness promotes tumor cell proliferation.44,53,54

The mechanism by which the matrix could be reorganized into a TACS-3 orientation is still not well understood, but it is subject to regulation via multiple pathways as it is deposited, stiffened, and aligned. Deposition of collagen and extracellular matrix accompanies tumor progression and is distinct from the increase in overall breast density that is the aforementioned risk factor.55 This process, termed desmoplasia, is a stromal response to tumor formation and is enhanced by the recruitment of fibroblasts and tumor-associated macrophages,56,57 regulated by proteoglycans and cytokine stimulation.58–61 Collagen is secreted primarily by fibroblasts. Importantly, several groups have demonstrated that tumor-associated fibroblasts differ from normal fibroblasts in their ability to promote tumor progression.62,63 Carcinoma-associated fibroblasts, but not normal mammary fibroblasts, express elevated levels of the cell-surface proteoglycan syndecan-1 in human patients, and this is associated with a poor outcome.33,59 It is interesting to note that loss of syndecan-1 represses Wnt-induced mammary tumors in mouse models,64 which suggests the effect may be due to effects on stroma as well as on countering Wnt effects on tumor precursor cells.65 Because the levels of syndecan-1 and syndecan-4 were assessed for this same patient cohort and also predict survival,33 we were able to demonstrate a positive correlation between TACS-3 and syndecan-1 expression in the stroma. This finding makes sense, as there is a functional link between syndecans and collagen deposition. Strikingly, recent evidence demonstrates that syndecan-1–positive fibroblasts deposit an aligned matrix.66

In addition to fibroblasts, it is clear that macrophages also regulate collagen fibrillogenesis and have been shown to be associated with aligned matrix in the terminal end bud of developing mammary gland.57,67 The presence of high numbers of macrophages in breast tumors is associated with high hazard ratios and poor survival.68 Moreover, macrophages initiate a paracrine growth factor signaling loop that controls the chemotactic response of carcinoma cells during invasion,14 where an aligned matrix would facilitate the invasion process.

We currently do not conclusively know if TACS-3 alignment of the matrix in vivo requires protease activity. Mammary tumors arising in mice that carry a mutation in collagen I at the collagenase site that inhibits human collagenase activity, the Col1a1 tm1Jae model, result in increased collagen density in the mammary gland and earlier TACS-3 progression corresponding to increased local tumor invasion and distant spontaneous lung metastases.16 Moreover, explanting mammary tumor epithelium into a randomly organized collagen matrix results in perpendicular alignment of the matrix in a protease-independent manner.12,22 Conversely, other investigators find that membrane-anchored proteases are required for invasion into a collagen matrix, and thus the role of proteases in this process is still unresolved.69

Menopause and increased age are associated with increased tumorigenesis and metastasis, but also reduced collagen presence, which is in apparent contradiction to the role that increased breast density plays.70,71 However, this discrepancy has been explained in terms of cumulative risk for those patients who spend a lifetime with higher breast density, even if that density diminishes as they age.72 Interestingly, pregnancy-associated breast cancer has been linked to the involution period, during which time the breast undergoes both a dramatic increase in collagen deposition and remodeling of the matrix.73,74 Although the specific contribution of collagen to pregnancy-associated breast cancer is currently unknown, it is interesting to note that as in tumor progression, macrophages facilitate the remodeling of the breast extracellular matrix during these events.75

In summary, we present a novel biomarker that is robust and significantly associated with disease outcome. We propose that collagen alignment (TACS-3) could be used as an adjunct to the histopathologic process to help inform patient diagnosis. This finding is consistent with prior observations that collagen alignment facilitates cell invasion.12,22 Thus, an increased understanding of the mechanism by which matrices are aligned, and by which cells invade along aligned matrices, has the potential to aid in the development of therapies to block this aspect of tumor progression. In particular, our results suggest that therapies directed at cellular interactions with the extracellular matrix or directed at the matrix itself to prevent collagen deposition may be useful in prevention or treatment.

Acknowledgments

We thank the members of the Keely and LOCI Laboratories for technical assistance and for discussions and comments on this article.

Footnotes

This work was supported by a grant from the Mary Kay Ash Charitable Foundation (P.J.K.), a Coulter Foundation Award (K.W.E. and P.J.K.), grants from the National Institutes of Health (NIH) R01 CA114462 to P.J.K., NIH R01 CA142833 to P.J.K., NIH R01 EB000184 award to K.W.E., a DOD-CDMRP/BCRP W81XWH-04-1-042 award to P.P.P., and an NIH 2R01 CA107012-06 award to A.F.

Current address of P.P.P., Fred Hutchinson Cancer Research Center, Clinical Research Division, Seattle, WA.

CME Disclosure: P.J.K. is a consultant for Platypus Inc., Madison, WI. The other authors did not disclose any relevant financial relationships.

Supplementary data

Supplemental Table S1

References
