BayesSpace (original) (raw)

Clustering at enhanced resolution

The spatialEnhance() function will enhance the resolution of the principal components, and add these PCs as well as predicted cluster labels at subspot resolution to a newSingleCellExperiment. As with our demonstration ofspatialCluster() above, we are using fewer iterations for the purpose of this example (nrep=1000) than we recommend in practice (nrep=100000 or greater). Note that thejitter_scale parameter should be tuned so that proposals for updating subspot-level expression are accepted around 30% of the time. This can be evaluated usingmcmcChain(melanoma.enhanced, "Ychange"), where the chain should stabilize to 0.25-0.40. Typically 1000-2500 iterations are sufficient to evaluate if jitter_scale should be increased if acceptance is too high or decreased if acceptance is too low. After tuning, proceed to a full run of spatialEnhance with more iterations.

melanoma.enhanced <- spatialEnhance(melanoma, q=4, platform="ST", d=7,
                                    model="t", gamma=2,
                                    jitter.prior=0.3, jitter.scale=3.5,
                                    nrep=1000, burn.in=100,
                                    save.chain=TRUE, cores = 1)

The enhanced SingleCellExperiment includes an index to the parent spot in the original sce(spot.idx), along with an index to the subspot. It adds the offsets to the original spot coordinates, and provides the enhanced cluster label (spatial.cluster).

head(colData(melanoma.enhanced))
#> DataFrame with 6 rows and 9 columns
#>              spot.idx spot.neighbors subspot.idx subspot.neighbors  spot.row
#>             <numeric>    <character>   <integer>       <character> <integer>
#> subspot_1.1         1            2,7           1           294,880         7
#> subspot_2.1         2          1,3,8           1       295,587,881         7
#> subspot_3.1         3          2,4,9           1       296,588,882         7
#> subspot_4.1         4           3,10           1       297,589,883         7
#> subspot_5.1         5           6,12           1           298,884         8
#> subspot_6.1         6         5,7,13           1       299,591,885         8
#>              spot.col array_row array_col spatial.cluster
#>             <integer> <numeric> <numeric>       <numeric>
#> subspot_1.1        15   6.66667   14.6667               1
#> subspot_2.1        16   6.66667   15.6667               1
#> subspot_3.1        17   6.66667   16.6667               1
#> subspot_4.1        18   6.66667   17.6667               3
#> subspot_5.1        13   7.66667   12.6667               2
#> subspot_6.1        14   7.66667   13.6667               1

We can plot the enhanced cluster assignments as above.

clusterPlot(melanoma.enhanced)

Enhancing the resolution of gene expression

BayesSpace operates on the principal components of the gene expression matrix, and spatialEnhance() therefore computes enhanced resolution PC vectors. Enhanced gene expression is not computed directly, and is instead imputed using a regression algorithm. For each gene, a model using the PC vectors of each spot is trained to predict the spot-level gene expression, and the fitted model is used to predict subspot expression from the subspot PCs.

Gene expression enhancement is implemented in theenhanceFeatures() function. BayesSpace predicts expression with xgboostby default, but linear and Dirichlet regression are also available via the model argument. When using xgboost, we suggest automatically tuning the nrounds parameter by setting it to 0, although this comes at the cost of increased runtime (~4x slower than a pre-specified nrounds in practice).

enhanceFeatures() can be used to impute subspot-level expression for all genes, or for a subset of genes of interest. Here, we’ll demonstrate by enhancing the expression of four marker genes: PMEL (melanoma), CD2 (T-cells), CD19 (B-cells), and COL1A1 (fibroblasts).

markers <- c("PMEL", "CD2", "CD19", "COL1A1")
melanoma.enhanced <- enhanceFeatures(melanoma.enhanced, melanoma,
                                     feature_names=markers,
                                     nrounds=0)

By default, log-normalized expression (logcounts(sce)) is imputed, although other assays or arbitrary feature matrices can be specified.

logcounts(melanoma.enhanced)[markers, 1:5]
#>        subspot_1.1 subspot_2.1 subspot_3.1 subspot_4.1 subspot_5.1
#> PMEL     2.4383025   1.5922332  2.23760223   3.0099206   2.0412400
#> CD2      0.3343147   0.5033790  0.28379297   0.2837930   0.2837930
#> CD19     0.6677281   0.4452745  0.38306668   0.2275964   0.4397965
#> COL1A1   0.7656322   0.9640545  0.07482389   0.8325472   0.7904220

Diagnostic measures from each predictive model, such asrmse when using xgboost, are added to therowData of the enhanced dataset.

rowData(melanoma.enhanced)[markers, ]
#> DataFrame with 4 rows and 4 columns
#>                gene_id   gene_name    is.HVG enhanceFeatures.rmse
#>            <character> <character> <logical>            <numeric>
#> PMEL   ENSG00000185664        PMEL      TRUE             0.805485
#> CD2    ENSG00000116824         CD2      TRUE             0.635091
#> CD19   ENSG00000177455        CD19      TRUE             0.628285
#> COL1A1 ENSG00000108821      COL1A1      TRUE             0.555659

Visualizing enhanced gene expression

Spatial gene expression is visualized withfeaturePlot().

featurePlot(melanoma.enhanced, "PMEL")

Here, we compare the spatial expression of the imputed marker genes.

enhanced.plots <- purrr::map(markers, function(x) featurePlot(melanoma.enhanced, x))
patchwork::wrap_plots(enhanced.plots, ncol=2)

And we can compare to the spot-level expression.

spot.plots <- purrr::map(markers, function(x) featurePlot(melanoma, x))
patchwork::wrap_plots(c(enhanced.plots, spot.plots), ncol=4)