# tidySpatialExperiment - part of *tidyomics*

[![Lifecycle:experimental](https://img.shields.io/badge/lifecycle-experimental-blue.svg)](https://www.tidyverse.org/lifecycle/#experimental)
[![R build status](https://github.com/william-hutchison/tidySpatialExperiment/workflows/rworkflows/badge.svg)](https://github.com/william-hutchison/tidySpatialExperiment/actions)

# Introduction

tidySpatialExperiment provides a bridge between the [SpatialExperiment](https://github.com/drighelli/SpatialExperiment) package and the [*tidyverse*](https://www.tidyverse.org) ecosystem. It creates an invisible layer that allows you to interact with a `SpatialExperiment` object as if it were a tibble, enabling the use of functions from [dplyr](https://github.com/tidyverse/dplyr), [tidyr](https://github.com/tidyverse/tidyr), [ggplot2](https://github.com/tidyverse/ggplot2) and [plotly](https://github.com/plotly/plotly.R). But, underneath, your data remains a `SpatialExperiment` object.

tidySpatialExperiment also provides five additional utility functions.

## Resources

If you would like to learn more about tidySpatialExperiment and *tidyomics*, the following links are a good place to start:

- [The tidySpatialExperiment website](http://william-hutchison.github.io/tidySpatialExperiment/)
- [The tidyomics website](https://github.com/tidyomics)

The *tidyomics* ecosystem also includes packages for:

- Working with genomic features:
  - [plyranges](https://github.com/sa-lee/plyranges), for tidy manipulation of genomic range data.
  - [nullranges](https://github.com/nullranges/nullranges), for tidy generation of genomic ranges representing the null hypothesis.
  - [plyinteractions](https://github.com/tidyomics/plyinteractions), for tidy manipulation of genomic interaction data.
- Working with transcriptomic features:
  - [tidySummarizedExperiment](https://github.com/stemangiola/tidySummarizedExperiment), for tidy manipulation of `SummarizedExperiment` objects.
  - [tidySingleCellExperiment](https://github.com/stemangiola/tidySingleCellExperiment), for tidy manipulation of `SingleCellExperiment` objects.
  - [tidyseurat](https://github.com/stemangiola/tidyseurat), for tidy manipulation of `Seurat` objects.
  - [tidybulk](https://github.com/stemangiola/tidybulk), for bulk RNA-seq analysis.
- Working with cytometry features:
  - [tidytof](https://github.com/keyes-timothy/tidytof), for tidy manipulation of high-dimensional cytometry data.
- And a few associated packages:
  - [tidygate](https://github.com/stemangiola/tidygate), for manual gating of points in space.
  - [tidyheatmap](https://github.com/stemangiola/tidyHeatmap/), for modular heatmap construction.

## Functions and utilities

| Package             | Functions available |
|---------------------|---------------------|
| `SpatialExperiment` | All |
| `dplyr`             | `arrange`, `bind_rows`, `bind_cols`, `distinct`, `filter`, `group_by`, `summarise`, `select`, `mutate`, `rename`, `left_join`, `right_join`, `inner_join`, `slice`, `sample_n`, `sample_frac`, `count`, `add_count` |
| `tidyr`             | `nest`, `unnest`, `unite`, `separate`, `extract`, `pivot_longer` |
| `ggplot2`           | `ggplot` |
| `plotly`            | `plot_ly` |

| Utility             | Description |
|---------------------|-------------|
| `as_tibble`         | Convert cell data to a `tbl_df` |
| `join_features`     | Append feature data to cell data |
| `aggregate_cells`   | Aggregate cell-feature abundance into a pseudobulk `SummarizedExperiment` object |
| `rectangle`         | Select cells in a rectangular region of space |
| `ellipse`           | Select cells in an elliptical region of space |
| `gate_spatial`      | |
| `gate_programmatic` | |

## Installation

You can install the stable version of tidySpatialExperiment from Bioconductor.

``` r
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("tidySpatialExperiment")
```

Or, you can install the development version of tidySpatialExperiment from GitHub.

``` r
if (!requireNamespace("pak", quietly = TRUE))
  install.packages("pak")

pak::pak("william-hutchison/tidySpatialExperiment")
```

## Load data

Here, we attach tidySpatialExperiment and an example `SpatialExperiment` object.

``` r
# Load example SpatialExperiment object
library(tidySpatialExperiment)
example(read10xVisium)
```

## SpatialExperiment-tibble abstraction

A `SpatialExperiment` object represents assay-feature values as rows and cells as columns. Additional information about the cells is stored in the `reducedDims`, `colData` and `spatialCoords` slots.

tidySpatialExperiment provides a SpatialExperiment-tibble abstraction, representing cells as rows and cell data as columns, in accordance with the tidy observation-variable convention. The cell data is made up of information stored in the `colData` and `spatialCoords` slots.

The default view is now of the SpatialExperiment-tibble abstraction.

``` r
spe
# # A SpatialExperiment-tibble abstraction: 50 × 7
# # Features = 50 | Cells = 50 | Assays = counts
#   .cell              in_tissue array_row array_col sample_id pxl_col_in_fullres
# 1 AAACAACGAATAGTTC-1 FALSE             0        16 section1                2312
# 2 AAACAAGTATCTCCCA-1 TRUE             50       102 section1                8230
# 3 AAACAATCTACTAGCA-1 TRUE              3        43 section1                4170
# 4 AAACACCAATAACTGC-1 TRUE             59        19 section1                2519
# 5 AAACAGAGCGACTCCT-1 TRUE             14        94 section1                7679
# # ℹ 45 more rows
# # ℹ 1 more variable: pxl_row_in_fullres
```

But, our data maintains its status as a `SpatialExperiment` object. Therefore, we have access to all `SpatialExperiment` functions.
``` r
spe |>
  colData() |>
  head()
# DataFrame with 6 rows and 4 columns
#                    in_tissue array_row array_col sample_id
# AAACAACGAATAGTTC-1     FALSE         0        16  section1
# AAACAAGTATCTCCCA-1      TRUE        50       102  section1
# AAACAATCTACTAGCA-1      TRUE         3        43  section1
# AAACACCAATAACTGC-1      TRUE        59        19  section1
# AAACAGAGCGACTCCT-1      TRUE        14        94  section1
# AAACAGCTTTCAGAAG-1     FALSE        43         9  section1

spe |>
  spatialCoords() |>
  head()
#                    pxl_col_in_fullres pxl_row_in_fullres
# AAACAACGAATAGTTC-1               2312               1252
# AAACAAGTATCTCCCA-1               8230               7237
# AAACAATCTACTAGCA-1               4170               1611
# AAACACCAATAACTGC-1               2519               8315
# AAACAGAGCGACTCCT-1               7679               2927
# AAACAGCTTTCAGAAG-1               1831               6400

spe |>
  imgData()
# DataFrame with 1 row and 4 columns
#   sample_id image_id data scaleFactor
# 1  section1   lowres ####   0.0510334
```

# Integration with the *tidyverse* ecosystem

## Manipulate with dplyr

Most functions from dplyr are available for use with the SpatialExperiment-tibble abstraction. For example, `filter()` can be used to filter cells by a variable of interest.

``` r
spe |>
  filter(array_col < 5)
# # A SpatialExperiment-tibble abstraction: 3 × 7
# # Features = 50 | Cells = 3 | Assays = counts
#   .cell              in_tissue array_row array_col sample_id pxl_col_in_fullres
# 1 AAACATGGTGAGAGGA-1 FALSE            62         0 section1                1212
# 2 AAACGAAGATGGAGTA-1 FALSE            58         4 section1                1487
# 3 AAAGAATGACCTTAGA-1 FALSE            64         2 section1                1349
# # ℹ 1 more variable: pxl_row_in_fullres
```

And `mutate` can be used to add new variables, or modify the value of an existing variable.
``` r
spe |>
  mutate(in_region = c(in_tissue & array_row < 10))
# # A SpatialExperiment-tibble abstraction: 50 × 8
# # Features = 50 | Cells = 50 | Assays = counts
#   .cell     in_tissue array_row array_col sample_id in_region pxl_col_in_fullres
# 1 AAACAACG… FALSE             0        16 section1  FALSE                   2312
# 2 AAACAAGT… TRUE             50       102 section1  FALSE                   8230
# 3 AAACAATC… TRUE              3        43 section1  TRUE                    4170
# 4 AAACACCA… TRUE             59        19 section1  FALSE                   2519
# 5 AAACAGAG… TRUE             14        94 section1  FALSE                   7679
# # ℹ 45 more rows
# # ℹ 1 more variable: pxl_row_in_fullres
```

## Tidy with tidyr

Most functions from tidyr are also available. Here, `nest()` is used to group the data by `sample_id`, and `unnest()` is used to ungroup the data.

``` r
# Nest the SpatialExperiment object by sample_id
spe_nested <-
  spe |>
  nest(data = -sample_id)

# View the nested SpatialExperiment object
spe_nested
# # A tibble: 1 × 2
#   sample_id data
# 1 section1

# Unnest the nested SpatialExperiment objects
spe_nested |>
  unnest(data)
# # A SpatialExperiment-tibble abstraction: 50 × 7
# # Features = 50 | Cells = 50 | Assays = counts
#   .cell              in_tissue array_row array_col sample_id pxl_col_in_fullres
# 1 AAACAACGAATAGTTC-1 FALSE             0        16 section1                2312
# 2 AAACAAGTATCTCCCA-1 TRUE             50       102 section1                8230
# 3 AAACAATCTACTAGCA-1 TRUE              3        43 section1                4170
# 4 AAACACCAATAACTGC-1 TRUE             59        19 section1                2519
# 5 AAACAGAGCGACTCCT-1 TRUE             14        94 section1                7679
# # ℹ 45 more rows
# # ℹ 1 more variable: pxl_row_in_fullres
```

## Plot with ggplot2

The `ggplot()` function can be used to create a plot directly from a `SpatialExperiment` object. This example also demonstrates how tidy operations can be combined to build up more complex analysis.
``` r
spe |>
  filter(sample_id == "section1" & in_tissue) |>

  # Add a column with the sum of feature counts per cell
  mutate(count_sum = purrr::map_int(.cell, ~
    spe[, .x] |>
      counts() |>
      sum()
  )) |>

  # Plot with tidySpatialExperiment and ggplot2
  ggplot(aes(x = reorder(.cell, count_sum), y = count_sum)) +
    geom_point() +
    coord_flip()
```

![](man/figures/unnamed-chunk-11-1.png)

## Plot with plotly

The `plot_ly()` function can also be used to create a plot from a `SpatialExperiment` object.

``` r
spe |>
  filter(sample_id == "section1") |>
  plot_ly(
    x = ~ array_col,
    y = ~ array_row,
    color = ~ in_tissue,
    type = "scatter"
  )
```

![](man/figures/plotly_demo.png)

# Utilities

## Append feature data to cell data

The *tidyomics* ecosystem places an emphasis on interacting with cell data. To interact with feature data, the `join_features()` function can be used to append assay-feature values to cell data.

``` r
# Join feature data in wide format, preserving the SpatialExperiment object
spe |>
  join_features(features = c("ENSMUSG00000025915", "ENSMUSG00000042501"), shape = "wide") |>
  head()
# # A SpatialExperiment-tibble abstraction: 50 × 9
# # Features = 6 | Cells = 50 | Assays = counts
#   .cell              in_tissue array_row array_col sample_id ENSMUSG00000025915
# 1 AAACAACGAATAGTTC-1 FALSE             0        16 section1                   0
# 2 AAACAAGTATCTCCCA-1 TRUE             50       102 section1                   0
# 3 AAACAATCTACTAGCA-1 TRUE              3        43 section1                   0
# 4 AAACACCAATAACTGC-1 TRUE             59        19 section1                   0
# 5 AAACAGAGCGACTCCT-1 TRUE             14        94 section1                   0
# # ℹ 45 more rows
# # ℹ 3 more variables: ENSMUSG00000042501, pxl_col_in_fullres,
# #   pxl_row_in_fullres

# Join feature data in long format, discarding the SpatialExperiment object
spe |>
  join_features(features = c("ENSMUSG00000025915", "ENSMUSG00000042501"), shape = "long") |>
  head()
# tidySpatialExperiment says: A data frame is returned for independent data
# analysis.
# # A tibble: 6 × 7
#   .cell       in_tissue array_row array_col sample_id .feature .abundance_counts
# 1 AAACAACGAA… FALSE             0        16 section1  ENSMUSG…                 0
# 2 AAACAACGAA… FALSE             0        16 section1  ENSMUSG…                 0
# 3 AAACAAGTAT… TRUE             50       102 section1  ENSMUSG…                 0
# 4 AAACAAGTAT… TRUE             50       102 section1  ENSMUSG…                 1
# 5 AAACAATCTA… TRUE              3        43 section1  ENSMUSG…                 0
# # ℹ 1 more row
```

## Aggregate cells

Sometimes, it is necessary to aggregate the gene-transcript abundance from a group of cells into a single value, for example when comparing groups of cells across different samples with fixed-effect models.

The `aggregate_cells()` function can be used to aggregate cells by a specified variable and assay, returning a `SummarizedExperiment` object.

``` r
spe |>
  aggregate_cells(in_tissue, assays = "counts")
# class: SummarizedExperiment
# dim: 50 2
# metadata(0):
# assays(1): counts
# rownames(50): ENSMUSG00000002459 ENSMUSG00000005886 ...
#   ENSMUSG00000104217 ENSMUSG00000104328
# rowData names(1): feature
# colnames(2): FALSE TRUE
# colData names(3): in_tissue .aggregated_cells sample_id
```

## Elliptical and rectangular region selection

The `ellipse()` and `rectangle()` functions can be used to select cells by their position in space.

``` r
spe |>
  filter(sample_id == "section1") |>
  mutate(in_ellipse = ellipse(array_col, array_row, c(20, 40), c(20, 20))) |>
  ggplot(aes(x = array_col, y = array_row, colour = in_ellipse)) +
  geom_point()
```

![](man/figures/unnamed-chunk-15-1.png)

## Interactive gating

For the interactive selection of cells in space, tidySpatialExperiment provides `gate()`. This function uses [tidygate](https://github.com/stemangiola/tidygate), shiny and plotly to launch an interactive plot overlaying cells in position with image data. Additional parameters can be used to specify point colour, shape, size and alpha, either with a column in the SpatialExperiment object or a constant value.
``` r
spe_gated <-
  spe |>
  gate(colour = "in_tissue", alpha = 0.8)
```

![](man/figures/gate_interactive_demo.gif)

A record of which points appear in which gates is appended to the SpatialExperiment object in the `.gated` column. To select cells which appear within any gates, filter for non-NA values. To select cells which appear within a specific gate, string pattern matching can be used.

``` r
# Select cells within any gate
spe_gated |>
  filter(!is.na(.gated))
# # A SpatialExperiment-tibble abstraction: 4 × 8
# # Features = 50 | Cells = 4 | Assays = counts
#   .cell        in_tissue array_row array_col sample_id .gated pxl_col_in_fullres
# 1 AAACGAGACGG… TRUE             35        79 section1  2                    6647
# 2 AAACTGCTGGC… TRUE             45        67 section1  2                    5821
# 3 AAAGGGATGTA… TRUE             24        62 section1  1,2                  5477
# 4 AAAGGGCAGCT… TRUE             24        26 section1  1                    3000
# # ℹ 1 more variable: pxl_row_in_fullres

# Select cells within gate 2
spe_gated |>
  filter(stringr::str_detect(.gated, "2"))
# # A SpatialExperiment-tibble abstraction: 3 × 8
# # Features = 50 | Cells = 3 | Assays = counts
#   .cell        in_tissue array_row array_col sample_id .gated pxl_col_in_fullres
# 1 AAACGAGACGG… TRUE             35        79 section1  2                    6647
# 2 AAACTGCTGGC… TRUE             45        67 section1  2                    5821
# 3 AAAGGGATGTA… TRUE             24        62 section1  1,2                  5477
# # ℹ 1 more variable: pxl_row_in_fullres
```

Details of the interactively drawn gates are saved to `tidygate_env$gates`. This variable is overwritten each time interactive gates are drawn, so save it right away if you would like to access it later.

``` r
# Inspect previously drawn gates
tidygate_env$gates |>
  head()
# # A tibble: 6 × 3
#       x     y .gate
# 1 4310. 3125.     1
# 2 3734. 3161.     1
# 3 2942. 3521.     1
# 4 2834. 3665.     1
# 5 2834. 4385.     1
# # ℹ 1 more row
```

``` r
# Save if needed (write_rds() comes from the readr package)
tidygate_env$gates |>
  readr::write_rds("important_gates.rds")
```

If previously drawn gates are supplied to the `programmatic_gates` argument, cells will be gated programmatically.
This feature allows the reproduction of previously drawn interactive gates.

``` r
# read_rds() comes from the readr package
important_gates <-
  readr::read_rds("important_gates.rds")

spe |>
  gate(programmatic_gates = important_gates) |>
  filter(!is.na(.gated))
# # A SpatialExperiment-tibble abstraction: 4 × 8
# # Features = 50 | Cells = 4 | Assays = counts
#   .cell        in_tissue array_row array_col sample_id .gated pxl_col_in_fullres
# 1 AAACGAGACGG… TRUE             35        79 section1  2                    6647
# 2 AAACTGCTGGC… TRUE             45        67 section1  2                    5821
# 3 AAAGGGATGTA… TRUE             24        62 section1  1,2                  5477
# 4 AAAGGGCAGCT… TRUE             24        26 section1  1                    3000
# # ℹ 1 more variable: pxl_row_in_fullres
```

# Special column behaviour

Removing the `.cell` column will return a tibble. This is consistent with the behaviour in other *tidyomics* packages.

``` r
spe |>
  select(-.cell) |>
  head()
# tidySpatialExperiment says: Key columns are missing. A data frame is
# returned for independent data analysis.
# # A tibble: 6 × 4
#   in_tissue array_row array_col sample_id
# 1 FALSE             0        16 section1
# 2 TRUE             50       102 section1
# 3 TRUE              3        43 section1
# 4 TRUE             59        19 section1
# 5 TRUE             14        94 section1
# # ℹ 1 more row
```

The `sample_id` column cannot be removed with *tidyverse* functions, and can only be modified if the changes are accepted by SpatialExperiment’s `colData()` function.
``` r
# sample_id is not removed, despite the user's request
spe |>
  select(-sample_id)
# # A SpatialExperiment-tibble abstraction: 50 × 7
# # Features = 50 | Cells = 50 | Assays = counts
#   .cell              in_tissue array_row array_col sample_id pxl_col_in_fullres
# 1 AAACAACGAATAGTTC-1 FALSE             0        16 section1                2312
# 2 AAACAAGTATCTCCCA-1 TRUE             50       102 section1                8230
# 3 AAACAATCTACTAGCA-1 TRUE              3        43 section1                4170
# 4 AAACACCAATAACTGC-1 TRUE             59        19 section1                2519
# 5 AAACAGAGCGACTCCT-1 TRUE             14        94 section1                7679
# # ℹ 45 more rows
# # ℹ 1 more variable: pxl_row_in_fullres

# This change maintains separation of sample_ids and is permitted
spe |>
  mutate(sample_id = stringr::str_c(sample_id, "_modified")) |>
  head()
# # A SpatialExperiment-tibble abstraction: 50 × 7
# # Features = 6 | Cells = 50 | Assays = counts
#   .cell              in_tissue array_row array_col sample_id  pxl_col_in_fullres
# 1 AAACAACGAATAGTTC-1 FALSE             0        16 section1_…               2312
# 2 AAACAAGTATCTCCCA-1 TRUE             50       102 section1_…               8230
# 3 AAACAATCTACTAGCA-1 TRUE              3        43 section1_…               4170
# 4 AAACACCAATAACTGC-1 TRUE             59        19 section1_…               2519
# 5 AAACAGAGCGACTCCT-1 TRUE             14        94 section1_…               7679
# # ℹ 45 more rows
# # ℹ 1 more variable: pxl_row_in_fullres

# This change does not maintain separation of sample_ids and produces an error
spe |>
  mutate(sample_id = "new_sample")
# # A SpatialExperiment-tibble abstraction: 50 × 7
# # Features = 50 | Cells = 50 | Assays = counts
#   .cell              in_tissue array_row array_col sample_id  pxl_col_in_fullres
# 1 AAACAACGAATAGTTC-1 FALSE             0        16 new_sample               2312
# 2 AAACAAGTATCTCCCA-1 TRUE             50       102 new_sample               8230
# 3 AAACAATCTACTAGCA-1 TRUE              3        43 new_sample               4170
# 4 AAACACCAATAACTGC-1 TRUE             59        19 new_sample               2519
# 5 AAACAGAGCGACTCCT-1 TRUE             14        94 new_sample               7679
# # ℹ 45 more rows
# # ℹ 1 more variable: pxl_row_in_fullres
```

The `pxl_col_in_fullres` and `pxl_row_in_fullres` columns cannot be removed or modified with *tidyverse* functions.
This is consistent with the behaviour of dimension reduction data in other *tidyomics* packages.

``` r
# Attempting to remove pxl_col_in_fullres produces an error
spe |>
  select(-pxl_col_in_fullres)
# Error in `select_helper()`:
# ! Can't select columns that don't exist.
# ✖ Column `pxl_col_in_fullres` doesn't exist.

# Attempting to modify pxl_col_in_fullres produces an error
spe |>
  mutate(pxl_col_in_fullres)
# Error in `dplyr::mutate()`:
# ℹ In argument: `pxl_col_in_fullres`.
# Caused by error:
# ! object 'pxl_col_in_fullres' not found
```

# Citation

If you use tidySpatialExperiment in published research, please cite [The tidyomics ecosystem: enhancing omic data analyses](https://doi.org/10.1038/s41592-024-02299-2).