GitHub - fmicompbio/sketchR: An R interface to the subsampling algorithms implemented in python
sketchR
sketchR provides a simple interface to the geosketch and scSampler python packages, which implement subsampling algorithms described in Hie et al (2019) and Song et al (2022), respectively. The implementation makes use of the basilisk package for interaction between R and python.
Installation
You can install sketchR from Bioconductor (release 3.19 onwards) using:
if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("sketchR")
Example
library(sketchR)
Create an example data matrix. Rows represent "samples" (the unit of
downsampling), columns represent features (e.g., principal components).
mat <- matrix(rnorm(5000), nrow = 500)
Run geosketch. The output is a vector of indices, which you can use
to subset the rows of the input matrix.
idx <- geosketch(mat, N = 100)
Run scSampler. As for geosketch, the output is a vector of indices.
idx2 <- scsampler(mat, N = 100)
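As a minimal sketch of how the returned indices might be used downstream, the snippet below subsets the example matrix to the sketched rows and inspects the result. The variable name `sub` and the random-sample fallback (used only when sketchR is not installed, so that the snippet still runs) are illustrative assumptions, not part of the package.

```r
set.seed(1)  # only makes the simulated example matrix reproducible
mat <- matrix(rnorm(5000), nrow = 500)  # 500 samples x 10 features

if (requireNamespace("sketchR", quietly = TRUE)) {
    # Same call as in the example above
    idx <- sketchR::geosketch(mat, N = 100)
} else {
    # Stand-in (assumption): uniform random sampling, NOT geosketch
    idx <- sample(nrow(mat), 100)
}

# Keep only the selected rows; drop = FALSE preserves the matrix shape
sub <- mat[idx, , drop = FALSE]
dim(sub)
```

The same subsetting pattern should apply to `idx2` from scsampler, since both functions return a vector of row indices.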