GitHub - fmicompbio/sketchR: An R interface to the subsampling algorithms implemented in Python

sketchR

sketchR provides a simple interface to the geosketch and scSampler Python packages, which implement subsampling algorithms described in Hie et al. (2019) and Song et al. (2022), respectively. The implementation uses the basilisk package for interaction between R and Python.

Installation

You can install sketchR from Bioconductor (release 3.19 onwards) using:

if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager")

BiocManager::install("sketchR")

Example

library(sketchR)

Create an example data matrix. Rows represent "samples" (the unit of downsampling), columns represent features (e.g., principal components).

mat <- matrix(rnorm(5000), nrow = 500)

Run geosketch. The output is a vector of indices, which you can use to subset the rows of the input matrix.

idx <- geosketch(mat, N = 100)

Run scSampler. As for geosketch, the output is a vector of indices.

idx2 <- scsampler(mat, N = 100)
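
The index vectors returned above are meant for row subsetting. The following is a minimal sketch of that step in base R. To keep it runnable without a Python environment, it assumes a randomly drawn index vector as a stand-in for the output of geosketch or scsampler; the name mat_sub is illustrative only and not part of sketchR.

```r
# Example data with the same shape as above: 500 samples, 10 features.
set.seed(1)
mat <- matrix(rnorm(5000), nrow = 500)

# Assumption: a random index vector stands in for the result of
# geosketch(mat, N = 100) or scsampler(mat, N = 100).
idx <- sample(nrow(mat), 100)

# Keep only the selected rows. drop = FALSE preserves the matrix
# structure even when a single row is selected.
mat_sub <- mat[idx, , drop = FALSE]

# The subsampled matrix has one row per selected index and keeps
# all feature columns.
dim(mat_sub)
```

In practice, replace the stand-in idx with the vector returned by geosketch or scsampler; the subsetting line stays the same.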