# chevreulProcess

This package includes functions for processing single-cell RNA-seq datasets stored as SingleCellExperiment objects.

A demo with a developing human retina scRNA-seq dataset from Shayler et al. is available here

There are also convenient functions for:

- Clustering and dimensional reduction of raw sequencing data
- Integration and label transfer
- Louvain clustering at a range of resolutions
- Cell cycle state regression and labeling

## Installation

`chevreulProcess` requires R version 4.4 or later. Get the latest stable `R` release from [CRAN](http://cran.r-project.org/). Then install `chevreulProcess` and its dependencies using the following code:

``` r
install.packages("BiocManager")
BiocManager::install("chevreulProcess")
chevreulProcess::create_project_db()
```

You can also customize the location of the app using these steps:

``` r
install.packages("BiocManager")
BiocManager::install("chevreulProcess")
chevreulProcess::create_project_db(destdir = "/your/path/to/app")
```

## Getting Started

First, load chevreulProcess and all other required packages:

``` r
library(chevreulProcess)
library(SingleCellExperiment)
library(tidyverse)
library(ggraph)
```

## TLDR

chevreulProcess provides a single command to:

- construct a SingleCellExperiment object
- filter genes by minimum expression and ubiquity
- normalize and scale expression by any of several methods packaged in SingleCellExperiment

## Run clustering on a single object

By default, clustering is run at ten different resolutions between 0.2 and 2.0. Any resolution can be specified by providing the `resolution` argument as a numeric vector.

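As a minimal sketch, a grid of ten resolutions spanning 0.2 to 2.0 can be written with `seq()`, and a custom set can be supplied as a numeric vector. This assumes the default grid is evenly spaced and that `sce_process()` accepts a `resolution` argument as described above; the values in `custom_resolutions` are arbitrary and chosen only for illustration. The call itself is guarded so the snippet still runs where chevreulProcess is not installed.

``` r
# Ten evenly spaced resolutions from 0.2 to 2.0
# (assumption: the default grid is evenly spaced)
default_resolutions <- seq(0.2, 2.0, by = 0.2)

# An arbitrary custom set of resolutions, for illustration only
custom_resolutions <- c(0.4, 0.8, 1.2)

if (requireNamespace("chevreulProcess", quietly = TRUE)) {
    library(chevreulProcess)
    data("small_example_dataset")
    # Assumption: `resolution` takes a numeric vector, per the text above
    clustered_sce <- sce_process(
        small_example_dataset,
        experiment_name = "sce_hu_trans",
        organism = "human",
        resolution = custom_resolutions
    )
}
```

Passing fewer resolutions than the default grid should shorten the clustering step, which can help when exploring a new dataset.
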
To process the bundled example dataset with the default settings:

``` r
data("small_example_dataset")
clustered_sce <- sce_process(small_example_dataset,
    experiment_name = "sce_hu_trans",
    organism = "human"
)
```

Chevreul includes tools for:

- Louvain clustering at a range of resolutions
- Dimensionality reduction of raw sequencing data
- Integration (batch correction) of multiple datasets

### Troubleshooting installation

#### Dependency management

When installing an R package with many dependencies, such as Chevreul, conflicts with existing installations can arise. This is a common issue in R package management. Here are some strategies to address it:

1. Consider renv for dependency management. This tool creates isolated environments for each project, ensuring that package versions don’t conflict across different projects.
2. Use the conflicted package. It provides an alternative conflict resolution strategy: it makes every conflict an error, forcing you to choose which function to use.

#### Slow internet connection

When installing R packages over a slow internet connection, downloads can fail or time out, particularly with larger packages or when using functions like `remotes::install_github()`. Here are some strategies to address bandwidth-related problems:

- Set a longer timeout for downloads: `options(timeout = 9999999)`
- Specify the download method: `options(download.file.method = "libcurl")`

## Transcript-level quantification

For transcript-level analysis, users must add transcript-level data to the SingleCellExperiment object as an alternative experiment before starting the Chevreul processing pipeline. Without this step, exploration at the transcript level is not available.

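One way to attach transcript-level counts is through the `altExp()` accessor from SingleCellExperiment. The sketch below uses small random matrices in place of real quantification output. The alternative experiment name `"transcript"` and the feature names are assumptions made for illustration, so check which name the Chevreul pipeline expects before relying on it.

``` r
library(SingleCellExperiment)

set.seed(1)
cells <- paste0("cell", 1:5)

# Toy gene-level counts (3 genes x 5 cells), standing in for real data
gene_counts <- matrix(rpois(15, lambda = 5), nrow = 3,
    dimnames = list(paste0("gene", 1:3), cells))

# Toy transcript-level counts (6 transcripts x 5 cells) for the same cells
tx_counts <- matrix(rpois(30, lambda = 2), nrow = 6,
    dimnames = list(paste0("tx", 1:6), cells))

sce <- SingleCellExperiment(assays = list(counts = gene_counts))
tx_sce <- SingleCellExperiment(assays = list(counts = tx_counts))

# Store the transcripts as an alternative experiment; the cells (columns)
# must match those of the main experiment.
# Assumption: "transcript" is a hypothetical name chosen for this example.
altExp(sce, "transcript") <- tx_sce

altExpNames(sce)
dim(altExp(sce, "transcript"))
```

An object built this way could then be passed to `sce_process()` in the same manner as the gene-level example.
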
Transcripts may be quantified using any of several available methods, including alignment-free methods best used with well-annotated transcriptomes (Salmon, Kallisto), alignment-based methods best used to detect novel isoforms (StringTie2), or long-read methods for use with long-read sequencing data (IsoQuant).

## Integration implementation

The `sce_integrate()` function in Chevreul implements integration (batch correction) of scRNA-seq datasets using the batchelor package. It accepts a list of SingleCellExperiment objects as input and stores the corresponding batch information in a metadata field named ‘batch’. By default, it uses batchelor’s `correctExperiments()` function, which preserves pre-existing data structures and metadata from the input SingleCellExperiment objects in the integrated output.

## Hardware requirements

Recommended minimum hardware requirements for running Chevreul are as follows:

- RAM: A minimum of 16 GB RAM is recommended for initial analysis. For larger datasets or more complex analyses, 64 GB or more is advisable.
- CPU: Multiple cores are beneficial for parallel processing.
- Storage: Sufficient storage space is necessary, especially for temporary files. The exact amount depends on the size of your datasets.
- R version: Chevreul requires R version 4.4 or greater.

These requirements vary with the size and complexity of your dataset. As the number of cells increases, so do the hardware requirements. For instance, a dataset with around 8,000 cells can be analyzed with 8 GB of RAM, while larger datasets or more complex analyses can benefit from 64-128 GB of RAM.

## Learn More

To learn more about the usage of Bioconductor tools for single-cell RNA-seq analysis, consult the book Orchestrating Single-Cell Analysis with Bioconductor. The book walks through common workflows for the analysis of single-cell RNA-seq (scRNA-seq) data.
This book will show you how to make use of cutting-edge Bioconductor tools to process, analyze, visualize, and explore scRNA-seq data.