Load in NCBI GEO data formatted as a single file per sample: Read_GEO_Delim
Can read delimited file types (e.g. csv, tsv, txt).
Read_GEO_Delim(
  data_dir,
  file_suffix,
  move_genes_rownames = TRUE,
  sample_list = NULL,
  full_names = FALSE,
  sample_names = NULL,
  barcode_suffix_period = FALSE,
  parallel = FALSE,
  num_cores = NULL,
  merge = FALSE
)
Arguments
data_dir
Directory containing the files.
file_suffix
The file suffix of the individual files. Must be the same across all files being imported. This is used to detect files to import and their GEO IDs.
move_genes_rownames
logical (default TRUE). Whether gene IDs are present in the first column of the delimited file rather than in its row names. If TRUE the first column is moved to row names before the final matrix is created.
sample_list
a vector of samples within directory to read in (can be either with or without file_suffix; see full_names). If NULL will read in all files in data_dir that match file_suffix.
full_names
logical (default FALSE). Whether or not the sample_list vector includes the file suffix. If FALSE the function will add the suffix based on the file_suffix parameter.
sample_names
a set of sample names to use for each sample entry in the returned list. If NULL will set names to the directory name of each sample.
barcode_suffix_period
logical (default FALSE). Whether the barcode suffix is separated by a period that should be changed to "-". If FALSE barcodes are left identical to their format in the input files. If TRUE the "." in the barcode suffix is changed to "-".
parallel
logical (default FALSE). Whether to use multiple cores when reading in data. Only possible on Linux based systems.
num_cores
if parallel = TRUE indicates the number of cores to use for multicore processing.
merge
logical (default FALSE). Whether to merge samples into a single matrix or return a list of matrices. If TRUE each sample entry in the list will have a cell barcode prefix added. The prefix will be taken from sample_names.
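The interaction between file_suffix, sample_list, full_names and barcode_suffix_period can be illustrated with a minimal base R sketch. This is not the package's implementation: resolve_sample_files and fix_barcode_suffix are hypothetical helpers, the GSM IDs and the "_counts.tsv" suffix are made up, and the barcode fix assumes a numeric suffix such as ".1".

```r
# Hypothetical helper (not part of the package): decide which files to read.
resolve_sample_files <- function(data_dir, file_suffix,
                                 sample_list = NULL, full_names = FALSE) {
  if (is.null(sample_list)) {
    # No explicit list: take every file whose name ends with the suffix.
    files <- list.files(data_dir)
    files <- files[endsWith(files, file_suffix)]
  } else if (full_names) {
    # Entries already include the suffix.
    files <- sample_list
  } else {
    # Entries are bare sample IDs, so append the suffix.
    files <- paste0(sample_list, file_suffix)
  }
  # Name each path by the file name with the suffix stripped (assumed GEO ID).
  ids <- substr(files, 1, nchar(files) - nchar(file_suffix))
  stats::setNames(file.path(data_dir, files), ids)
}

# Hypothetical helper: turn a trailing ".1" barcode suffix into "-1".
fix_barcode_suffix <- function(barcodes) {
  sub("\\.([0-9]+)$", "-\\1", barcodes)
}

# Set up a throwaway directory with two matching files and one unrelated file.
demo_dir <- tempfile("geo_demo_")
dir.create(demo_dir)
file.create(file.path(
  demo_dir,
  c("GSM0000001_counts.tsv", "GSM0000002_counts.tsv", "notes.md")
))

all_files <- resolve_sample_files(demo_dir, "_counts.tsv")
one_file <- resolve_sample_files(demo_dir, "_counts.tsv",
                                 sample_list = "GSM0000002")
fixed <- fix_barcode_suffix(c("AAACCTGAGAAACCAT.1", "AAACCTGAGAAACCGC.1"))
```

A period in the barcode suffix typically appears when R sanitizes column names on import and replaces "-" with ".", which is the case barcode_suffix_period = TRUE is meant to undo.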
Value
List of gene x cell matrices named by sample name, or a single merged gene x cell matrix if merge = TRUE.
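The difference between the list and merged forms of the return value can be sketched in base R as follows. The matrices, sample names and the "_" separator between prefix and barcode are assumptions made for illustration, and the sketch assumes all samples share the same genes in the same order.

```r
# Two toy gene x cell matrices sharing the same genes (made-up values).
genes <- c("GeneA", "GeneB")
s1 <- matrix(c(1, 0, 2, 3), nrow = 2,
             dimnames = list(genes, c("AAAC-1", "AAAG-1")))
s2 <- matrix(c(0, 5, 4, 1), nrow = 2,
             dimnames = list(genes, c("AAAC-1", "AAAT-1")))

# Unmerged form: a list of matrices named by sample.
matrix_list <- list(GSM0000001 = s1, GSM0000002 = s2)

# Prefix each barcode with its sample name so identical barcodes
# from different samples stay distinct. The "_" separator is an assumption.
prefixed <- Map(function(mat, sample_name) {
  colnames(mat) <- paste(sample_name, colnames(mat), sep = "_")
  mat
}, matrix_list, names(matrix_list))

# Merged form: column-bind into one gene x cell matrix.
merged <- do.call(cbind, unname(prefixed))
```

Without the prefix, the barcode "AAAC-1" would appear twice in the merged column names, which is why the prefix is added before merging.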
Examples
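if (FALSE) {
# file_suffix has no default and must be supplied.
# ".tsv.gz" is a placeholder; use the suffix shared by your files.
data_dir <- 'path/to/data/directory'
expression_matrices <- Read_GEO_Delim(data_dir = data_dir, file_suffix = ".tsv.gz")
}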
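To see the expected input layout without downloading anything from GEO, the following self-contained base R sketch writes two tiny tab-delimited files and reads them back into a named list of gene x cell matrices. It only mimics the documented behaviour of move_genes_rownames = TRUE and is not the function's actual code; the file names, the "_counts.tsv" suffix and the read_one helper are hypothetical.

```r
# Create a temporary directory with one delimited file per sample.
demo_dir <- tempfile("geo_delim_")
dir.create(demo_dir)

# Toy counts with gene IDs in the first column (made-up values).
toy <- data.frame(gene = c("GeneA", "GeneB"),
                  cell1 = c(1, 0),
                  cell2 = c(2, 3))
for (id in c("GSM0000001", "GSM0000002")) {
  write.table(toy, file.path(demo_dir, paste0(id, "_counts.tsv")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}

# Hypothetical helper: read one file and move the gene column to row names.
read_one <- function(path) {
  df <- utils::read.delim(path, header = TRUE, check.names = FALSE)
  rownames(df) <- df[[1]]
  as.matrix(df[, -1, drop = FALSE])
}

# Detect files by suffix and name each matrix by the part before the suffix.
files <- list.files(demo_dir, pattern = "_counts\\.tsv$", full.names = TRUE)
matrices <- lapply(files, read_one)
names(matrices) <- sub("_counts\\.tsv$", "", basename(files))
```

Each element of matrices then has genes as row names and cells as column names, matching the gene x cell orientation described under Value.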