nipalsMCIA: Software to Compute Multi-Block Dimensionality Reduction

This package computes Multiple Co-Inertia Analysis (MCIA) on multi-block data using the Nonlinear Iterative Partial Least Squares (NIPALS) method.

Features include:

Citation

For more information on the methodology used in nipalsMCIA and to cite, please see

Installation

This package can be installed via Bioconductor:

if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager")

BiocManager::install("nipalsMCIA")

You can install the development version of nipalsMCIA from GitHub with:

install.packages("devtools")

devtools::install_github("Muunraker/nipalsMCIA", ref = "devel", force = TRUE, build_vignettes = TRUE)

Basic Example

The package currently includes one test dataset: data_blocks. This is a list of data frames containing observations of variables from three omics types (mRNA, proteins, and micro RNA) on 21 cell lines from the NCI60 cancer cell line panel. The data file also includes a metadata data frame, metadata_NCI60, containing the cancer type associated with each cell line.

# load the package and set a seed for reproducibility
library(nipalsMCIA)
set.seed(42)

data(NCI60) # import data as "data_blocks" and metadata as "metadata_NCI60"

# examine the data and metadata
summary(data_blocks)
#>       Length Class      Mode
#> mrna  12895  data.frame list
#> miRNA   537  data.frame list
#> prot   7016  data.frame list

head(metadata_NCI60)
#>            cancerType
#> CNS.SF_268        CNS
#> CNS.SF_295        CNS
#> CNS.SF_539        CNS
#> CNS.SNB_19        CNS
#> CNS.SNB_75        CNS
#> CNS.U251          CNS

table(metadata_NCI60)
#> cancerType
#>      CNS Leukemia Melanoma
#>        6        6        9

Note: this dataset is reproduced from the omicade4 package (Meng et al., 2014). This package assumes all input datasets are in sample-by-feature format, that is, rows are samples and columns are features.
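If a block is stored the other way around (features in rows), it needs to be transposed before use. The following is a minimal base R sketch on made-up toy data; the object names block_fs, block_sf, and blocks are hypothetical and not part of the package:

```r
# Toy block stored feature-by-sample: 4 features (rows) x 3 samples (columns)
set.seed(1)
block_fs <- as.data.frame(matrix(
  rnorm(12), nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
))

# Transpose so that rows are samples and columns are features
block_sf <- as.data.frame(t(block_fs))
dim(block_sf)  # 3 samples by 4 features

# Sanity check: every block in the list should share the same sample names
blocks <- list(mrna = block_sf, prot = block_sf[, 1:2])
same_samples <- all(vapply(
  blocks,
  function(b) identical(rownames(b), rownames(blocks[[1]])),
  logical(1)
))
same_samples
```

Checking that the sample names agree across blocks before running the analysis helps catch blocks that were transposed inconsistently.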

The main MCIA function, nipals_multiblock(), can be called on data_blocks and can optionally include metadata_NCI60 for plot coloring by cancer type:

# to convert data_blocks into an MAE object we provide the simple_mae() function
data_blocks_mae <- simple_mae(data_blocks, row_format = "sample",
                              colData = metadata_NCI60)

mcia_results <- nipals_multiblock(data_blocks_mae = data_blocks_mae,
                                  col_preproc_method = "colprofile",
                                  num_PCs = 10, tol = 1e-12,
                                  color_col = "cancerType")

Here num_PCs is the user-chosen dimension of the low-dimensional embedding of the data.
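To give some intuition for what a NIPALS-style iteration with a component count and a tolerance does, here is a minimal base R sketch for a single centered block (ordinary PCA via NIPALS with deflation). It is an illustration only and not the package's multi-block implementation; the function name nipals_pca and its max_iter argument are hypothetical, and reading tol as a stopping tolerance on the change in scores is an assumption made for this sketch.

```r
# Single-block NIPALS sketch (illustration only, not nipalsMCIA's algorithm)
nipals_pca <- function(X, num_PCs = 2, tol = 1e-12, max_iter = 5000) {
  X <- scale(X, center = TRUE, scale = FALSE)      # column-center the block
  scores <- matrix(0, nrow(X), num_PCs)
  loadings <- matrix(0, ncol(X), num_PCs)
  for (k in seq_len(num_PCs)) {
    t_old <- X[, which.max(colSums(X^2))]          # start from the largest column
    for (i in seq_len(max_iter)) {
      p <- crossprod(X, t_old)                     # loadings: X' t
      p <- p / sqrt(sum(p^2))                      # normalize to unit length
      t_new <- X %*% p                             # scores: X p
      if (sum((t_new - t_old)^2) < tol) break      # stop once the scores stabilize
      t_old <- t_new
    }
    scores[, k] <- t_new
    loadings[, k] <- p
    X <- X - tcrossprod(t_new, p)                  # deflate: remove this component
  }
  list(scores = scores, loadings = loadings)
}

set.seed(42)
X <- matrix(rnorm(20 * 5), nrow = 20)              # toy data: 20 samples by 5 features
fit <- nipals_pca(X, num_PCs = 2)

# Compare with prcomp: loadings should agree up to sign
pca <- prcomp(X, center = TRUE, scale. = FALSE)
agreement <- abs(colSums(fit$loadings * pca$rotation[, 1:2]))
agreement
```

Each outer pass extracts one component and then deflates the data, so num_PCs controls how many passes are made, while tol controls how long each inner iteration runs.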