Get Comparison Result between Signature Groups — get_group_comparison
Compare genotypes/phenotypes across signature groups (samples are assigned to several groups). For categorical variables, compute a count table and a Fisher's exact test p value (using stats::fisher.test); for tables larger than 2 by 2, the p value is computed by Monte Carlo simulation. For continuous variables, compute an ANOVA p value (using stats::aov), a summary table, and Tukey's Honest Significant Difference (using stats::TukeyHSD). The result of this function can be plotted with [show_group_comparison()](show%5Fgroup%5Fcomparison.html).
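The tests named above can be tried directly in base R. The following is only a minimal sketch of the underlying statistics, not the internal code of get_group_comparison(); the data frame `toy` and its columns `group`, `status` and `score` are hypothetical and invented for illustration.

```r
# Hypothetical toy data: 3 groups, one categorical and one continuous variable
set.seed(42)
toy <- data.frame(
  group = factor(rep(c("1", "2", "3"), each = 10)),
  status = rep(c("A", "B", "C"), times = 10),
  score = rnorm(30)
)

# Categorical variable: count table plus Fisher's exact test.
tab <- table(toy$group, toy$status)
# The table is 3 x 3 (larger than 2 x 2), so the p value is
# estimated by Monte Carlo simulation.
fisher_res <- stats::fisher.test(tab, simulate.p.value = TRUE)

# Continuous variable: one-way ANOVA followed by Tukey's HSD.
fit <- stats::aov(score ~ group, data = toy)
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]
tukey <- stats::TukeyHSD(fit)

print(tab)
print(fisher_res$p.value)
print(anova_p)
print(tukey$group) # one row per pair of groups
```

With three groups, `tukey$group` holds one row for each of the three pairwise comparisons, with the adjusted p value in the `p adj` column.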
get_group_comparison(
data,
col_group,
cols_to_compare,
type = "ca",
NAs = NA,
verbose = FALSE
)
Arguments
data
a data.frame containing signature groups and the genotypes/phenotypes (both categorical and continuous data) to analyze. Users need to construct this data.frame themselves.
col_group
column name of signature groups.
cols_to_compare
column names of the genotypes/phenotypes to summarize by group.
type
a character vector with the same length as cols_to_compare: 'ca' for categorical type and 'co' for continuous type.
NAs
default is NA, which filters out NAs in categorical columns. Otherwise, a value (of length 1 or the same length as cols_to_compare) used to fill NAs.
verbose
if TRUE, print extra information.
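The two ways of handling missing values in a categorical column (filtering versus filling) can be pictured with plain base R. This is only an illustration of the idea under the assumption that filling means replacing each NA with the supplied label; it is not the package's own implementation, and the vector `x` is hypothetical.

```r
# Hypothetical categorical column with one missing value
x <- c("1", "2", NA, "4")

# Filtering: drop the missing entries
x_filtered <- x[!is.na(x)]
print(x_filtered) # "1" "2" "4"

# Filling: replace missing entries with a label such as "Rest"
x_filled <- ifelse(is.na(x), "Rest", x)
print(x_filled) # "1" "2" "Rest" "4"
```

Filling keeps every sample in the count table at the cost of adding an extra category, which changes the dimensions of the table passed to the test.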
Value
a list containing data, summary, p value, etc.
Examples
# \donttest{
load(system.file("extdata", "toy_copynumber_signature_by_W.RData",
  package = "sigminer", mustWork = TRUE
))
# Assign samples to clusters
groups <- get_groups(sig, method = "k-means")
#> ℹ [2024-08-04 14:38:52.918196]: Started.
#> ✔ [2024-08-04 14:38:52.919734]: 'Signature' object detected.
#> ℹ [2024-08-04 14:38:52.924756]: Running k-means with 2 clusters...
#> ℹ [2024-08-04 14:38:52.928183]: Generating a table of group and signature contribution (stored in 'map_table' attr):
#> Sig1 Sig2
#> 1 0.2097559 0.7901116
#> 2 0.8964984 0.1035016
#> ℹ [2024-08-04 14:38:52.929644]: Assigning a group to a signature with the maximum fraction...
#> ℹ [2024-08-04 14:38:52.932974]: Summarizing...
#> group #1: 2 samples with Sig2 enriched.
#> group #2: 8 samples with Sig1 enriched.
#> ! [2024-08-04 14:38:52.934935]: The 'enrich_sig' column is set to dominant signature in one group, please check and make it consistent with biological meaning (correct it by hand if necessary).
#> ℹ [2024-08-04 14:38:52.936428]: 0.018 secs elapsed.
set.seed(1234)
groups$prob <- rnorm(10)
groups$new_group <- sample(c("1", "2", "3", "4", NA), size = nrow(groups), replace = TRUE)
# Compare groups (filter NAs for categorical columns)
groups.cmp <- get_group_comparison(groups[, -1],
  col_group = "group",
  cols_to_compare = c("prob", "new_group"),
  type = c("co", "ca"), verbose = TRUE
)
#> Treat prob as continuous variable.
#> Treat new_group as categorical variable.
# Compare groups (Set NAs of categorical columns to 'Rest')
groups.cmp2 <- get_group_comparison(groups[, -1],
  col_group = "group",
  cols_to_compare = c("prob", "new_group"),
  type = c("co", "ca"), NAs = "Rest", verbose = TRUE
)
#> Treat prob as continuous variable.
#> Treat new_group as categorical variable.
# }