fix: fixed an issue of a solitary NA as var_labels passed to analyze by kpagacz · Pull Request #965 · insightsengineering/rtables (original) (raw)
Here's an MRE of the issue:
library(rtables)
devtools::load_all()
lyt <- basic_table() %>% split_cols_by("ARM") %>% analyze("AGE", var_labels = c(AGE = NA))
which results in:
Error in validObject(.Object) :
invalid class “AnalyzeVarSplit” object: invalid object for slot "split_l
abel" in class "AnalyzeVarSplit": got class "logical", should be or extend
class "character"
R treats a single NA as a logical vector, which fails the validation on the Split object, which requires the labels slot to be a character vector. This issue does not happen when the var_labels array contains other character values. The untyped NAs are cast to characters automatically in this case.
This also does not happen in this scenario:
library(rtables)
devtools::load_all()
lyt <- basic_table() %>% split_cols_by("ARM") %>% analyze("AGE", var_labels = c(AGE = NA_character_))
What prompted this is the investigation of what happens when launching the below teal application:
require(teal.modules.general) require(teal.modules.clinical) data <- teal_data() data <- within(data, { ADSL <- teal.modules.general::rADSL
Remove the label attribute to simulate a dataset without labels
attributes(ADSL$SEX)$label <- NULL ADSL$EOSDY[1] <- NA_integer_ }) join_keys(data) <- default_cdisc_join_keys["ADSL"]
ADSL <- data[["ADSL"]]
app <- init( data = data, modules = modules( teal.modules.clinical::tm_t_summary( label = "Demographic Table", dataname = "ADSL", arm_var = choices_selected(c("ARM", "ARMCD"), "ARM"), add_total = TRUE, summarize_vars = choices_selected( c("SEX", "RACE", "BMRKR2", "EOSDY", "DCSREAS", "AGE"), c("SEX") ), useNA = "ifany" ) ) ) if (interactive()) { shinyApp(app$ui, app$server) }
This application errors out with the above rtables error, because the application tries to evaluate the below:
library(nvimcom) library(ggplot2) library(ggmosaic) library(shiny) library(teal.code) library(teal.data) library(teal.slice) library(teal) library(teal.transform) library(teal.modules.general) library(formatters) library(magrittr) library(rtables) library(tern) library(teal.modules.clinical)
ADSL <- teal.modules.general::rADSL attributes(ADSL$SEX)$label <- NULL ADSL$EOSDY[1] <- NA
stopifnot(rlang::hash(ADSL) == "fdef4b3486c80ad054476ea2b4131889")
ANL_1 <- ADSL %>% dplyr::select(STUDYID, USUBJID, ARM, SEX) ANL <- ANL_1 ANL <- ANL %>% teal.data::col_relabel(ARM = "Description of Planned Arm") ANL_ADSL_1 <- ADSL %>% dplyr::select(STUDYID, USUBJID, ARM) ANL_ADSL <- ANL_ADSL_1 ANL_ADSL <- ANL_ADSL %>% teal.data::col_relabel(ARM = "Description of Planned Arm") anl <- ANL %>% df_explicit_na( omit_columns = setdiff(names(ANL), c(structure(c(SEX = "SEX"), dataname = "ADSL", always_selected = character(0)))), na_level = "" ) anl <- anl %>% dplyr::mutate(ARM = droplevels(ARM)) arm_levels <- levels(anl[["ARM"]]) ANL_ADSL <- ANL_ADSL %>% dplyr::filter(ARM %in% arm_levels) ANL_ADSL <- ANL_ADSL %>% dplyr::mutate(ARM = droplevels(ARM)) ANL_ADSL <- df_explicit_na(ANL_ADSL, na_level = "") lyt <- rtables::basic_table( main_footer = "n represents the number of unique subject IDs such that the variable has non-NA values." ) %>% rtables::split_cols_by("ARM", split_fun = drop_split_levels) %>% rtables::add_overall_col("All Patients") %>% rtables::add_colcounts() %>% analyze_vars( vars = structure(c(SEX = "SEX"), dataname = "ADSL", always_selected = character(0)), var_labels = c(SEX = NA), show_labels = "visible", na.rm = FALSE, na_str = "", denom = "N_col", .stats = c("n", "mean_sd", "mean_ci", "geom_mean", "median", "median_ci", "quantiles", "range", "count_fraction") ) result <- rtables::build_table(lyt = lyt, df = anl, alt_counts_df = ANL_ADSL) result
This fails due to solitary NA in the analyze_vars call, passed directly to rtables::analyze.
Unfortunately, trying to fix the issue on the teal.modules.clinical side is unsuccessful due to how substitute treats such solitary NAs (basically casting them from the explicit NA_character_ to normal NA). The fix could be made on the level of tern, but it seems silly not to go to the bottom, which is rtables::analyze, and deeper - AnalyzeMultiVars, etc...
Let me know if this makes sense for you or if you would like me to take a different route.