Auxiliary parameter recovery experiment

In some situations it may be useful to track information other than the data returned by generate. The example demonstrated in this working file relates to recovering latent trait scores (also known as factor scores or abilities) from item response theory models. The focus here is on how various test characteristics and estimation methods affect the precision of the \(N\) recovered estimates. The difference between this and other simulations is that, in addition to analysing and tracking the data returned by generate, we also want to track information that was used to generate said data (i.e., the latent trait scores).

Define conditions

To keep this example simple, we will only investigate the effects of varying the number of items in the test and the population distribution of the latent traits. All other parameters will be sampled from internal generating functions in R.

library(SimDesign)
# SimFunctions()

Design <- createDesign(test_length = c(10, 20, 30), 
                       distribution = c('normal', 'bimodal'))
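
createDesign() fully crosses the supplied factors, so Design contains \(3 \times 2 = 6\) rows, one per simulation condition. Printing the object should give something like the following:

Design
# A tibble: 6 × 2
  test_length distribution
        <dbl> <chr>
1          10 normal
2          20 normal
3          30 normal
4          10 bimodal
5          20 bimodal
6          30 bimodal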

Define the functions

As usual, define the functions of interest. Here we make sure that mirt is loaded when required.

Generate <- function(condition, fixed_objects = NULL) {
    Attach(condition) # attaches condition names for direct reference
    
    # notice that test_length can be referenced directly because of Attach()
    a <- matrix(rlnorm(test_length, .2, .3)) 
    d <- matrix(rnorm(test_length, 0, .5))
    N <- 300
    
    if(distribution == 'normal'){
        Theta <- matrix(scale(rnorm(N)))
    } else if(distribution == 'bimodal'){
        Theta <- scale(c(rnorm(N/2, 1, 0.5), rnorm(N/2, -1, 0.5)))
    }
    dat <- simdata(a=a, d=d, Theta=Theta, itemtype = 'dich')
    
    ret <- list(dat=dat, parameters=list(Theta=Theta, a=a, d=d))
    ret
}
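
When building up a simulation like this it can help to test each definition interactively before calling runSimulation(). For example, the generate step can be run on a single design row (a quick sketch; mirt is loaded manually here because simdata() belongs to that package):

library(mirt)
set.seed(42)
sim <- Generate(Design[1, ])
str(sim$dat)        # 300 x 10 matrix of dichotomous responses
str(sim$parameters) # the Theta, a, and d values used to build the data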

Analyse <- function(condition, dat, fixed_objects = NULL) {
    
    bimodal_prior <- function(Theta, ...) as.numeric(dnorm(Theta, -1, 0.5) + dnorm(Theta, 1, 0.5))
    
    # extract
    data <- dat$dat
    parameters <- dat$parameters
    
    # assign population values for complete model
    sv <- mirt(data, 1, pars = 'values')
    sv$value[sv$name == 'a1'] <- parameters$a
    sv$value[sv$name == 'd'] <- parameters$d
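    # TOL = NaN skips the estimation cycles entirely, so the model simply
    # retains the supplied population parameter values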
    mod <- mirt(data, 1, pars = sv, TOL = NaN)
    
    EAPscore <- fscores(mod, full.scores=TRUE)
    MAPscore <- fscores(mod, method = 'MAP', full.scores=TRUE)
    MLscore <- fscores(mod, method = 'ML', full.scores=TRUE)
    EAPscore_bimodal <- fscores(mod, method = 'EAP', full.scores=TRUE, custom_den = bimodal_prior)
    
    # return list (also with true Theta's)
    ret <- list(Theta=parameters$Theta, EAP=EAPscore, MAP=MAPscore, 
                ML=MLscore, EAP_bimodal=EAPscore_bimodal)
    ret
}
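
The analyse step can be checked in the same way by passing the object returned from the previous Generate() call (again just a sketch for interactive testing; runSimulation() handles package loading itself via the packages argument used below):

result <- Analyse(Design[1, ], dat = sim)
str(result) # list containing Theta plus the EAP, MAP, ML, and EAP_bimodal estimates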

Summarise <- function(condition, results, fixed_objects = NULL) {
    
    # for ease of calculations, collect the differences between the estimated
    # and population latent traits across all replications
    diff_EAP <- do.call(c, lapply(results, function(r) r$EAP - r$Theta))
    diff_MAP <- do.call(c, lapply(results, function(r) r$MAP - r$Theta))
    diff_ML <- do.call(c, lapply(results, function(r) r$ML - r$Theta))
    # ML estimates are +/- Inf for all-correct or all-incorrect response
    # patterns, so drop the non-finite values
    diff_ML <- diff_ML[is.finite(diff_ML)]
    diff_EAP_bimodal <- do.call(c, lapply(results, function(r) r$EAP_bimodal - r$Theta))
    
    bias_EAP <- bias(diff_EAP)
    bias_MAP <- bias(diff_MAP)
    bias_ML <- bias(diff_ML)
    bias_EAP_bimodal <- bias(diff_EAP_bimodal)
    RMSE_EAP <- RMSE(diff_EAP)
    RMSE_MAP <- RMSE(diff_MAP)
    RMSE_ML <- RMSE(diff_ML)
    RMSE_EAP_bimodal <- RMSE(diff_EAP_bimodal)
    
    ret <- c(bias_EAP=bias_EAP, bias_MAP=bias_MAP, bias_ML=bias_ML, bias_EAP_bimodal=bias_EAP_bimodal,
             RMSE_EAP=RMSE_EAP, RMSE_MAP=RMSE_MAP, RMSE_ML=RMSE_ML, RMSE_EAP_bimodal=RMSE_EAP_bimodal)
    ret
}

The difference in this simulation compared to the previous approaches is that generate and analyse return lists rather than the usual data-based structures or vectors. This allows more general objects to be returned if need be (lists can contain any number of elements), though it comes at a small cost: list objects cannot easily be simplified a priori, so when the summarise function is finally called the results object will be a list with the same length as the number of replications.
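
For instance, within the summarise function each element of results is exactly the list returned by one call to analyse, so a single replication can be inspected directly (illustrative only):

# inside Summarise: one element per replication
first <- results[[1]]
names(first)   # "Theta" "EAP" "MAP" "ML" "EAP_bimodal"
dim(first$EAP) # 300 x 1 matrix of EAP estimates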

Another slightly different approach appears in the summarise function: instead of supplying the observed and population values separately to bias() and RMSE(), a single object containing the difference between the observed and population values is supplied. These functions recognize that when only one input is given the values are already in deviation form, and compute the respective statistics accordingly. Sometimes this is easier than supplying each element to the functions individually.
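
As a toy illustration of this shorthand (the numbers below are arbitrary):

est <- c(0.10, -0.20, 0.15)
pop <- c(0.05, -0.10, 0.20)
bias(est, pop)  # two-argument form
bias(est - pop) # equivalent one-argument (deviation) form
RMSE(est - pop) # same shorthand for the root mean-square error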

Run the simulation

The following function call should be familiar to you by now.

res <- runSimulation(Design, replications = 5, parallel = FALSE, packages = 'mirt',
                     generate=Generate, analyse=Analyse, summarise=Summarise)


Design: 1/6;   Replications: 5;   RAM Used: 150.7 Mb;   Total Time: 0.00s 
 Conditions: test_length=10, distribution=normal


Design: 2/6;   Replications: 5;   RAM Used: 158.8 Mb;   Total Time: 3.70s 
 Conditions: test_length=20, distribution=normal


Design: 3/6;   Replications: 5;   RAM Used: 158.9 Mb;   Total Time: 8.97s 
 Conditions: test_length=30, distribution=normal


Design: 4/6;   Replications: 5;   RAM Used: 159 Mb;   Total Time: 15.20s 
 Conditions: test_length=10, distribution=bimodal


Design: 5/6;   Replications: 5;   RAM Used: 159 Mb;   Total Time: 18.26s 
 Conditions: test_length=20, distribution=bimodal


Design: 6/6;   Replications: 5;   RAM Used: 159.1 Mb;   Total Time: 23.85s 
 Conditions: test_length=30, distribution=bimodal

Simulation complete. Total execution time: 30.16s
# A tibble: 6 × 15
  test_length distribution   bias_EAP   bias_MAP    bias_ML bias_EAP_bimodal
        <dbl> <chr>             <dbl>      <dbl>      <dbl>            <dbl>
1          10 normal       -0.0015501 -0.0013642  0.014744        -0.0045993
2          20 normal        0.0093989  0.0092882  0.011625         0.0085493
3          30 normal       -0.0063332 -0.0052552 -0.0069644       -0.0056860
4          10 bimodal       0.019335   0.024511   0.036004         0.013520 
5          20 bimodal      -0.0088995 -0.011185  -0.0090255       -0.0050346
6          30 bimodal       0.026843   0.026776   0.028189         0.022670 
# ℹ 9 more variables: RMSE_EAP <dbl>, RMSE_MAP <dbl>, RMSE_ML <dbl>,
#   RMSE_EAP_bimodal <dbl>, REPLICATIONS <dbl>, SIM_TIME <chr>, RAM_USED <chr>,
#   SEED <int>, COMPLETED <chr>

Analyze the results

When the \(\theta\) estimates were generated from a normal distribution, all latent trait estimates appeared fairly unbiased across the random \(\theta\) sets; however, ML estimation generally produced the highest RMSE (which is reasonable, particularly when the Bayesian estimates use a well-chosen prior). As the test length increased, all estimators recovered the population parameters with greater precision, which makes sense given that longer tests provide more information about an individual's ability.

When the distribution of the \(\theta\)'s was bimodal the same pattern occurred; however, the overall RMSE for the default EAP estimates was slightly higher than when the correct prior was supplied, because the default prior in mirt, \(N(0,1)\), no longer matches the population. ML estimates were largely unaffected by altering the distribution, mainly because they are not influenced by prior beliefs. Finally, when the prior distribution matched the population generating model, the EAP estimates demonstrated the lowest RMSEs.

Note that this analysis concerned population effects across all sampled ability levels. In general, these estimators may perform differently for specific abilities (e.g., ML estimation may do much worse as \(|\theta|\) becomes larger) and when different priors are used. Depending on the focus of the Monte Carlo study, this topic may be more important than overall bias and RMSE (see the mirtCAT package and the associated JSS paper for further details on this aspect of IRT models).
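
As a sketch of how such a conditional analysis could look, the deviations can be binned by the true ability level before computing the RMSE (the helper below is hypothetical and is not part of the simulation above; the bin boundaries are arbitrary):

# sketch: RMSE conditional on the true ability range
conditional_RMSE <- function(Theta, est, breaks = c(-Inf, -1, 0, 1, Inf)){
    bins <- cut(Theta, breaks = breaks)
    tapply(est - Theta, bins, RMSE) # one-argument (deviation) form of RMSE
}

# e.g., applied to the Theta/ML pairs stored in a single replication
# conditional_RMSE(result$Theta, result$ML)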