A study of compensatory techniques which address missing data problems

2004

Abstract

By default, standard software packages implement methods such as listwise deletion, which simply drops cases that have missing values. This study tested three techniques (multiple imputation, mean substitution, and listwise deletion) used to remedy problems associated with missing data. Data from the revised General Social Survey from 1993 were used in this study. Four variables were selected for inclusion: age, education, socioeconomic status, and number of hours of TV viewing. A total of 30 samples (10 each with a sample size of 50, 100, and 200) were randomly selected from the 1,500 cases in this database using SPSS for Windows, version 12.0. From these samples, additional samples were generated with 10%, 30%, and 50% of values randomly deleted using a random number generator. These data manipulations produced 40 samples for each sample size (the 10 original samples plus 10 at each of the three deletion levels). The compensatory techniques (listwise deletion, mean substitution, and multiple imputation) were applied to every sample, with summary statistics (e.g., means, standard deviations, medians, and minimum and maximum scores) generated for each sample. Regression analyses were completed for each of the samples, with number of TV hours used as the criterion variable, and age, education, and socioeconomic status used as predictor variables. Means and standard deviations of the R² values for each of the sample sizes and compensatory techniques were obtained to allow comparisons across the samples.

Mean ratios were computed for each missing value condition to determine the degree to which each technique effectively compensated for missing values. To determine the mean ratios for the mean scores, the mean of each sample was compared to the mean of its original sample. A similar method was used to compare the R² for each of the missing value conditions with the R² for the original sample. In addition, a 3 x 3 factorial analysis of variance was used to test for the main effects of compensatory technique and percent of missing values. Mean substitution appeared to produce estimates that most closely matched those of the original sample.
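
The deletion-and-compensation procedure described above can be illustrated with a minimal Python sketch. It is not the study's SPSS workflow: the data are synthetic rather than drawn from the General Social Survey, only a single variable is handled, multiple imputation and the regression step are omitted, and every function name (`delete_at_random`, `listwise_delete`, `mean_substitute`, `ratio_to_original`) is a hypothetical helper introduced for illustration. The ratio is assumed here to be the compensated estimate divided by the original-sample estimate.

```python
import random
import statistics


def delete_at_random(values, fraction, rng):
    """Return a copy of `values` with `fraction` of entries set to None."""
    n_missing = round(len(values) * fraction)
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def listwise_delete(values):
    """Drop every case whose value is missing."""
    return [v for v in values if v is not None]


def mean_substitute(values):
    """Replace each missing value with the mean of the observed values."""
    fill = statistics.fmean(listwise_delete(values))
    return [fill if v is None else v for v in values]


def ratio_to_original(estimate, original):
    """Assumed ratio: compensated estimate over original-sample estimate."""
    return estimate / original


rng = random.Random(0)
# Synthetic stand-in for one sample of 100 "TV hours" scores (not GSS data).
tv_hours = [rng.randint(0, 8) for _ in range(100)]
orig_mean = statistics.fmean(tv_hours)
orig_sd = statistics.stdev(tv_hours)

for fraction in (0.10, 0.30, 0.50):
    damaged = delete_at_random(tv_hours, fraction, rng)
    techniques = {
        "listwise deletion": listwise_delete(damaged),
        "mean substitution": mean_substitute(damaged),
    }
    for name, sample in techniques.items():
        mean_ratio = ratio_to_original(statistics.fmean(sample), orig_mean)
        sd_ratio = ratio_to_original(statistics.stdev(sample), orig_sd)
        print(f"{fraction:.0%} missing, {name}: "
              f"mean ratio {mean_ratio:.3f}, SD ratio {sd_ratio:.3f}")
```

A ratio near 1.0 indicates that the compensated sample reproduces the original-sample statistic closely. With only one variable, the two techniques necessarily yield the same mean, so the sketch also reports the standard deviation ratio, where they can differ.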
