Increasing Survey Statistics Precision Using Split Questionnaire Design: An Application of Small Area Estimation

An Efficient Method for Estimating Population Parameters Using Split Questionnaire Design

Journal of Statistical Research of Iran

The effect of survey questionnaire length on the precision of survey statistics has been discussed in several studies. It is generally concluded that a lengthy questionnaire increases non-sampling errors, especially the nonresponse rate. The split questionnaire method, introduced as a solution to decrease response burden and the nonresponse rate, involves splitting the questionnaire into subquestionnaires and then administering these subquestionnaires to different subsets of the original sample. In this paper, we suggest a method for splitting a long questionnaire and analyzing the resulting data using small area estimation. The general idea behind this approach is to construct socio-demographic or geographic small areas and apply small area estimation to improve the efficiency of survey statistics. Our new approach is supported by a simulation study based on a real dataset from the 2011 Iran Income and Expenditure Survey, in which we show that our method provides more reliable statistics than existing methods.

Design and Estimation for Split Questionnaire Surveys

2008

When sampling from a finite population to estimate the means or totals of K population characteristics of interest, survey designs typically impose the constraint that information on all K characteristics (or data items) is collected from all units in the sample. Relaxing this constraint means that information on a subset of the K data items may be collected from any given unit in the sample. Such a design, called a split questionnaire design (SQD), has three advantages over the typical design: increased efficiency with which design objectives can be met, by allowing the number of sample units from which information on a particular data item is collected to vary; improved efficiency in estimation through exploiting the correlation between the K data items; and flexibility to restrict the maximum number of data items collected from a unit to be less than K. An SQD can be viewed as designing for the missing pattern of data. In the simple case of two variables, this paper considers estimators, including the Best Linear Unbiased Estimator (BLUE), for an SQD. The results show that significant gains can be achieved. The size of the gains from an SQD depends on the function describing the survey costs, the design constraints, and the covariance matrix of the data items of interest. These methods are evaluated in a simulation study with four data items.
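As a rough illustration of how correlation between data items can be exploited when one item is collected from only part of the sample, the sketch below uses the classical regression (double-sampling) estimator of a mean. This is not the BLUE considered in the paper; the function name `regression_estimate_mean_y`, the simulated data, and the subsample size are assumptions made purely for illustration.

```python
import numpy as np


def regression_estimate_mean_y(x_all, y_sub, sub_idx):
    """Estimate the mean of y when x is observed for every sampled unit
    but y is observed only for the units indexed by sub_idx.

    Illustrative regression estimator, not the paper's BLUE.
    """
    x_all = np.asarray(x_all, dtype=float)
    y_sub = np.asarray(y_sub, dtype=float)
    x_sub = x_all[sub_idx]
    # Slope of y on x, estimated from the units where both items were collected.
    slope = np.cov(x_sub, y_sub, ddof=1)[0, 1] / np.var(x_sub, ddof=1)
    # Adjust the subsample mean of y using the extra information on x.
    return y_sub.mean() + slope * (x_all.mean() - x_sub.mean())


# Hypothetical data: two correlated items, y collected from 100 of 400 units.
rng = np.random.default_rng(1)
x = rng.normal(size=400)
y = 2.0 * x + rng.normal(scale=0.5, size=400)
sub_idx = rng.choice(400, size=100, replace=False)

estimate = regression_estimate_mean_y(x, y[sub_idx], sub_idx)
naive = y[sub_idx].mean()
print(f"regression estimate: {estimate:.3f}, subsample mean: {naive:.3f}")
```

The stronger the correlation between the two items, the more the adjustment term reduces the variance relative to the plain subsample mean.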

Reduction of Measurement Error due to Survey Length: Evaluation of the Split Questionnaire Design Approach

Survey research methods, 2017

Long survey instruments can be taxing to respondents, which may result in greater measurement error. There is little empirical evidence on the relationship between length and measurement error, possibly leading to longer surveys than desirable. At least equally important is the need for methods to reduce survey length while meeting the survey’s objectives. This study tests the ability to reduce measurement error related to survey length through split questionnaire design, in which the survey is modularized and respondents are randomly assigned to receive subsets of the survey modules. The omitted questions are then multiply imputed for all respondents. The imputation variance, however, may overwhelm any benefits to survey estimates from the reduction of survey length. We use an experimental design to further evaluate the effect of survey length on measurement error and to examine the degree to which a split questionnaire design can yield estimates with less measurement error. We fou...

Split Questionnaire Design for Massive Surveys

Journal of Marketing Research, 2008

We start by describing the procedure that is used to generate the between-block designs. We assume that if there are N individuals and Q questions, then N/K individuals will be assigned randomly to each of the K splits. Each alternative split questionnaire design then consists of an N × Q matrix D with K different split patterns. Each entry in the matrix D is a 0 or 1, indicating whether a question is excluded (0) or included (1) in that particular split. In constructing between-block designs, we constrain all questions in one block to be assigned to the same respondent. That is, if we have five blocks with four questions each and one particular split at the block level is [11010], we will use d_ij = [1111 1111 0000 1111 0000] as a row in the design matrix D. The proposed procedure to construct split questionnaire designs operates as follows.
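The expansion from a block-level split to a row of the design matrix D, together with the random assignment of N/K individuals to each split, can be sketched as follows. This is a minimal sketch assuming equal-sized blocks; the function names `expand_block_split` and `build_design_matrix`, the second split pattern, and the seed are hypothetical and not taken from the paper.

```python
import numpy as np


def expand_block_split(block_pattern, questions_per_block):
    """Repeat each block-level 0/1 indicator once per question in the block."""
    return np.repeat(np.asarray(block_pattern, dtype=int), questions_per_block)


def build_design_matrix(block_patterns, questions_per_block, n_respondents, seed=0):
    """Return an N x Q 0/1 design matrix D and the split index of each respondent.

    Assumes N is divisible by the number of splits K, so that exactly N/K
    respondents are randomly assigned to each split.
    """
    patterns = np.array(
        [expand_block_split(p, questions_per_block) for p in block_patterns]
    )
    k = len(patterns)
    if n_respondents % k != 0:
        raise ValueError("n_respondents must be divisible by the number of splits")
    rng = np.random.default_rng(seed)
    # Shuffle a vector containing each split index exactly N/K times.
    assignment = rng.permutation(np.repeat(np.arange(k), n_respondents // k))
    return patterns[assignment], assignment


# The block-level split [11010] with five blocks of four questions each.
row = expand_block_split([1, 1, 0, 1, 0], 4)
print("".join(map(str, row)))

# Two hypothetical splits assigned to ten respondents.
D, assignment = build_design_matrix([[1, 1, 0, 1, 0], [1, 0, 1, 0, 1]], 4, 10)
print(D.shape)
```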

Split Questionnaire Designs: collecting only the data that you need through MCAR and MAR designs

Journal of Applied Statistics

We call a sample design that allows for different patterns, or sets, of data items to be collected from different sample units a Split Questionnaire Design (SQD). SQDs can be thought of as incorporating missing data into survey design. This paper examines the situation where data that are not collected by an SQD can be treated as Missing Completely At Random or Missing At Random, targets are regression coefficients in a generalised linear model fitted to binary variables, and targets are estimated using Maximum Likelihood. A key finding is that it can be easy to measure the relative contribution of a respondent to the accuracy of estimated model parameters before collecting all the respondent's model covariates. We show empirically and theoretically that we could achieve a significant reduction in respondent burden with a negligible impact on the accuracy of estimates by not collecting model covariates from respondents who we identify as contributing little to the accuracy of estimates. We discuss the general implications for SQDs.

The effect of questionnaire length on survey response

Quality and Quantity, 1992

Survey textbooks suggest that long questionnaires should be avoided, and a careful reading of the available empirical evidence confirms the negative effects of substantial length on both response rates and the quality of those responses which are obtained. Data are presented from a lengthy survey in Britain in 1987. Analysis of reasons for nonresponse to this survey suggests that length may indeed have been a significant disincentive to respond for many. However, no effect of length was found on item quality as measured by the number of responses given to open-ended questions. Unexpectedly, the variance in number of responses was greater when the questions were asked later in the questionnaire. The results are interpreted as resulting from the greater power that respondents gain as the survey proceeds.

On a Modular Approach to the Design of Integrated Social Surveys

Journal of Official Statistics, 2016

This article considers a modular approach to the design of integrated social surveys. The approach consists of grouping variables into ‘modules’, each of which is then allocated to one or more ‘instruments’. Each instrument is then administered to a random sample of population units, and each sample unit responds to all modules of the instrument. This approach offers a way of designing a system of integrated social surveys that balances the need to limit the cost and the need to obtain sufficient information. The allocation of the modules to instruments draws on the methodology of split questionnaire designs. The composition of the instruments, that is, how the modules are allocated to instruments, and the corresponding sample sizes are obtained as a solution to an optimisation problem. This optimisation involves minimisation of respondent burden and data collection cost, while respecting certain design constraints usually encountered in practice. These constraints may include, for ...

Three-Form Split Questionnaire Design for Panel Surveys

Journal of Official Statistics

Longitudinal or panel surveys are effective tools for measuring individual level changes in the outcome variables and their correlates. One drawback of these studies is dropout or nonresponse, potentially leading to biased results. One of the main reasons for dropout is the burden of repeatedly responding to long questionnaires. Advancements in survey administration methodology and multiple imputation software now make it possible for planned missing data designs to be implemented for improving the data quality through a reduction in survey length. Many papers have discussed implementing a planned missing data study using a split questionnaire design in the cross-sectional setting, but development of these designs in a longitudinal study has been limited. Using simulations and data from the Health and Retirement Study (HRS), we compare the performance of several methods for administering a split questionnaire design in the longitudinal setting. The results suggest that the optimal d...

4 The KISS Principle in Survey Design: Question Length and Data Quality

2016

Writings on the optimal length for survey questions are characterized by a variety of perspectives and very little empirical evidence. Where evidence exists, support seems to favor lengthy questions in some cases and shorter ones in others. However, on the basis of theories of the survey response process, the use of an excessive number of words may get in the way of the respondent’s comprehension of the information requested, and because of the cognitive burden of longer questions, there may be increased measurement errors. Results are reported from a study of reliability estimates for 426 (exactly replicated) survey questions in face-to-face interviews in six large-scale panel surveys conducted by the University of Michigan’s Survey Research Center. The findings suggest that, at least with respect to some types of survey questions, there are declining levels of reliability for questions with greater numbers of words and provide further support for the advice given to survey researc...

Chapter 19 Statistical analysis of survey data

The fact that survey data are obtained from units selected with complex sample designs needs to be taken into account in the survey analysis: weights need to be used in analyzing survey data and variances of survey estimates need to be computed in a manner that reflects the complex sample design. This chapter outlines the development of weights and their use in computing survey estimates and provides a general discussion of variance estimation for survey data. It deals first with what are termed "descriptive" estimates, such as the totals, means, and proportions that are widely used in survey reports. It then discusses three forms of "analytic" uses of survey data that can be used to examine relationships between survey variables, namely multiple linear regression models, logistic regression models and multi-level models. These models form a set of valuable tools for analyzing the relationships between a key response variable and a number of other factors. In this chapter we give examples to illustrate the use of these modeling techniques and also provide guidance on the interpretation of the results.
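For the "descriptive" estimates mentioned above, the role of weights can be shown with a short sketch of a weighted total and a weighted mean. The function names and the toy values are assumptions for illustration only; variance estimation, which must reflect the complex sample design, is deliberately not covered here.

```python
import numpy as np


def weighted_total(y, w):
    """Weighted estimate of a population total: sum of w_i * y_i."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * y))


def weighted_mean(y, w):
    """Weighted estimate of a population mean: weighted total over sum of weights."""
    w = np.asarray(w, dtype=float)
    return weighted_total(y, w) / float(np.sum(w))


# Hypothetical sample: three respondents with unequal survey weights.
y = [10.0, 20.0, 30.0]
w = [100.0, 50.0, 50.0]
print(weighted_total(y, w), weighted_mean(y, w))
```

Ignoring the weights here would give an unweighted mean of 20, which differs from the weighted estimate because the first respondent represents more population units.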