Kim Chantala - Academia.edu
Papers by Kim Chantala
Fertility and Sterility, 2015
The Add Health Study is a nationally representative, probability-based survey of adolescents in grades 7 through 12 conducted between 1994 and 1996. The sample design used to collect the data has introduced a complexity to analysis. Failing to account for this complexity may result in biased parameter estimates and incorrect variance estimates. Hence, you must correct for design effects and unequal probability of selection to ensure that your results are nationally representative with unbiased estimates. Specialized, “user-friendly” statistical software is now available for analyzing data from complex surveys. SUDAAN and STATA are two examples of this type of software. Using both SUDAAN and STATA, we show you how to incorporate characteristics of the sample design into an analysis so that your estimates and standard errors are unbiased. We will first present a simplified description of the Add Health sampling process including a description of the sample attributes and data elements...
Research in estimating multilevel models (MLM) from complex survey data is quite recent. Not only has this research resulted in several popular software packages incorporating sampling weights into estimating MLM, but it has emphasized an important point often overlooked by both analysts and providers of the survey data: the sampling weights used for multilevel analysis need to be constructed differently than the sampling weights used for single-level analysis. Both the distributors of data and the developers of MLM software packages often leave the user responsible for proper scaling of MLM sampling weights. In addition, the method of scaling can be different for the various MLM software packages. To address this problem, we created Stata and SAS programs for constructing sampling weights for estimating two-level models that can be used with several popular multilevel software packages (i.e. gllamm (Stata), LISREL, MLwiN, Mplus). These programs can be downloaded from our website: h...
Background: Human papillomavirus (HPV) vaccine is ...
Investigating certain research questions may involve linking the responses of one adolescent to those of another adolescent interviewed in the Add Health Survey. The goal of the analysis will be to characterize the behavior of a pair of adolescents rather than the behavior of an individual adolescent. Unfortunately, the available sampling weights are appropriate only when the analysis seeks to investigate behaviors of individual adolescents. However, an appropriate sampling weight can be constructed for each pair of adolescents and used to correct for clustering and unequal probability of selection of the pair. This paper covers the types of pairs of adolescents that can have a pair weight constructed, describes the necessary formulas and data, and concludes with an example showing how to compute pair weights for romantic partners from the Wave II In-home Interview. Types of Linked Pairs of Adolescents The most common types of linked pairs in the Add Health data are: · Friends: Resp...
The pattern of blood lead during pregnancy was investigated in a cohort of 195 women who, between October...
The authors test the idea that patterns of masculinity-femininity (MF) help sort adolescents into romantic couples. Using a nationally representative sample of adolescents in Grades 7 to 12 from a probability sample of secondary schools in the United States, an MF measure was constructed by selecting a set of questionnaire items demonstrating sex differences. For each respondent, the probability of being a boy was predicted. Respondents identified opposite-sex romantic partners within their school. When the partner identified also was interviewed, the authors were able to create MF for both members of the couple. Trichotomizing MF scores for each sex, it was determined that couples with a very masculine boy and very feminine girl are most likely to have sex, and to have sex the soonest. The couples for which both members are in the average MF range for their sex are the quickest to break up. The pattern of MF is a strong influence on the behavior of adolescent romantic couples.
Most surveys collect data using complex sampling plans that involve selecting both clusters and individuals with unequal probability of selection. Research in using multilevel modeling techniques to analyze such data is relatively new. Often sampling weights based on probabilities of selecting individuals are used to estimate population-based models. However, sampling weights used for estimating multilevel models (MLM) need to be constructed differently than weights used for population-average models. This paper compares the capabilities of MLwiN, Mplus, LISREL, PROC MIXED (SAS), and gllamm (Stata) for estimating MLM using data collected with a complex sampling plan. We illustrate how sampling weights need to be constructed for estimating MLM with these software packages. Finally, we contrast the results from these packages using data collected with a complex sampling plan.
Most surveys collect data using complex sampling plans involving selection of both clusters and individuals with unequal probability of selection. Research in methods of using multilevel modeling (MLM) procedures to analyze such data is relatively new. Often sampling weights based on selection probabilities of individuals are used to estimate population-based models. However, sampling weights used for estimating multilevel models need to be constructed differently than weights used for single-level (population-average) models. This paper compares the capabilities of MLwiN, Mplus, LISREL, PROC MIXED (SAS), and gllamm (Stata) for estimating MLM from data collected with a complex sampling plan. We illustrate how sampling weights for estimating multilevel models with these software packages can be constructed from population average weights. Finally, we use data from the National Longitudinal Survey of Adolescent Health to contrast the results from these packages.
of Child Health and Human Development with cooperative funding from 23 other federal agencies and foundations. Further information may be obtained by contacting addhealth@unc.edu. The Add Health Study is a nationally representative, probability-based survey of adolescents in grades 7 through 12 conducted between 1994 and 1996. The sample design used to collect the data has introduced a complexity to analysis. Failing to account for this complexity may result in biased parameter estimates and incorrect variance estimates. Hence, you must correct for design effects and unequal probability of selection to ensure that your results are nationally representative with unbiased estimates. Specialized, “user-friendly” statistical software is now available for analyzing data from complex surveys. SUDAAN and STATA are two examples of this type of software. Using both SUDAAN and STATA, we show you how to incorporate characteristics of the sample design into an analysis so that your estimates a...
Combining survey data with alternative data sources (e.g., wearable technology, apps, physiological, ecological monitoring, genomic, neurocognitive assessments, brain imaging, and psychophysical data) to paint a complete biobehavioral picture of trauma patients comes with many complex system challenges and solutions. Starting in emergency departments and incorporating these diverse, broad, and separate data streams presents technical, operational, and logistical challenges but allows for a greater scientific understanding of the long-term effects of trauma. Our manuscript describes incorporating and prospectively linking these multi-dimensional big data elements into a clinical, observational study at US emergency departments with the goal to understand, prevent, and predict adverse posttraumatic neuropsychiatric sequelae (APNS) that affect over 40 million Americans annually. We outline key data-driven system challenges and solutions and investigate eligibility considerations, comp...
Non-response is a potential threat to the accuracy of estimates obtained from sample surveys and can be particularly difficult to avoid in longitudinal studies. The purpose of this report is to investigate non-response in Wave III of Add Health and its influence on study results. Non-response in earlier waves of Add Health has been investigated by the Survey Research Unit at the University of North Carolina. Findings showed that total bias for 13 measures of health and risk behaviors rarely exceeded 1% in either Wave I or Wave II, which is small relative to the 20% to 80% prevalence rates for most of these measures.
Journal of Adolescent Health, 2002
Report on Bias in Wave III …, 2004