Estimating Small Area Diabetes Prevalence in the US Using the Behavioral Risk Factor Surveillance System

Bayesian Small Area Estimates of Diabetes Incidence by United States County, 2009

Journal of Data Science, 2021

In the United States, diabetes is common and costly. Programs to prevent new cases of diabetes are often carried out at the level of the county, a unit of local government. Thus, efficient targeting of such programs requires county-level estimates of diabetes incidence (the fraction of the nondiabetic population who received their diagnosis of diabetes during the past 12 months). Previously, only estimates of prevalence (the overall fraction of the population who have the disease) have been available at the county level. Counties with high prevalence might or might not be the same as counties with high incidence, due to spatial variation in mortality and relocation of persons with incident diabetes to another county. Existing methods cannot be used to estimate county-level diabetes incidence, because the fraction of the population who receive a diabetes diagnosis in any year is too small. Here, we extend previously developed methods of Bayesian small-area estimation of prevalence, using diffuse priors, to estimate diabetes incidence for all U.S. counties based on data from a survey designed to yield state-level estimates. We found high incidence in the southeastern United States, the Appalachian region, and in scattered counties throughout the western U.S. Our methods might be applicable in other circumstances in which all cases of a rare condition also must be cases of a more common condition (in this analysis, "newly diagnosed cases of diabetes" and "cases of diabetes"). If appropriate data are available, our methods can be used to estimate the proportion of the population with the rare condition at greater geographic specificity than the data source was designed to provide.
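
The difference between the two measures defined in this abstract can be illustrated with a minimal sketch. The function names and the counts are hypothetical and are not taken from the paper, whose estimates come from a Bayesian small-area model rather than from direct division of counts.

```python
def prevalence(total_cases, population):
    """Fraction of the whole population who have the disease."""
    return total_cases / population


def incidence(new_cases, total_cases, population):
    """Fraction of the previously nondiabetic population diagnosed in the past 12 months.

    Assumption: the at-risk group is everyone except people whose
    diagnosis is older than 12 months (total_cases - new_cases).
    """
    at_risk = population - (total_cases - new_cases)
    return new_cases / at_risk


# Hypothetical county: 1,000 adults, 100 with diabetes, 9 of them newly diagnosed.
print(f"prevalence: {prevalence(100, 1000):.4f}")
print(f"incidence:  {incidence(9, 100, 1000):.4f}")
```

Because new diagnoses are a small subset of all cases, the numerator for incidence is much smaller than the numerator for prevalence, which is why direct county-level estimates from a state-level survey are unstable.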

Identifying counties vulnerable to diabetes from obesity prevalence in the United States: a spatiotemporal analysis

Geospatial health, 2016

Clinical and epidemiological research has reported a strong association between diabetes and obesity. However, whether increased diabetes prevalence is more likely to appear in areas with increased obesity prevalence has not been thoroughly investigated in the United States (US). The Bayesian structured additive regression model was applied to identify whether counties with higher obesity prevalence are more likely clustered in specific regions in the 48 contiguous US states. Prevalence data were small area estimates from the Behavioral Risk Factor Surveillance System. Data on confounding variables such as socioeconomic status were taken from the American Community Survey. This study reveals that an increased percentage of relative risk of diabetes was more likely to appear in Southeast, Northeast, Central and South regions. Of counties vulnerable to diabetes, 36.8% had low obesity prevalence, and most of them were located in the Southeast, Central, and South regions. The geographic d...

Disparity of Imputed Data from Small Area Estimate Approaches – A Case Study on Diabetes Prevalence at the County Level in the U.S

Data Science Journal

This paper assesses concordance and inconsistency among three small area estimation methods that are currently providing county-level health indicators in the United States. The three methods are multilevel logistic regression, spatial logistic regression, and spatial Poisson regression, all proposed since 2010. Diabetes prevalence is estimated for each county in the continental United States from the 2012 sample of the Behavioral Risk Factor Surveillance System. The mapping results show that all three methods displayed elevated diabetes prevalence in the South. While the Pearson correlation coefficients among the three model-based estimates were all above 0.60, the highest one was 0.80 between the multilevel and spatial logistic methods. While point estimates are apparently different among the three small area estimation methods, their top and bottom quintiles are fairly consistent based on Bangdiwala's B-statistic, suggesting that outputs from each method would support consistent policy making in terms of identifying counties in the top and bottom quintiles.
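
The two agreement measures named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the helper names and the ten county estimates are hypothetical, quintiles are assigned by rank, and Bangdiwala's B is computed in its general form over all five quintiles (sum of squared diagonal counts divided by the sum of products of the matching row and column totals).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def quintiles(values):
    """Rank-based quintile label (0 = lowest fifth, 4 = highest fifth)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(values)
    return labels


def bangdiwala_b(labels_a, labels_b, k=5):
    """Bangdiwala's B for two classifications of the same units into k classes."""
    table = [[0] * k for _ in range(k)]
    for a, b in zip(labels_a, labels_b):
        table[a][b] += 1
    row = [sum(r) for r in table]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    agree = sum(table[i][i] ** 2 for i in range(k))
    chance_area = sum(row[i] * col[i] for i in range(k))
    return agree / chance_area


# Hypothetical prevalence estimates (%) for ten counties from two methods.
method_a = [8.1, 9.4, 10.2, 11.0, 11.7, 12.3, 13.1, 13.8, 14.6, 15.2]
method_b = [8.4, 9.0, 10.9, 10.5, 12.0, 12.1, 12.8, 14.4, 14.0, 15.5]

print("Pearson r:", pearson(method_a, method_b))
print("Bangdiwala B:", bangdiwala_b(quintiles(method_a), quintiles(method_b)))
```

A high B alongside a moderate correlation would correspond to the situation the abstract describes: point estimates differ, but the methods sort counties into much the same quintiles.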

A multilevel model for cardiovascular disease prevalence in the US and its application to micro area prevalence estimates

International Journal of Health Geographics, 2009

Background: Estimates of disease prevalence for small areas are increasingly required for the allocation of health funds according to local need. Both individual level and geographic risk factors are likely to be relevant to explaining prevalence variations, and in turn relevant to the procedure for small area prevalence estimation. Prevalence estimates are of particular importance for major chronic illnesses such as cardiovascular disease.

Multilevel and urban health modeling of risk factors for diabetes mellitus: a new insight into public health and preventive medicine

Advances in preventive medicine, 2014

This study aimed to apply multidisciplinary analysis approaches and test two hypotheses: (1) that there was a significant increase in the prevalence of diabetes mellitus (DM) from 2002 to 2010 in the city of Philadelphia and (2) that there were significant variations in the prevalence of DM across neighborhoods, and that these variations were significantly related to the variations in the neighborhood physical and social environment (PSE). Data from the Southeastern Pennsylvania Household Health Surveys in 2002-2004 (period 1, n = 8,567) and in 2008-2010 (period 2, n = 8,747) were analyzed using a cross-sectional comparison approach. An index of neighborhood PSE was constructed from 8 specific measures. The results show that age-adjusted prevalence of DM increased from period 1 (10.20%) to period 2 (11.91%) (P < 0.001). After adjusting for age, sex, and survey years, an estimate of 12.14%, 18.33%, and 11.89% of the odds ratios for DM was related to the differences in the neighborhood PSE d...

Bayesian Spatial Modeling of Diabetes and Hypertension: Results from the South Africa General Household Survey

International Journal of Environmental Research and Public Health

Determining spatial links between disease risk and socio-demographic characteristics is vital in disease management and policymaking. However, data are subject to complexities caused by heterogeneity across host classes and space epidemic processes. This study aims to implement a spatially varying coefficient (SVC) model to account for non-stationarity in the effect of covariates. Using the South Africa general household survey, we study the provincial variation of people living with diabetes and hypertension risk through the SVC model. The people living with diabetes and hypertension risk are modeled using a logistic model that includes spatially unstructured and spatially structured random effects. Spatial smoothness priors for the spatially structured component are employed in modeling, namely, a Gaussian Markov random field (GMRF), a second-order random walk (RW2), and a conditional autoregressive (CAR) model. The SVC model is used to relax the stationarity assumption in which n...

Spatial Clusters of County-Level Diagnosed Diabetes and Associated Risk Factors in the United States

The Open Diabetes Journal, 2012

Introduction: We examined whether spatial clusters of county-level diagnosed diabetes prevalence exist in the United States and whether socioeconomic and diabetes risk factors were associated with these clusters. Materials and Methods: We used estimated county-level age-adjusted data on diagnosed diabetes prevalence for adults in 3109 counties in the United States (2007 data). We identified four types of diabetes clusters based on spatial autocorrelations: high-prevalence counties with high-prevalence neighbors (High-High), low-prevalence counties with low-prevalence neighbors (Low-Low), low-prevalence counties with high-prevalence neighbors (Low-High), and high-prevalence counties with low-prevalence neighbors (High-Low). We then estimated relative risks for clusters being associated with several socioeconomic and diabetes risk factors. Results: Diabetes prevalence in 1551 counties was spatially associated (p<0.05) with prevalence in neighboring counties. The rate of obesity, physical inactivity, poverty, and the proportion of non-Hispanic blacks were associated with a county being in a High-High cluster versus being a non-cluster county (7% to 36% greater risk) or in a Low-Low cluster (13% to 67% greater risk). The percentage of non-Hispanic blacks was associated with a 7% greater risk for being in a Low-High cluster. The rate of physical inactivity and the percentage of Hispanics or non-Hispanic American Indians were associated with being in a High-Low cluster (5% to 21% greater risk). Discussion: Distinct spatial clusters of diabetes prevalence exist in the United States. Strong association between diabetes clusters and socioeconomic and other diabetes risk factors suggests that interventions might be tailored according to the prevalence of modifiable factors in specific counties.
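
The four cluster types described in this abstract correspond to the quadrants of a local spatial autocorrelation analysis. The sketch below shows one common way to assign them, using a local Moran statistic with row-standardized neighbor weights. It is an assumption that this matches the paper's exact method, the function name and the toy data are hypothetical, and the significance test (p<0.05 in the abstract, typically done by permutation) is omitted.

```python
def moran_quadrants(values, neighbors):
    """Label each area as High-High, Low-Low, Low-High, or High-Low.

    values:    prevalence per area
    neighbors: for each area, the list of indices of its neighbors
               (assumed non-empty for every area)
    Returns a list of (label, local_moran_i) tuples. The label gives the
    area's own level first and its neighbors' level second.
    """
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5  # assumes sd > 0
    z = [(v - mean) / sd for v in values]

    results = []
    for i in range(n):
        # Spatial lag: average standardized value among the neighbors.
        lag = sum(z[j] for j in neighbors[i]) / len(neighbors[i])
        own = "High" if z[i] > 0 else "Low"
        around = "High" if lag > 0 else "Low"
        results.append((f"{own}-{around}", z[i] * lag))
    return results


# Hypothetical example: four counties in a row, each adjacent to the next.
prevalence = [10.0, 2.0, 3.0, 2.0]
adjacency = [[1], [0, 2], [1, 3], [2]]

for county, (label, local_i) in enumerate(moran_quadrants(prevalence, adjacency)):
    print(county, label, round(local_i, 3))
```

A positive local statistic marks an area that resembles its neighbors (High-High or Low-Low), and a negative one marks a spatial outlier (High-Low or Low-High).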

Modeling Community Health with Areal Data: Bayesian Inference with Survey Standard Errors and Spatial Structure

International Journal of Environmental Research and Public Health, 2021

Epidemiologists and health geographers routinely use small-area survey estimates as covariates to model areal and even individual health outcomes. American Community Survey (ACS) estimates are accompanied by standard errors (SEs), but it is not yet standard practice to use them for evaluating or modeling data reliability. ACS SEs vary systematically across regions, neighborhoods, socioeconomic characteristics, and variables. Failure to consider probable observational error may have substantial impact on the large bodies of literature relying on small-area estimates, including inferential biases and over-confidence in results. The issue is particularly salient for predictive models employed to prioritize communities for service provision or funding allocation. Leveraging the tenets of plausible reasoning and Bayes’ theorem, we propose a conceptual framework and workflow for spatial data analysis with areal survey data, including visual diagnostics and model specifications. To illustr...

Diabetes prevalence is associated with different community factors in the diabetes belt versus the rest of the United States

Obesity (Silver Spring, Md.), 2017

To investigate differences in community characteristics associated with diabetes prevalence between the Diabetes Belt and the rest of the contiguous United States (U.S.). Methods: County-level adult diabetes prevalence estimates (i.e., percent of people [≥20 years] with diagnosed diabetes in 2009) were used from the Centers for Disease Control and Prevention, in addition to data from the U.S. Census Bureau, U.S. Department of Agriculture, and U.S. Department of Health and Human Services, to carry out a spatial regime analysis to identify county-level factors correlated with diabetes prevalence in the Diabetes Belt versus the remainder of the U.S. Counties outside of the Diabetes Belt demonstrated stronger positive associations between diabetes prevalence and persistent poverty and greater percentages of unemployed labor forces. For counties in the Diabetes Belt, diabetes prevalence showed a stronger positive association with natural amenities (e.g., temperate climate and topographic fea...

Using Multiple Sources of Data to Assess the Prevalence of Diabetes at the Subcounty Level, Duval County, Florida, 2007

2010

Diabetes rates continue to grow in the United States. Effectively addressing the epidemic requires better understanding of the distribution of disease and the geographic clustering of factors that influence it. Variations in the prevalence of diabetes at the local level are largely unreported, making understanding the disparities associated with the disease more difficult. Diabetes death rates during the past 15 years in Duval County, Florida, have been disproportionately high compared with the rest of the state.