Using Multilevel Models to Model Heterogeneity: Potential and Pitfalls

Modelling complexity: analysing between-individual and between-place variation -- a multilevel tutorial

Environment and Planning A, 1997

Geography is centrally concerned with difference and heterogeneity, yet much quantitative modelling has been concerned with finding average or general relationships thereby relegating variability to a single catchall 'error' term. Multilevel modelling, in contrast, anticipates complex between-individual and between-place heterogeneity. Previous accounts of the approach have stressed the modelling of higher level, between-place differences, but here the emphasis is placed on the simultaneous consideration of complex variation at all levels. A parade of models is presented each of which considers a particular facet of the model specification. Attention is drawn to the important contrasts between the modelling of categorical and continuous predictors. Illustrative results are provided for variations in British house prices, modelled with the MLn software. An appendix provides an example of the use of this software.
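The contrast the abstract draws between a single error term and variation at every level can be written out for a two-level case. The following is a minimal sketch in standard multilevel notation, assuming individuals (for example houses) i nested in places j and one continuous predictor x; the symbols are illustrative assumptions and are not taken from the paper itself.

```latex
\begin{align}
  y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_{0j} + u_{1j} x_{ij} + e_{ij}, \\
  \begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
    &\sim N\!\left( \mathbf{0},
      \begin{pmatrix} \sigma^2_{u0} & \sigma_{u01} \\
                      \sigma_{u01} & \sigma^2_{u1} \end{pmatrix} \right),
  \qquad e_{ij} \sim N(0, \sigma^2_{e}), \\
  \operatorname{Var}\!\left(u_{0j} + u_{1j} x_{ij}\right)
    &= \sigma^2_{u0} + 2\,\sigma_{u01}\, x_{ij} + \sigma^2_{u1}\, x_{ij}^2 .
\end{align}
```

Under these assumptions the between-place variance is a quadratic function of x rather than a constant, which is one way heterogeneity enters the model explicitly instead of being absorbed into a single residual.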

Geographical Variances

Geographical Analysis, 2010

It is sometimes asserted that geographical processes operate at different scales [10, 13]. This concept is often dichotomized by terms such as "site and situation," "strategical and tactical," "local and global." At a more advanced level, the Fourier interpretation of scale is the spatial wavelength and this provides an operational definition for the study of processes in terms of scale, now generally known as spectral analysis. The spectrum is a measure of the proportion of a process which occurs at any one of a large number of scales. A process is thus not asserted to be effective at only one scale unless theoretical or empirical evidence indicates this to be the case. To give a specific example which will be familiar to readers of this journal, a devotee of spectral analysis would assert that quadrat analysis [1, 9, 12] should be undertaken for all possible quadrat sizes, and the results plotted as a function of scale (quadrat size). The resulting curve is the spectrum; if it has distinct peaks then, and only then, are these the appropriate scales at which to examine the process. The foregoing is by way of an introduction since spectral analysis is a well documented field [4, 5, 19, 25], and the literature includes geographical examples [23, 24]. Our objective is to suggest, introduce, illustrate, and in-

*The authors are indebted to the Inter-University Consortium for Political Research of the University of Michigan for providing the Dutch census data and particularly Jan Verhoef for his timely knowledge and advice about the Netherlands. The necessary computing time and facilities were provided by the Computing Center of the University of Michigan. Professor Bruce Hill of the Department of Statistics of the University of Michigan provided valuable advice. NSF grant GS-1082 to the senior author provided an indispensable background for the present study.
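The idea of repeating quadrat analysis over several quadrat sizes and reading the result as a function of scale can be sketched as follows. This is an illustration only: it assumes points in the unit square and uses the variance-to-mean ratio of quadrat counts as the statistic, which is one common choice and not necessarily the one the authors had in mind. It is also not a Fourier spectrum. The function names and the sample point patterns are hypothetical.

```python
from statistics import mean, variance


def quadrat_counts(points, cells):
    """Count points in each cell of a cells x cells grid over the unit square."""
    counts = [0] * (cells * cells)
    for x, y in points:
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        col = min(int(x * cells), cells - 1)
        row = min(int(y * cells), cells - 1)
        counts[row * cells + col] += 1
    return counts


def variance_mean_ratio(points, cells):
    """Sample variance of quadrat counts divided by their mean."""
    counts = quadrat_counts(points, cells)
    return variance(counts) / mean(counts)


def scale_profile(points, sizes=(2, 4, 8)):
    """Variance-to-mean ratio for each grid resolution in `sizes`."""
    return {cells: variance_mean_ratio(points, cells) for cells in sizes}


# Hypothetical patterns: a regular 4 x 4 lattice and a single tight cluster.
regular = [((i + 0.5) / 4, (j + 0.5) / 4) for i in range(4) for j in range(4)]
clustered = [(0.1, 0.1)] * 16

regular_profile = scale_profile(regular)
clustered_profile = scale_profile(clustered)
```

Plotting such a profile against quadrat size gives a curve whose shape, rather than any single value, is what the scale argument above is concerned with.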

Exploring the variability and geographical patterns of population characteristics: Regional and spatial perspectives

Moravian Geographical Reports

The variability and geographical patterns of population characteristics are key topics in Human Geography. There are many approaches to exploring and quantitatively measuring this issue. Besides standard aspatial statistical methods, there is no universal framework for incorporating regional and spatial aspects into the analysis of areal data. This is mainly because complications, such as the Modifiable Areal Unit Problem or the checkerboard problem, hinder analysis. In this paper, we use two approaches which uniquely combine regional and spatial perspectives of the analysis of variability. This combination brings new insights into the exploration of the variability and geographical patterns of population characteristics. The relationship between regional and spatial approaches is studied with models in a regular grid, using variability decomposition (Theil index) as an example of the regional approach, and spatial autocorrelation (Moran’s I) as an example of the spatial approach. W...
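The two measures named in the abstract can be illustrated on a small regular grid. The sketch below assumes binary rook-contiguity weights for Moran's I and strictly positive values for the Theil index; both choices, the function names, and the sample data are assumptions made for illustration and do not come from the paper.

```python
import math


def morans_i(grid):
    """Moran's I on a rectangular grid with binary rook-contiguity weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    dev = [[v - mean for v in row] for row in grid]

    cross = 0.0  # sum of dev_i * dev_j over all ordered neighbour pairs
    s0 = 0       # sum of weights (each neighbour pair counted in both directions)
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    s0 += 1
    total_ss = sum(d * d for row in dev for d in row)
    return (n / s0) * (cross / total_ss)


def theil(values):
    """Theil T index; all values must be strictly positive."""
    n = len(values)
    mean = sum(values) / n
    return sum((v / mean) * math.log(v / mean) for v in values) / n


def theil_decomposition(regions):
    """Split Theil T into (between-region, within-region) components."""
    everything = [v for region in regions for v in region]
    n = len(everything)
    mean = sum(everything) / n
    between = within = 0.0
    for region in regions:
        region_mean = sum(region) / len(region)
        share = len(region) * region_mean / (n * mean)  # region's share of the total
        between += share * math.log(region_mean / mean)
        within += share * theil(region)
    return between, within


# Hypothetical data: a 4 x 4 checkerboard and two regions of unequal size.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
regions = [[1.0, 2.0, 3.0], [4.0, 8.0]]

i_checker = morans_i(checkerboard)
between, within = theil_decomposition(regions)
total = theil([v for region in regions for v in region])
```

On the checkerboard every rook neighbour has the opposite value, so Moran's I sits at its negative extreme, while the between and within components of the Theil index add up to the overall index.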

Regional Variation and Spatial Correlation

iarc.fr

We used two methods to assess the strength of the regional variation in the age-standardised rates. The first was a method developed by Pennello, Devesa & Gail (1999) based upon a Poisson model for the observed number of cases together with a random effect for the regional ...

Multilevel perspectives on modeling census data

Environment and Planning A, 2001

areal coverage in census data, importantly at multiple spatial scales, provides a unique basis to explore the scale contingencies in the nature and degree of geographic variations.

Neighborhood size and local geographic variation of health and social determinants

International Journal of Health Geographics, 2005

BACKGROUND: Spatial filtering using a geographic information system (GIS) is often used to smooth health and ecological data. Smoothing disease data can help us understand local (neighborhood) geographic variation and ecological risk of diseases. Analyses that use small neighborhood sizes yield individualistic patterns, whereas large sizes reveal the global structure of the data and obscure local variation. Therefore, choosing an optimal neighborhood size is important for understanding ecological associations with diseases. This paper uses Hartley's test of homogeneity of variance (Fmax) as a methodological solution for selecting optimal neighborhood sizes. The data from a study area in Vietnam are used to test the suitability of this method. RESULTS: Hartley's Fmax test was applied to spatial variables for two enteric diseases and two socioeconomic determinants. Various neighborhood sizes were tested by using a two-step process to implement the Fmax test. First the varianc...
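The statistic behind Hartley's test is simply the ratio of the largest to the smallest group variance. The sketch below shows only that ratio; the paper's own two-step procedure is truncated in the abstract above and is not reproduced here. The function name and the sample groups are hypothetical, and the comparison against tabulated critical values (which assumes equal group sizes) is left out.

```python
from statistics import variance


def hartley_fmax(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(group) for group in groups]
    smallest = min(variances)
    if smallest == 0:
        raise ValueError("Fmax is undefined when a group has zero variance")
    return max(variances) / smallest


# Hypothetical smoothed values for two groups of equal size.
groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
fmax = hartley_fmax(groups)  # sample variances are 1.0 and 4.0
```

A ratio close to 1 suggests similar variances across groups; larger ratios point to heterogeneity, subject to the critical value for the given number of groups and group size.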

Under examination: Multilevel models, geography and health research

Progress in Human Geography, 2015

Since the 1990s, multilevel models have become popular tools for looking at contextual effects upon health. However, the way that geography is incorporated into these models has received criticism due to somewhat arbitrary definitions of what counts as context, the models' discrete and, arguably, aspatial view of geographical effects, and the lack of any clear theoretical specification of the processes involved. This review draws together and extends these criticisms, arguing that while currently there are problems with how geography is conceived within multilevel models, there are ways of addressing them, and indeed that it is important to do so.

Bringing the individual back to small-area variation studies: A multilevel analysis of all-cause mortality in Andalusia, Spain

Social Science & Medicine, 2012

We performed a multilevel analysis (including individuals, households, census tracts, municipalities and provinces) on a 10% sample (N = 230,978) from the Longitudinal Database of the Andalusian Population (LDAP). We aimed to investigate place effects on 8-year individual mortality risk. Moreover, besides calculating association (yielding odds ratios, ORs) between area socio-economic circumstances and individual risk, we wanted to estimate variance and clustering using the variance partition coefficient (VPC). We explicitly proclaim the relevance of considering general contextual effects (i.e. the degree to which the context, as a whole, affects individual variance in mortality risk) under at least two circumstances. The first of these concerns the interpretation of specific contextual effects (i.e. the association between a particular area characteristic and individual risk) obtained from multilevel regression analyses. The second involves the interpretation of geographical variance obtained from classic ecological spatial analyses. The so-called "ecological fallacy" apart, the lack of individual-level information renders geographical variance unrelated to the total individual variation and, therefore, difficult to interpret. Finally, we stress the importance of considering the familial household in multilevel analyses. We observed an association between percentage of people with a low educational level in the census tract and individual mortality risk (OR, highest v. lowest quintile = 1.14; 95% confidence interval, CI 1.08-1.20). However, only a minor proportion of the total individual variance in the probability of dying was at the municipality (M) and census tract (CT) levels (VPC_M = 0.2% and VPC_CT = 0.3%). Conversely, the household (H) level appeared much more relevant (VPC_H = 18.6%) than the administrative geographical areas.
Without considering general contextual effects, both multilevel analyses of specific contextual effects and ecological studies of small-area variation may provide a misleading picture that overstates the role of administrative areas as contextual determinants of individual differences in mortality.
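For a binary outcome such as mortality, one widely used way to obtain a VPC is the latent-variable approach, which fixes the individual-level variance of a logistic model at pi^2/3. The abstract does not state which method the authors used, so the sketch below is an assumption, and the variance values in it are hypothetical placeholders, not the study's estimates.

```python
import math

# Individual-level variance of the standard logistic distribution.
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def vpc_latent(level_variances):
    """Share of total latent variance at each higher level of a logistic model.

    `level_variances` maps a level name to its estimated random-intercept
    variance on the logit scale.
    """
    total = sum(level_variances.values()) + LOGISTIC_RESIDUAL_VARIANCE
    return {level: var / total for level, var in level_variances.items()}


# Hypothetical random-intercept variances (logit scale), for illustration only.
example = vpc_latent({"household": 1.0, "census_tract": 0.5})
individual_share = 1.0 - sum(example.values())
```

Because every level shares the same denominator, the shares for all levels plus the individual-level share sum to one, which is what allows the levels to be compared directly.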

A brief conceptual tutorial of multilevel analysis in social epidemiology: linking the statistical concept of clustering to the idea of contextual phenomenon

Journal of Epidemiology & Community Health, 2005

This didactical essay is directed to readers disposed to approach multilevel regression analysis (MLRA) in a more conceptual than mathematical way. However, it specifically develops an epidemiological vision on multilevel analysis with particular emphasis on measures of health variation (for example, intraclass correlation). Such measures have been underused in the literature as compared with more traditional measures of association (for example, regression coefficients) in the investigation of contextual determinants of health. A link is provided, which will be comprehensible to epidemiologists, between MLRA and social epidemiological concepts, particularly between the statistical idea of clustering and the concept of contextual phenomenon. Design and participants: The study uses an example based on hypothetical data on systolic blood pressure (SBP) from 25 000 people living in 39 neighbourhoods. As the focus is on the empty MLRA model, the study does not use any independent variable but focuses mainly on SBP variance between people and between neighbourhoods. Results: The intraclass correlation (ICC = 0.08) indicated appreciable clustering of individual SBP within the neighbourhoods, showing that 8% of the total individual differences in SBP occurred at the neighbourhood level and might be attributable to contextual neighbourhood factors or to the different composition of neighbourhoods. Conclusions: The statistical idea of clustering emerges as appropriate for quantifying "contextual phenomena", which are of central relevance in social epidemiology. Both concepts convey that people from the same neighbourhood are more similar to each other than to people from different neighbourhoods with respect to the health outcome variable.
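The intraclass correlation is the between-neighbourhood variance divided by the sum of the between-neighbourhood and within-neighbourhood variances. The essay estimates these from an empty multilevel model; the sketch below swaps in the simpler one-way ANOVA estimator for balanced groups, purely to make the arithmetic visible. The function name and the toy data are hypothetical and unrelated to the SBP example.

```python
def anova_icc(groups):
    """One-way ANOVA estimate of the intraclass correlation (balanced groups)."""
    size = len(groups[0])
    if size < 2 or any(len(group) != size for group in groups):
        raise ValueError("this sketch needs equally sized groups of at least 2")
    n_groups = len(groups)
    if n_groups < 2:
        raise ValueError("at least two groups are required")

    grand_mean = sum(sum(group) for group in groups) / (n_groups * size)
    group_means = [sum(group) / size for group in groups]

    # Mean squares between and within groups.
    msb = size * sum((m - grand_mean) ** 2 for m in group_means) / (n_groups - 1)
    msw = sum(
        (value - m) ** 2 for group, m in zip(groups, group_means) for value in group
    ) / (n_groups * (size - 1))

    between_var = max((msb - msw) / size, 0.0)  # truncate negative estimates at 0
    within_var = msw
    if between_var + within_var == 0:
        raise ValueError("ICC is undefined when all values are identical")
    return between_var / (between_var + within_var)


# Toy data: three groups whose means differ far more than values within a group.
icc = anova_icc([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
```

A value near 0 means group membership explains little of the variation, while a value near 1 means almost all of it lies between groups.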