Understanding heterogeneous preferences in random utility models: The use of latent class analysis

Dissecting preference heterogeneity in consumer stated choices

Transportation Research Part E: Logistics and Transportation Review, 2012

ABSTRACT: This paper investigates alternative methods to account for preference heterogeneity in choice experiments. The main interest lies in assessing the different results obtained when heterogeneity is investigated in various ways. The comparison can be made on the basis of model performance and, more interestingly, by evaluating willingness-to-pay measures. How preference heterogeneity is analysed depends on the method used to search for it. Socioeconomic variables can be interacted with attributes and/or alternative-specific constants. Similarly, one can consider different subsets of the data (strata variables) and estimate a multinomial logit model for each of them. Heterogeneity in preferences can be investigated by including it in the systematic component of utility or in the stochastic one. Mixed logit and latent class models are examples of the first approach. The former, in its random parameter specification, allows for random taste variation by assuming a specific distribution of the attribute coefficients over the population, and it can capture additional heterogeneity by allowing parameters to vary across individuals both randomly and systematically with observable variables. In other words, it accounts for heterogeneity in the mean and in the variance of the distribution of the random parameters due to individual characteristics. Latent class models capture heterogeneity by considering a discrete underlying distribution of tastes. The small number of mass points are the unobserved segments or behavioral groups within which preferences are assumed homogeneous. The probability of membership in a latent class can additionally be made a function of individual characteristics. Alternatively, heterogeneity can be incorporated through the random component of utility. The covariance heterogeneity model adopts this second approach: it generalizes the nested logit model and can be used to explain heteroscedastic error structures in the data. It allows the inclusive value parameter to be a function of choice alternative attributes and/or individual characteristics. An alternative method is an extension of the multinomial logit model in which unobserved heterogeneity is integrated through random error components distributed according to a tree. A further improvement in modeling preference heterogeneity is its simultaneous inclusion in both the systematic and the stochastic parts. An example is the inclusion of error components in a random coefficient specification of the mixed multinomial logit model. The empirical data used to compare the methods relate to departure airport choice in a multi-airport region. The study area includes two regions in central Italy, Marche and Emilia-Romagna, and four airports: Ancona, Rimini, Forlì and Bologna. A fractional factorial experimental design was adopted to construct a four-alternative choice set and five hypothetical choice exercises in each questionnaire. The most important attributes and their levels were selected on the basis of previous research.
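
The latent class structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the attribute values and all coefficients below are hypothetical, and the sketch assumes the common formulation in which class membership follows a logit in individual characteristics and choices within each class follow a multinomial logit.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit probabilities from a list of utilities (stable softmax)."""
    m = max(utilities)
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def latent_class_choice_probs(X, z, betas, gammas):
    """Unconditional choice probabilities for one respondent and one choice set.

    X      : list of J attribute vectors, one per alternative
    z      : individual characteristics (include 1.0 for a constant)
    betas  : one taste-coefficient vector per latent class
    gammas : one class-membership coefficient vector per latent class
             (one class is usually normalised to zeros for identification)
    """
    # Probability of belonging to each class, given individual characteristics
    class_shares = mnl_probs([dot(g, z) for g in gammas])
    probs = [0.0] * len(X)
    for share, beta in zip(class_shares, betas):
        # Preferences are homogeneous within a class: plain MNL
        within = mnl_probs([dot(beta, x) for x in X])
        for j, p in enumerate(within):
            probs[j] += share * p
    return probs

# Hypothetical example: four alternatives, two attributes, two classes
X = [[1.0, 0.5], [0.8, 1.0], [1.2, 0.3], [0.9, 0.7]]
z = [1.0, 0.0]
betas = [[-1.0, -0.5], [-0.2, -2.0]]
gammas = [[0.0, 0.0], [0.3, -0.4]]
choice_probs = latent_class_choice_probs(X, z, betas, gammas)
```

The four alternatives echo the four-alternative choice set mentioned above, but the numbers are made up for illustration only. The result is a mixture of class-specific logit probabilities, so the four values sum to one.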

Analysing Preference Heterogeneity using Random Parameter Logit and Latent Class Modelling Techniques

2005

Multi-attribute revealed preference data are used to investigate the heterogeneity of tastes in a sample of kayakers, in relation to eleven whitewater sites in Ireland. The paper focuses on a comparison of the analysis of preference heterogeneity using a random parameter logit model and a latent class model. We assess and contrast the evidence for the presence of a finite number of 2, 3, 4 and 5 latent preference groups (classes), and contrast these with the presence of a continuous distribution of parameter estimates using the random parameter logit model. Welfare estimates associated with changes in the attributes of particular whitewater sites are also presented, and are found to vary considerably depending on the approach taken.

Modeling preference heterogeneity in stated choice data: an analysis for public goods generated by agriculture

Agricultural Economics, 2009

Stated choice models based on the random utility framework are becoming increasingly popular in the applied economics literature. The need to account for respondents' preference heterogeneity in such models has motivated researchers in agricultural, environmental, health and transport economics to apply random parameter logit and latent class models. In most of the published literature these models incorporate heterogeneity in preferences through the systematic component of utility. An alternative approach is to investigate heterogeneity through the random component of utility, and covariance heterogeneity models are one means of doing this. In this paper we compare these alternative ways of incorporating preference heterogeneity in stated choice models and evaluate how the selection of approach affects welfare estimates in a given empirical application. We find that a Latent Class approach fits our data best but all the models perform well in terms of out-of-sample predictions. Finally, we discuss what criteria a researcher can use to decide which approach is most appropriate for a given data set.

Advances in Stated Preference Methods: Discrete and continuous Mixing distributions in Logit models for representing variance and taste heterogeneity.

Public policies should reflect and accommodate citizens' preferences and values as much as possible. However, it is difficult to know the correct value citizens place on public goods, as these are not generally exchanged in the marketplace. For this reason, non-market valuation is increasingly considered an important tool for informing policy decisions. In this context, Contingent Valuation (CV) and Discrete Choice Experiments (DCE) are probably the most widely used tools to analyse stated preferences and retrieve people's Willingness to Pay (WTP) for improving a particular environmental amenity. After its introduction as a technique of non-market valuation, the number of CV studies published in peer-reviewed journals increased steadily for more than three decades. Then, since the mid to late 1990s, the CV technique has been largely replaced by the DCE method. Having reviewed the literature, I decided to focus my PhD on the DCE method. Notwithstanding the importance of CV as a useful tool for preference elicitation, DCE proved more appealing, as recent literature has identified a number of exciting research avenues from a methodological and econometric point of view. The PhD has therefore focused mainly on developing new tools for accommodating heterogeneity in tastes, variances and heuristics employing Latent Class (LC) analysis. As a result, I formalised a class of LC models that can accommodate heteroscedasticity and/or heterogeneity (depending on assumptions and parameterisation) within each class as well as different heuristics across classes. The three essays collected in this thesis represent the main outcome of this research. More specifically, the first paper, on urban land use, introduces class heterogeneity within an LC model. This is obtained by specifying a discrete mixture of sets of continuous distributions. The model is applied to both simulated and real data in order to demonstrate its flexibility and its advantages for policy appraisal. The second paper introduces and formalises the idea of a heteroscedastic LC model, using data from a recreational site choice study elicited through stated preference methods to compare various model specifications. Results show that model fit, welfare estimates and choice predictions are sensitive to the manner in which both types of heterogeneity are accommodated. This is done in WTP-space to directly compare estimates from continuous and discrete mixture representations as well as to demonstrate the importance of including the scale parameter even under this reparameterisation. In the third and final paper, on species preservation, the well-known problem of preference and variance instability due to learning and fatigue in DCE is tackled by applying a scale-adjusted latent class model to uncover both types of instability simultaneously and probabilistically across the sample. Findings highlight the advantages, in terms of model fit, interpretation and policy implications, that can be achieved when both types of instability are addressed concurrently. Data collected to estimate the existence value of rare and endangered fish species in Ireland are used as an empirical case study. Finally, conclusions are drawn and avenues for further research and future applications of the presented models are outlined.

Taste indicators and heterogeneous revealed preferences for congestion in recreation demand

2008

Researchers using revealed preference data have mostly relied on the Mixed Logit (ML) framework to model unobserved heterogeneity. In this paper, we suggest an extension of this model where we integrate direct measures of taste and revealed preferences, under a unified econometric setting, to describe heterogeneous preferences for congestion in recreation demand. ML is a random parameter discrete choice model, which decomposes the coefficients of the regression equation into a mean effect shared by all individuals in the sample, and a deviation with respect to this mean, specific to each individual. Within this structure, heterogeneity is summarized using a parametric density function for the coefficients of the model. From this distribution one can identify the portion of people who like or dislike an attribute of the good. On the other hand, taste indicators, represented in a like-dislike scale, constitute complementary information about the distribution of tastes in the population. We combine both sources of information to characterize preferences in our model. The traditional ML suggests almost 60% of people in the sample like crowded places while our integrated model implies almost 100% of the people dislike congestion. These results show the benefits of using taste indicators to describe heterogeneous preferences for attributes describing alternatives of a choice set.
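
The step from a parametric coefficient distribution to the share of people who like an attribute can be sketched as follows. The abstract does not name the density used, so the sketch assumes a normally distributed coefficient, a common choice in mixed logit applications. The function name and the example values are hypothetical and are not taken from the paper.

```python
import math

def share_with_positive_taste(mean, std):
    """Share of the population with a positive coefficient,
    P(beta > 0), assuming beta ~ Normal(mean, std**2)."""
    if std <= 0:
        raise ValueError("std must be positive")
    # Standard normal CDF evaluated at mean / std
    return 0.5 * (1.0 + math.erf(mean / (std * math.sqrt(2.0))))

# Hypothetical estimates: a small positive mean with a wide spread implies
# that a bit more than half of the population likes the attribute.
share_like = share_with_positive_taste(mean=0.25, std=1.0)
share_dislike = 1.0 - share_like
```

Under this assumption the share depends only on the ratio of the mean to the standard deviation, which is why a mean close to zero with a large spread can produce a sizeable share on either side of zero.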

Marketing models of consumer heterogeneity

Journal of Econometrics, 1998

The distribution of consumer preferences plays a central role in many marketing activities. Pricing and product design decisions, for example, are based on an understanding of the differences among consumers in price sensitivity and valuation of product attributes. In addition, marketing activities which target specific households require household level parameter estimates. Thus, the modeling of consumer heterogeneity is the central focus of many statistical marketing applications. In contrast, heterogeneity is often regarded as an ancillary nuisance problem in much of the applied econometrics literature which must be dealt with but is not the focus of the investigation. The focus is instead on estimating average effects of policy variables. In this paper, we discuss various approaches to modeling consumer heterogeneity and evaluate the utility of these approaches for marketing applications.

A comparison of generalized multinomial logit and latent class approaches to studying consumer heterogeneity with some extensions of the generalized multinomial logit model

Applied Stochastic Models in Business and Industry, 2011

We calibrate and contrast the recent generalized multinomial logit model and the widely used latent class logit model approaches for studying heterogeneity in consumer purchases. We estimate the parameters of the models on panel data of household ketchup purchases, and find that the generalized multinomial logit model outperforms the best-fitting latent class logit model in terms of the Bayesian information criterion. We compare the posterior estimates of coefficients for individual customers based on the two models and discuss how the differences between them could affect marketing strategies such as pricing. We also describe extensions to the scale heterogeneity model that include the effects of state dependence and purchase history.
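
A comparison by the Bayesian information criterion, as mentioned in this abstract, can be sketched as below. The sketch uses the standard definition BIC = -2 ln L + k ln n, where lower is better. The model labels, log-likelihoods, parameter counts and sample size are invented for illustration and are not results from the paper.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 * lnL + k * ln(n). Lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical fitted models: (maximised log-likelihood, number of parameters)
candidates = {
    "model_a": (-1510.0, 20),
    "model_b": (-1500.0, 14),
}
n_obs = 2000  # hypothetical number of observations

scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best_model = min(scores, key=scores.get)
```

Because the penalty grows with the number of parameters, a model with many classes or many random coefficients has to improve the log-likelihood enough to offset its extra parameters before the criterion prefers it.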

Modelling preference heterogeneity in stated choice data for environmental goods: a comparison of random parameter, covariance heterogeneity and latent class logit models

2007

Stated choice models based on the random utility framework are becoming increasingly popular in the applied economics literature. The need to account for respondents' preference heterogeneity in such models has motivated researchers in agricultural, environmental, health and transport economics to apply random parameter logit and latent class models. In most of the published literature these models incorporate heterogeneity in preferences through the systematic component of utility. An alternative approach is to investigate heterogeneity through the random component of utility, and covariance heterogeneity models are one means of doing this. In this paper we compare these alternative ways of incorporating preference heterogeneity in stated choice models and evaluate how the selection of approach affects welfare estimates in a given empirical application. We find that a Latent Class approach fits our data best but all the models perform well in terms of out-of-sample predictions. Finally, we discuss what criteria a researcher can use to decide which approach is most appropriate for a given data set.