Michael Lechner SEW - Academia.edu
Papers by Michael Lechner SEW
Swiss Journal of Economics and Statistics
In recent years, microeconometrics experienced the ‘credibility revolution’, culminating in the 2021 Nobel Prizes for David Card, Josh Angrist, and Guido Imbens. This ‘revolution’ in how to do empirical work led to more reliable empirical knowledge of the causal effects of certain public policies. In parallel, computer science, and to some extent also statistics, developed powerful (so-called Machine Learning) algorithms that are very successful in prediction tasks. The new literature on Causal Machine Learning unites these developments by using algorithms originating in Machine Learning for improved causal analysis. In this non-technical overview, I review some of these approaches. Subsequently, I use an empirical example from the field of active labour market programme evaluation to showcase how Causal Machine Learning can be applied to improve the usefulness of such studies. I conclude with some considerations about shortcomings and possible future developments of these methods a...
Entropy
There is great demand for inferring causal effect heterogeneity and for open-source statistical software that is readily available to practitioners. The mcf package is an open-source Python package that implements the Modified Causal Forest (mcf), a causal machine learner. We replicate three well-known studies in the fields of epidemiology, medicine, and labor economics to demonstrate that our mcf package produces aggregate treatment effects that align with previous results and, in addition, provides novel insights on causal effect heterogeneity. The mcf package provides inference for all resolutions of treatment effect estimation that can be identified. We conclude that the mcf constitutes a practical and extensive tool for a modern analysis of heterogeneous causal effects.
www.glue.umd.edu/~jsmithz * Financial support from the Swiss National Science Foundation (NFP 12-53735.18) is gratefully acknowledged by Michael Lechner. Financial support from the Social Science and Humanities Research Council of Canada and from the CIBC Chair in Human Capital and Productivity at the University of Western Ontario is gratefully acknowledged by Jeffrey Smith. The data are a subsample from a database generated for the evaluation of the Swiss active labour market policy together with Michael Gerfin. We are grateful to the Department of Economics of the Swiss Government (seco; Arbeitsmarktstatistik) for providing the data and to Michael Gerfin for his help in preparing them. We thank David Margolis for helpful comments.
Research Papers in Economics, Oct 26, 2021
We predict the probabilities of a draw, a home win, and an away win for the games of the German Football Bundesliga (BL1) with a new machine-learning estimator using the (large) information available up to that date. We use these individual predictions to simulate a league table for every game day until the end of the season. This combination of a (stochastic) simulation approach with machine learning allows us to make statements about the likelihood that a particular team reaches specific places in the final league table (i.e. champion, relegation, etc.). The machine-learning algorithm used builds on a recent development of an Ordered Random Forest. This estimator generalises common estimators like ordered probit or ordered logit maximum likelihood and is able to recover essentially the same output as the standard estimators, such as the probabilities of the alternatives conditional on covariates. The approach is already in use and results for the current season can be found at www.sew.unisg.ch/soccer_analytics.
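The simulation step described in this abstract can be sketched as a small Monte Carlo loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `simulate_season`, the input layout, and the random tie-break are all hypothetical, and the outcome probabilities are taken as given rather than estimated by an Ordered Random Forest.

```python
import random
from collections import Counter


def simulate_season(fixtures, probs, n_sims=10_000, seed=0):
    """Estimate each team's probability of finishing first.

    fixtures: list of (home, away) pairs still to be played (hypothetical layout).
    probs: dict mapping (home, away) -> (p_home_win, p_draw, p_away_win).
    """
    rng = random.Random(seed)
    teams = sorted({team for match in fixtures for team in match})
    titles = Counter()
    for _ in range(n_sims):
        points = dict.fromkeys(teams, 0)
        for home, away in fixtures:
            # Draw one match outcome from the predicted probabilities.
            outcome = rng.choices(("home", "draw", "away"), weights=probs[(home, away)])[0]
            if outcome == "home":
                points[home] += 3
            elif outcome == "away":
                points[away] += 3
            else:
                points[home] += 1
                points[away] += 1
        # Assumption: ties on points are broken at random; real league rules
        # use goal difference and further criteria.
        best = max(points.values())
        champion = rng.choice([t for t in teams if points[t] == best])
        titles[champion] += 1
    return {team: titles[team] / n_sims for team in teams}
```

Repeating the draw many times turns match-level probabilities into table-level probabilities; the same loop could count relegation places instead of titles.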
for helpful comments and suggestions. The usual disclaimer applies.
The order of actions in contests may have a significant effect on performance. In this study we examine the role of schedule in round-robin tournaments with sequential games between three and four contestants. Our propensity-score matching estimation, based on soccer FIFA World Cups, UEFA European Championships and Olympic wrestling events, reveals that there is a substantial advantage to the contestant who competes in the first and third matches, which is in line with game-theoretical predictions. Our finding implies that the round-robin structure with sequential games is endogenously unfair, since it systematically favours one of the contestants.
arXiv: Econometrics, 2019
This paper considers the practically important case of nonparametrically estimating heterogeneous average treatment effects that vary with a limited number of discrete and continuous covariates in a selection-on-observables framework where the number of possible confounders is very large. We propose a two-step estimator for which the first step is estimated by machine learning. We show that this estimator has desirable statistical properties like consistency, asymptotic normality and rate double robustness. In particular, we derive the coupled convergence conditions between the nonparametric and the machine learning steps. We also show that estimating population average treatment effects by averaging the estimated heterogeneous effects is semi-parametrically efficient. The new estimator is illustrated with an empirical example of the effects of mothers' smoking during pregnancy on the resulting birth weight.
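One common way to build a two-step estimator of this kind is sketched below; it may differ in detail from the paper's construction. All notation is an assumption introduced here for illustration: $D_i$ is a binary treatment, $Y_i$ the outcome, $X_i$ the large set of confounders, $Z_i$ the few covariates along which effects vary, $\hat\mu_d$ and $\hat p$ are machine-learning estimates of the outcome regressions and the propensity score, and $K_h$ is a kernel with bandwidth $h$.

```latex
% Step 1: doubly robust (AIPW) score built from machine-learning nuisance estimates
\hat\psi_i = \hat\mu_1(X_i) - \hat\mu_0(X_i)
  + \frac{D_i \,\bigl(Y_i - \hat\mu_1(X_i)\bigr)}{\hat p(X_i)}
  - \frac{(1 - D_i)\,\bigl(Y_i - \hat\mu_0(X_i)\bigr)}{1 - \hat p(X_i)}

% Step 2: nonparametric (kernel) regression of the score on Z
\hat\tau(z) = \frac{\sum_{i=1}^{n} K_h(Z_i - z)\,\hat\psi_i}{\sum_{i=1}^{n} K_h(Z_i - z)}

% Averaging the scores gives the population average treatment effect
\widehat{\mathrm{ATE}} = \frac{1}{n} \sum_{i=1}^{n} \hat\psi_i
```

In this sketch the score stays centred on the effect if either the outcome regressions or the propensity score is estimated well, which is the sense of double robustness.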
In this note, we show that the OLS and fixed-effects (FE) estimators of the popular difference-in-differences model may deviate when there is time-varying panel non-response. If such non-response does not affect the common-trend assumption, then OLS and FE are consistent, but OLS is more precise. However, if non-response affects the common-trend assumption, then FE estimation may still be consistent, while OLS will be inconsistent. We provide simulation as well as empirical evidence that this phenomenon occurs. We conclude that in the case of unbalanced panels, any evidence of deviating OLS and FE estimates should be considered as evidence that non-response is not ignorable for the difference-in-differences estimation.
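The mechanism can be illustrated with a toy two-period panel. This is a minimal sketch with made-up numbers and hypothetical function names, not the authors' simulation design. It relies on two standard facts: in the saturated two-group, two-period model, pooled OLS equals a difference of cell means over all observations, while with two periods the FE estimator equals the mean within-unit change computed only from units observed in both periods.

```python
from statistics import mean


def did_ols(rows):
    """Pooled-OLS DiD in the saturated 2x2 model: difference of cell means."""
    def cell(treated, period):
        return mean(y for _, d, t, y in rows if d == treated and t == period)
    return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))


def did_fe(rows):
    """Two-period FE DiD: mean within-unit change among units seen in both periods."""
    by_unit = {}
    for unit, d, t, y in rows:
        by_unit.setdefault(unit, {"d": d})[t] = y

    def change(treated):
        return mean(u[1] - u[0] for u in by_unit.values()
                    if u["d"] == treated and 0 in u and 1 in u)
    return change(1) - change(0)


# Rows are (unit, treated, period, outcome); all values are invented for illustration.
balanced = [(1, 1, 0, 1), (1, 1, 1, 4), (2, 0, 0, 1), (2, 0, 1, 2)]
# Unit 3 is treated, has a high baseline outcome, and drops out before period 1.
unbalanced = balanced + [(3, 1, 0, 5)]

print(did_ols(balanced), did_fe(balanced))      # 2 2
print(did_ols(unbalanced), did_fe(unbalanced))  # 0 2
```

In the balanced panel the two estimates coincide. Once a unit with an unusual baseline level drops out, the pooled cell means shift while the within-unit changes do not, so the estimates diverge.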
This paper models payment evasion as a source of profit by letting the firm choose the price charged to paying consumers and the fine collected from detected payment evaders. The consumers choose whether to purchase, evade payment, or refrain from consumption. We show that payment evasion allows the firm to charge a higher price to paying consumers and to generate a higher profit. We also show that higher fines do not necessarily reduce payment evasion. Finally, we provide empirical evidence which is consistent with our theoretical analysis, using comprehensive micro data on fare dodging on the Zurich Transport Network.
Switzerland is a multi-lingual developed country that provides an attractive stage to test ingroup favoritism that is driven by linguistic differences. To that end, we utilize data from soccer games in the top two Swiss divisions between the seasons 2005/06 and 2017/18. In these games, the referee was from the same linguistic area as one team, whereas the other team was from a different linguistic area. Using very rich data on teams’ and games’ characteristics, our causal forest-based estimator reveals that referees assign significantly more penalties in the form of yellow and red cards to teams from a different linguistic area. This form of ingroup favoritism is large enough that it is likely to affect the outcome of the game. As evidence, we find that the difference in points in favor of the home team increases significantly when a referee is from the same linguistic area.
The home advantage phenomenon is a well-established feature in sports competitions. In this paper, we examine data from 1,908 soccer matches played in the German Bundesliga during the seasons from 2007-08 to 2015-16. Using a very rich data set, our econometric analysis based on matching methods reveals that the usual home advantage disappears when the game is played in the middle of the week instead of on the weekend. Our results indicate that, since the midweek matches are unevenly allocated among teams, the actual schedules of the Bundesliga favour teams with fewer home games in midweek. The paper also shows that these soccer-specific findings have some implications for the design of contests in general.
CEPR: Macroeconomics & Growth (Topic), 2020
We reassess the effects of natural resources on economic development and conflict, applying a causal forest estimator and data from 3,800 Sub-Saharan African districts. We find that, on average, mining activities and higher world market prices of locally mined minerals both increase economic development and conflict. Consistent with the previous literature, mining activities have more positive effects on economic development and weaker effects on conflict in places with low ethnic diversity and high institutional quality. In contrast, the effects of changes in mineral prices vary little in ethnic diversity and institutional quality, but are non-linear and largest at relatively high prices.
The Journal of Industrial Economics, 2017
This paper models payment evasion as a source of profit by letting the firm choose the price charged to paying consumers and the fine collected from detected payment evaders. The consumers choose whether to purchase, evade payment, or refrain from consumption. We show that payment evasion allows the firm to charge a higher price to paying consumers and to generate a higher profit. We also show that higher fines do not necessarily reduce payment evasion. Finally, we provide empirical evidence which is consistent with our theoretical analysis, using comprehensive micro data on fare dodging on the Zurich Transport Network.
AEA Randomized Controlled Trials, 2017
What is the role of physical activity in the process of human capital accumulation? Brain research provides growing evidence of the importance of physical activity for various aspects of cognitive functions. An increasingly sedentary lifestyle could thus not only harm population health but also disrupt human capital accumulation. This paper analyzes the effects of on-campus recreational sports and exercise on educational outcomes of university students. To identify causal effects, we randomize financial incentives to encourage students' participation in on-campus sports and exercise. The incentives increased participation frequency by 0.26 times per week (47%) and improved grades by 0.14 standard deviations. This effect is primarily driven by male students and students at higher quantiles of the grade distribution. Results from survey data suggest that students substitute off-campus with on-campus physical activities during the day but do not significantly increase the overall frequency. Our findings suggest that students spend more time on campus and are better able to integrate studying and exercising, which may enhance the effectiveness of studying and thus improve student performance.
SSRN Electronic Journal, 2014
Journal of Environmental Economics and Management, 2020
The disclosure of the VW emission manipulation scandal caused a quasi-experimental market shock in the observable quality of VW diesel vehicles. We consider a classical model for adverse selection and sorting to derive an empirically testable hypothesis about the impact of observable quality on the supply of used cars. We test the hypothesis with data collected from an online car selling platform which reflects about 50% of the German used-car market. The empirical approach is based on a conditional difference-in-differences method. We find that the supply of used VW diesel vehicles increases after the VW emission scandal. This finding is consistent with the predictions of the theoretical model. Furthermore, we find that the positive supply effects increase with the probability of manipulation.
SSRN Electronic Journal, 2005
This paper examines the empirical analysis of treatment effects on duration outcomes from data that contain instrumental variation. We focus on social experiments in which an intention to treat is randomized and compliance may be imperfect. We distinguish between cases where the treatment starts at the moment of randomization and cases where it starts at a later point in time. We derive exclusion restrictions under various informational and behavioral assumptions and we analyze identifiability under these restrictions. It turns out that randomization (and by implication, instrumental variation) by itself is often insufficient for inference on interesting effects, and needs to be augmented by a semi-parametric structure. We develop corresponding non- and semi-parametric tests and estimation methods.
We investigate heterogeneous employment effects of Flemish training programmes. Based on administrative individual data, we analyse programme effects at various aggregation levels using Modified Causal Forests (MCF), a causal machine learning estimator for multiple programmes. While all programmes have positive effects after the lock-in period, we find substantial heterogeneity across programmes and types of unemployed. Simulations show that assigning unemployed to programmes that maximise individual gains as identified in our estimation can considerably improve effectiveness. Simplified rules, such as one giving priority to unemployed with low employability, mostly recent migrants, lead to about half of the gains obtained by more sophisticated rules.
Uncovering the heterogeneity of causal effects of policies and business decisions at various levels of granularity provides substantial value to decision makers. This paper develops new estimation and inference procedures for multiple treatment models in a selection-on-observables framework by modifying the Causal Forest approach suggested by Wager and Athey (2018) in several dimensions. The new estimators have desirable theoretical, computational and practical properties for various aggregation levels of the causal effects. While an Empirical Monte Carlo study suggests that they outperform previously suggested estimators, an application to the evaluation of an active labour market programme shows the value of the new methods for applied research.