Susan Athey | Stanford University
Papers by Susan Athey
Forest-based methods are being used in an increasing variety of statistical tasks, including causal inference, survival analysis, and quantile regression. Extending forest-based methods to these new statistical settings requires specifying tree-growing algorithms that are targeted to the task at hand, and the ad-hoc design of such algorithms can require considerable effort. In this paper, we develop a unified framework for the design of fast tree-growing procedures for tasks that can be characterized by heterogeneous estimating equations. The resulting gradient forest consists of trees grown by recursively applying a pre-processing step where we label each observation with gradient-based pseudo-outcomes, followed by a regression step that runs a standard CART regression split on these pseudo-outcomes. We apply our framework to two important statistical problems, non-parametric quantile regression and heterogeneous treatment effect estimation via instrumental variables, and we show t...
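The two-step split described in this abstract (label observations with pseudo-outcomes, then run a CART regression split on them) can be sketched roughly as follows for a single quantile. This is a minimal illustration, not the authors' implementation: the function names `quantile_pseudo_outcomes` and `best_cart_split` are hypothetical, the indicator pseudo-outcome is an assumed choice for the quantile case, and only one feature and one split are handled.

```python
import numpy as np

def quantile_pseudo_outcomes(y, q):
    """Label each observation relative to the parent node's q-th quantile.

    Assumption for illustration: the pseudo-outcome is the indicator
    that y lies above the quantile estimated in the parent node.
    """
    theta_parent = np.quantile(y, q)
    return (y > theta_parent).astype(float)

def best_cart_split(x, rho, min_leaf=5):
    """Standard CART regression split of pseudo-outcomes rho on one feature x.

    Maximizing sum_left^2 / n_left + sum_right^2 / n_right is equivalent
    to minimizing the within-child sum of squared errors.
    Requires min_leaf >= 1.
    """
    order = np.argsort(x)
    x_sorted, rho_sorted = x[order], rho[order]
    n = len(x_sorted)
    cumsum = np.cumsum(rho_sorted)
    total = cumsum[-1]
    best_gain, best_threshold = -np.inf, None
    for i in range(min_leaf, n - min_leaf + 1):
        if x_sorted[i - 1] == x_sorted[i]:
            continue  # cannot split between tied feature values
        sum_left = cumsum[i - 1]
        sum_right = total - sum_left
        gain = sum_left ** 2 / i + sum_right ** 2 / (n - i)
        if gain > best_gain:
            best_gain = gain
            best_threshold = 0.5 * (x_sorted[i - 1] + x_sorted[i])
    return best_threshold, best_gain

# Synthetic data (hypothetical): the median of y jumps at x = 0.5.
rng = np.random.default_rng(0)
x_demo = rng.uniform(0.0, 1.0, 500)
y_demo = 3.0 * (x_demo > 0.5) + rng.normal(0.0, 1.0, 500)
rho_demo = quantile_pseudo_outcomes(y_demo, q=0.5)
threshold_demo, _ = best_cart_split(x_demo, rho_demo)
```

A full forest would apply this step recursively within each child node and across many subsampled trees; the sketch only shows why the split step can reuse ordinary regression-tree machinery once the pseudo-outcomes are computed.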
ArXiv, 2017
We consider the problem of using observational data to learn treatment assignment policies that satisfy certain constraints specified by a practitioner, such as budget, fairness, or functional form constraints. This problem has previously been studied in economics, statistics, and computer science, and several regret-consistent methods have been proposed. However, several key analytical components are missing, including a characterization of optimal methods for policy learning, and sharp bounds for minimax regret. In this paper, we derive lower bounds for the minimax regret of policy learning under constraints, and propose a method that attains this bound asymptotically up to a constant factor. Whenever the class of policies under consideration has a bounded Vapnik-Chervonenkis dimension, we show that the problem of minimax-regret policy learning can be asymptotically reduced to first efficiently evaluating how much each candidate policy improves over a randomized baseline, and then...
Journal of Computational and Graphical Statistics, 2020
SSRN Electronic Journal, 2019
Econometrica, 2020
Consider a researcher estimating the parameters of a regression function based on data for all 50 states in the United States or on data for all visits to a website. What is the interpretation of the estimated parameters and the standard errors? In practice, researchers typically assume that the sample is randomly drawn from a large population of interest and report standard errors that are designed to capture sampling variation. This is common even in applications where it is difficult to articulate what that population of interest is, and how it differs from the sample. In this article, we explore an alternative approach to inference, which is partly design‐based. In a design‐based setting, the values of some of the regressors can be manipulated, perhaps through a policy intervention. Design‐based uncertainty emanates from lack of knowledge about the values that the regression outcome would have taken under alternative interventions. We derive standard errors that account for desi...
Annual Review of Economics, 2019
We discuss the relevance of the recent machine learning (ML) literature for economics and econometrics. First we discuss the differences in goals, methods, and settings between the ML literature and the traditional econometrics and statistics literatures. Then we discuss some specific methods from the ML literature that we view as important for empirical researchers in economics. These include supervised learning methods for regression and classification, unsupervised learning methods, and matrix completion methods. Finally, we highlight newly developed methods at the intersection of ML and econometrics that typically perform better than either off-the-shelf ML or more traditional econometric methods when applied to particular classes of problems, including causal inference for average treatment effects, optimal policy estimation, and estimation of the counterfactual effect of price changes in consumer choice models.
The Annals of Statistics, 2019
SSRN Electronic Journal, 2020
This paper studies discriminatory and uniform price auctions, the two most common "multi-unit auctions" for selling multiple identical objects. In such auctions, the distribution of bidder values is only partially identified from the distribution of bids. Given (asymmetric unobserved) correlated private values, sufficient conditions are provided for a given bid distribution to be rationalized by equilibrium behavior. Given independent private values, all value distributions that rationalize the data are identified. Given non-increasing marginal values, the best response hypothesis can be tested and lower bounds obtained on the extent to which each bidder fails to play a best response.
This research was conducted while both researchers were Consulting Researchers to Microsoft Research. Support was also provided by the Toulouse Network for Information Technology. This paper represents the research of the authors and does not reflect the views of any institution or organization. This paper has benefited greatly from seminar comments at the NBER Economics of Digitization Conference,
Workshop on Media …, 2010
Will the Internet Destroy the News Media? or Can Online Advertising Markets Save the Media? Susan Athey, Emilio Calvano, Joshua Gans. Discussion Points: Alessandro Bonatti, MIT Sloan. Workshop on Media Economics and Public Policy, October 15, 2010. ...
We study entry and bidding patterns in sealed bid and open auctions with heterogeneous bidders. Using data from U.S. Forest Service timber auctions, we document a set of systematic effects of auction format: sealed bid auctions attract more small bidders, shift the allocation towards these bidders, and can also generate higher revenue. We show that a private value auction model
This paper develops tools for analyzing properties of stochastic objective functions that take the form $V(\mathbf{x}, \theta) \equiv \int_{\mathbf{s}} u(\mathbf{x}, \mathbf{s}) \, dF(\mathbf{s}; \theta)$. The paper analyzes the relationship between properties of the primitive functions, the utility function u and probability distribution F, and properties of the stochastic objective. The
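As a small numerical illustration of a stochastic objective of this kind (the expectation of a utility u over a distribution F indexed by θ), the sketch below evaluates it by Gauss-Hermite quadrature. Everything specific here is an assumption made for the example and is not taken from the paper: F is a normal distribution with mean θ and unit variance, u is a quadratic loss, and the names `stochastic_objective` and `quadratic_utility` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def stochastic_objective(u, x, theta, n_nodes=40):
    """Approximate V(x, theta) = E[u(x, s)] with s ~ Normal(theta, 1).

    Assumption for illustration: F(s; theta) is normal with mean theta
    and unit variance. hermegauss uses the weight exp(-z^2 / 2), so the
    weighted sum is divided by sqrt(2 * pi) to obtain an expectation.
    """
    nodes, weights = hermegauss(n_nodes)
    s = theta + nodes  # shift standard-normal nodes to mean theta
    return float(np.sum(weights * u(x, s)) / np.sqrt(2.0 * np.pi))

def quadratic_utility(x, s):
    """Hypothetical utility: negative squared distance between x and s."""
    return -(x - s) ** 2

# Under these assumptions V(x, theta) = -((x - theta)^2 + 1) in closed form.
v_demo = stochastic_objective(quadratic_utility, x=1.0, theta=0.5)
```

With this particular u and F, the objective is maximized at x = θ, so the optimal choice moves one for one with the distribution parameter; other primitives would need their own analysis.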
This chapter discusses structural econometric approaches to auctions. Remarkably, much of what can be learned from auction data can be learned without restrictions beyond those derived from the relevant economic model. This enables us to take a nonparametric ...