Giuseppe Porro - Profile on Academia.edu
Papers by Giuseppe Porro
cem: Coarsened Exact Matching
Replication data for: Multivariate Matching Methods That are Monotonic Imbalance Bounding .DS_Store
Harvard Dataverse, 2011
CEMSIM-PENTA.RDA
used by lalonde-sim.R
Replication data for: Multivariate Matching Methods That are Monotonic Imbalance Bounding
We introduce a new "Monotonic Imbalance Bounding" (MIB) class of matching methods for causal inference with a surprisingly large number of attractive statistical properties. MIB generalizes and extends in several new directions the only existing class, "Equal Percent Bias Reducing" (EPBR), which is designed to satisfy weaker properties and only in expectation. We also offer strategies to obtain specific members of the MIB class, and analyze in more detail a member of this class, called Coarsened Exact Matching, whose properties we analyze from this new perspective. We offer a variety of analytical results and numerical simulations that demonstrate how members of the MIB class can dramatically improve inferences relative to EPBR-based matching methods. See also: Causal Inference
Research Square (Research Square), Feb 8, 2024
We explore in the laboratory how donations to a charity can be influenced by the identifiability and the social categorization of the recipients. We find that donors give more, on average, to unidentified than to identified beneficiaries, since the latter are more likely to receive small donations than the former. Donations are the same, on average, to in-group and to out-group beneficiaries; however, an in-group recipient is more likely to receive a top or a very small donation than an out-group one, whereas the latter is more likely than the former to receive an intermediate donation. Both first- and second-order effects are associated with the Dynamic Identity Fusion Index elicited from participants toward the 'Multicultural world'.
Replication data for: Multivariate Matching Methods That are Monotonic Imbalance Bounding imb0.rda
Harvard Dataverse, 2011
Sindacato, differenziali retributivi, riformismo. Paolo Santi: scritti e testimonianze [Trade unions, wage differentials, reformism. Paolo Santi: writings and testimonies]
A collection of the main writings of Paolo Santi on trade unions, wage differentials, and reformism. The final section gathers testimonies from those who shared life and work circumstances with Paolo Santi.
Multivariate matching methods that are monotonic imbalance bounding
How to Control for Bias in Social Media
Chapman and Hall/CRC eBooks, Jun 18, 2021
Social Science Research Network, 2006
In this paper we introduce the Random Recursive Partitioning (RRP) matching method. RRP generates a proximity matrix which might be useful in econometric applications like average treatment effect estimation. RRP is a Monte Carlo method that randomly generates non-empty recursive partitions of the data and evaluates the proximity between two observations as the empirical frequency with which they fall in the same cell of these random partitions over all Monte Carlo replications. From the proximity matrix it is possible to derive both graphical and analytical tools to evaluate the extent of the common support between data sets. The RRP method is "honest" in that it does not match observations "at any cost": if data sets are separated, the method clearly states it. The match obtained with RRP is invariant under monotonic transformation of the data. Average treatment effect estimators derived from the proximity matrix seem to be competitive compared to more commonly used estimators. The RRP method does not require a particular structure of the data and for this reason it can be applied when distances like Mahalanobis or Euclidean are not suitable, in the presence of missing data or when the estimated propensity score is too sensitive to model specifications.
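The proximity idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `random_partition_cells` and `rrp_proximity`, the parameters `n_rep`, `min_size` and `seed`, and the use of random axis-aligned cuts placed at observed values are all assumptions made for illustration. The sketch only shows how a proximity can be computed as the share of random partitions in which two observations land in the same cell.

```python
import random


def random_partition_cells(points, min_size, rng):
    """Split the observations recursively with random axis-aligned cuts.

    Hypothetical helper: each cut is placed at a randomly chosen observed
    value, so both sides of every split are non-empty.
    """
    def split(idx):
        if len(idx) < 2 * min_size:
            return [idx]                      # cell is small enough: stop
        dim = rng.randrange(len(points[0]))   # random covariate
        values = sorted({points[i][dim] for i in idx})
        if len(values) < 2:
            return [idx]                      # no variation along this covariate
        cut = values[rng.randrange(len(values) - 1)]  # never the maximum
        left = [i for i in idx if points[i][dim] <= cut]
        right = [i for i in idx if points[i][dim] > cut]
        return split(left) + split(right)

    return split(list(range(len(points))))


def rrp_proximity(points, n_rep=200, min_size=2, seed=0):
    """Proximity = share of replications in which two points share a cell."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_rep):
        for cell in random_partition_cells(points, min_size, rng):
            for i in cell:
                for j in cell:
                    counts[i][j] += 1
    return [[c / n_rep for c in row] for row in counts]


if __name__ == "__main__":
    data = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.2), (5.0, 5.1), (5.1, 5.0), (5.2, 5.2)]
    for row in rrp_proximity(data):
        print(" ".join(f"{p:.2f}" for p in row))
```

Because the cuts in this sketch depend only on the ordering of the observed values, applying a strictly increasing transformation to the covariates and reusing the same seed yields the same proximity matrix, which mirrors the invariance property stated in the abstract.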
Papers in Regional Science, Sep 20, 2017
CERTeT was established at the end of 1995. It focuses on the following research areas: territorial economics (both regional and urban); transportation economics and the analysis of transport infrastructure (railways, highways, airways, waterways); the economics of tourism (the Centre organizes an advanced course on this subject); the evaluation of regional and local policies, paying specific attention to the use of EU structural funds; and the economics and management of water resources, and their role in local development. The New Working Paper Series circulates research in progress, policy notes, and discussions on economic issues falling within the competences of CERTeT. We host researchers working in "Regional Science" from inside Bocconi University as well as researchers belonging to the wide relational networks built up by CERTeT in its more than 20 years of activity.
RePEc: Research Papers in Economics, 2004
A new matching method is proposed for the estimation of the average treatment effect of social policy interventions (e.g., training programs or health care measures). Given an outcome variable, a treatment and a set of pre-treatment covariates, the method is based on the examination of random recursive partitions of the space of covariates using regression trees. A regression tree is grown either on the treated or on the untreated individuals only, using as response variable a random permutation of the indexes 1, ..., n (n being the number of units involved), while the indexes for the other group are predicted using this tree. The procedure is replicated in order to rule out the effect of specific permutations. The average treatment effect is estimated in each tree by matching treated and untreated units in the same terminal nodes. The final estimator of the average treatment effect is obtained by averaging over all the trees grown. The method does not require any specific model assumption apart from the tree's complexity, which, however, does not affect the estimator. We show that this method is both an instrument to check whether two samples can be matched (by any method) and, when this is feasible, a way to obtain reliable estimates of the average treatment effect. We further propose a graphical tool to inspect the quality of the match.
Random Recursive Partitioning and Rank-based proximities for data matching, missing data imputation and nonparametric classification and prediction
Data matching is a typical statistical problem in non-experimental and/or observational studies or, more generally, in cross-sectional studies in which one or more data sets are to be compared. Several methods are available in the literature, most of which are based on a particular metric or on statistical models, either parametric or nonparametric. We present two methods to calculate a proximity which have the property of being invariant under monotonic transformations. These methods require at most the notion of ordering. We provide open-source software in the form of an R package. The software is available at: https://r-forge.r-project.org/projects/rrp/ See also: PORRO G., IACUS S.M. (2008). Invariant and metric free proximities for data matching: an R package. JOURNAL OF STATISTICAL SOFTWARE, vol. 25 (11), p. 1-22, ISSN: 1548-766
The growing usage of social media by a wider audience of citizens sharply increases the possibility of investigating the web as a device to explore and track policy preferences. In the present paper we apply a recently proposed method to three different scenarios, analyzing on one side the online popularity of Italian political leaders throughout 2011, and on the other the voting intention of French internet users in both the 2012 Presidential ballot and the subsequent Legislative election. Although internet users are not necessarily representative of the whole population of a country's citizens, our analysis shows a remarkable ability of social media to forecast electoral results, as well as a noteworthy correlation between social media and traditional mass survey results. We also illustrate that the predictive ability of social media analysis strengthens as the number of citizens expressing their opinion online increases, provided they act consistently on it (i.e. apart from high abstention rates).
RePEc: Research Papers in Economics, 2008
Il lavoro interinale in Italia: uno sguardo all'offerta [Temporary agency work in Italy: a look at the supply side]
RePEc: Research Papers in Economics, 2002
The examination of the administrative archive of one of the largest temporary work agencies makes it possible to observe the specific features of the supply of temporary labour, which have remained in the shadows in the studies carried out so far on temporary work in Italy. The data examined refer to the take-off phase of temporary work agencies (1998-2000). An econometric model is used to analyse the impact of supply characteristics (demographic and human capital) on the probability of obtaining a temporary work placement. By highlighting the relevance of some variables related to human capital and professional experience, the study brings out the role of the agency in promoting the employability of candidates for temporary work. Some data on wage dynamics, far from constituting a complete analysis of the pay aspects of temporary agency work, nonetheless confirm the insights of the econometric analysis.
RePEc: Research Papers in Economics, 2004
Significance, May 24, 2022
Extracting Subjective Well-Being from Textual Data
Chapman and Hall/CRC eBooks, Jun 18, 2021
arXiv (Cornell University), Dec 22, 2015
The literature on well-being measurement seems to suggest that "asking" for a self-evaluation is the only way to estimate a complete and reliable measure of well-being. At the same time "not asking" is the only way to avoid biased evaluations due to self-reporting. Here we propose a method for estimating the welfare perception of a community by simply "listening" to the conversations on Social Network Sites. The Social Well Being Index (SWBI) and its components are proposed through an innovative technique of supervised sentiment analysis called iSA, which scales to any language and big data. As its main methodological advantages, this approach can estimate several aspects of social well-being directly from self-declared perceptions, instead of approximating it through objective (but partial) quantitative variables like GDP; moreover, self-perceptions of welfare are spontaneous and not obtained as answers to explicit questions, which have been shown to bias the result. As an application we evaluate the SWBI in Italy over the period 2012-2015 through the analysis of more than 143 million tweets.