Giuseppe Porro - Profile on Academia.edu

Papers by Giuseppe Porro

cem: Coarsened Exact Matching

Replication data for: Multivariate Matching Methods That are Monotonic Imbalance Bounding .DS_Store

Harvard Dataverse, 2011

CEMSIM-PENTA.RDA

used by lalonde-sim.R

Replication data for: Multivariate Matching Methods That are Monotonic Imbalance Bounding

We introduce a new "Monotonic Imbalance Bounding" (MIB) class of matching methods for causal inference with a surprisingly large number of attractive statistical properties. MIB generalizes and extends in several new directions the only existing class, "Equal Percent Bias Reducing" (EPBR), which is designed to satisfy weaker properties and only in expectation. We also offer strategies to obtain specific members of the MIB class, and examine in more detail one member of this class, Coarsened Exact Matching, whose properties we analyze from this new perspective. We offer a variety of analytical results and numerical simulations that demonstrate how members of the MIB class can dramatically improve inferences relative to EPBR-based matching methods. See also: Causal Inference

Victim's Identification and Social Categorization: First- and Second-Order Effects on Altruistic Behavior

Research Square (Research Square), Feb 8, 2024

We explore in the laboratory how donations to a charity can be influenced by the identifiability and the social categorization of the recipients. We find that donors give more, on average, to unidentified than to identified beneficiaries, since the latter are more likely to receive small donations than the former. Donations are the same, on average, to in-group and to out-group beneficiaries; however, an in-group recipient is more likely to receive a top or a very small donation than an out-group one, whereas the latter is more likely than the former to receive an intermediate donation. Both first- and second-order effects are associated with the Dynamic Identity Fusion Index elicited from participants toward the 'Multicultural world'.

Replication data for: Multivariate Matching Methods That are Monotonic Imbalance Bounding imb0.rda

Harvard Dataverse, 2011

Sindacato, differenziali retributivi, riformismo. Paolo Santi: scritti e testimonianze

A collection of the principal writings of Paolo Santi on the themes of trade unions, wage differentials, and reformism. The final section gathers the testimonies of those who shared life and work experiences with Paolo Santi.

Multivariate matching methods that are monotonic imbalance bounding

How to Control for Bias in Social Media

Chapman and Hall/CRC eBooks, Jun 18, 2021

Random Recursive Partitioning: A Matching Method for the Estimation of the Average Treatment Effect

Social Science Research Network, 2006

In this paper we introduce the Random Recursive Partitioning (RRP) matching method. RRP generates a proximity matrix which might be useful in econometric applications like average treatment effect estimation. RRP is a Monte Carlo method that randomly generates non-empty recursive partitions of the data and evaluates the proximity between two observations as the empirical frequency with which they fall in the same cell of these random partitions over all Monte Carlo replications. From the proximity matrix it is possible to derive both graphical and analytical tools to evaluate the extent of the common support between data sets. The RRP method is "honest" in that it does not match observations "at any cost": if data sets are separated, the method clearly states it. The match obtained with RRP is invariant under monotonic transformation of the data. Average treatment effect estimators derived from the proximity matrix seem to be competitive compared to more commonly used estimators. The RRP method does not require a particular structure of the data, and for this reason it can be applied when distances like Mahalanobis or Euclidean are not suitable, in the presence of missing data, or when the estimated propensity score is too sensitive to model specifications.
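
The proximity idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the partition generator below (random axis-aligned cuts placed between adjacent observed values) is a simplified stand-in chosen for brevity, and the names `random_partition`, `proximity_matrix`, `n_rep` and `min_size` are hypothetical. What it does show is the core computation, namely the proximity of two observations as the fraction of Monte Carlo replications in which they share a cell.

```python
import random
from itertools import combinations


def random_partition(points, idx, min_size, rng):
    """Recursively split the index list `idx` with random axis-aligned cuts.

    Assumption: cuts are chosen by rank among the observed values, so the
    resulting cells depend only on the ordering of each coordinate.
    """
    if len(idx) <= min_size:
        return [idx]
    dim = rng.randrange(len(points[0]))
    values = sorted({points[i][dim] for i in idx})
    if len(values) < 2:
        # All points share this coordinate: stop splitting this cell
        # (a simplification; one could try another coordinate instead).
        return [idx]
    # Cut at an observed value other than the largest, so both sides are non-empty.
    cut = values[rng.randrange(len(values) - 1)]
    left = [i for i in idx if points[i][dim] <= cut]
    right = [i for i in idx if points[i][dim] > cut]
    return (random_partition(points, left, min_size, rng)
            + random_partition(points, right, min_size, rng))


def proximity_matrix(points, n_rep=200, min_size=2, seed=0):
    """Fraction of replications in which each pair of points shares a cell."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_rep):
        for cell in random_partition(points, list(range(n)), min_size, rng):
            for i in cell:
                counts[i][i] += 1
            for i, j in combinations(cell, 2):
                counts[i][j] += 1
                counts[j][i] += 1
    return [[c / n_rep for c in row] for row in counts]


if __name__ == "__main__":
    pts = [(0, 5), (1, 3), (2, 8), (3, 1), (4, 9), (5, 2)]
    for row in proximity_matrix(pts, n_rep=100, seed=1):
        print([round(p, 2) for p in row])
```

Because the cuts in this sketch depend only on the rank order of each coordinate, applying a strictly increasing transformation to the data (with the same seed) leaves the proximity matrix unchanged, which mirrors the invariance property stated in the abstract.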

Do local subsidies to firms create jobs? Counterfactual evaluation of an Italian regional experience

Papers in Regional Science, Sep 20, 2017

CERTeT was established at the end of 1995. It focuses on the following research areas: territorial economics (both regional and urban); transportation economics and the analysis of transport infrastructure (railways, highways, airways, waterways); the economics of tourism (the Centre organizes an advanced course on this subject); the evaluation of regional and local policies, paying specific attention to the use of EU structural funds; and the economics and management of water resources, and their role in local development. The New Working Paper Series circulates research in progress, policy notes, and discussions on economic issues falling within the competences of CERTeT. We host researchers working in "Regional Science" from inside Bocconi University as well as researchers belonging to the wide relational networks built up by CERTeT in its more than 20 years of activity.

Average treatment effect estimation via random recursive partitioning

RePEc: Research Papers in Economics, 2004

A new matching method is proposed for the estimation of the average treatment effect of social policy interventions (e.g., training programs or health care measures). Given an outcome variable, a treatment and a set of pre-treatment covariates, the method is based on the examination of random recursive partitions of the space of covariates using regression trees. A regression tree is grown on either the treated or the untreated individuals only, using as response variable a random permutation of the indexes 1, ..., n (n being the number of units involved), while the indexes for the other group are predicted using this tree. The procedure is replicated in order to rule out the effect of specific permutations. The average treatment effect is estimated in each tree by matching treated and untreated units in the same terminal nodes. The final estimator of the average treatment effect is obtained by averaging over all the trees grown. The method does not require any specific model assumption apart from the tree's complexity, which does not affect the estimator. We show that this method is both an instrument to check whether two samples can be matched (by any method) and, when this is feasible, a way to obtain reliable estimates of the average treatment effect. We further propose a graphical tool to inspect the quality of the match.
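
The final two steps described above (matching within terminal nodes, then averaging over trees) can be sketched in Python as follows. This is a hedged illustration rather than the paper's estimator: the partitions are taken as given (each one a list of cells of unit indexes, standing in for the terminal nodes of one tree), cells lacking either treated or untreated units are skipped, and weighting each cell by its number of treated units is an assumption made here for concreteness. The function names are hypothetical.

```python
from statistics import mean


def cell_matched_effect(cells, treated, outcome):
    """Effect estimate within one partition; None if no cell is matched.

    Assumption: each matched cell contributes its treated-minus-untreated
    mean outcome difference, weighted by its number of treated units.
    """
    num, den = 0.0, 0
    for cell in cells:
        y1 = [outcome[i] for i in cell if treated[i]]
        y0 = [outcome[i] for i in cell if not treated[i]]
        if y1 and y0:  # only cells containing both groups can be matched
            num += len(y1) * (mean(y1) - mean(y0))
            den += len(y1)
    return num / den if den else None


def averaged_effect(partitions, treated, outcome):
    """Average the per-partition estimates over all partitions that matched."""
    estimates = [
        e
        for e in (cell_matched_effect(c, treated, outcome) for c in partitions)
        if e is not None
    ]
    return mean(estimates) if estimates else None


if __name__ == "__main__":
    treated = [1, 0, 1, 0, 1, 0]
    outcome = [5, 3, 6, 4, 9, 7]
    # Three hypothetical partitions of the six units into cells.
    partitions = [
        [[0, 1], [2, 3], [4, 5]],
        [[0, 1, 2, 3], [4, 5]],
        [[0, 2], [1, 3], [4, 5]],  # first two cells are unmatched and skipped
    ]
    print(averaged_effect(partitions, treated, outcome))
```

Returning `None` when no cell contains both groups keeps the sketch in the spirit of a method that reports when two samples cannot be matched instead of forcing an estimate.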

Random Recursive Partitioning and Rank-based proximities for data matching, missing data imputation and nonparametric classification and prediction

Data matching is a typical statistical problem in non-experimental and/or observational studies or, more generally, in cross-sectional studies in which one or more data sets are to be compared. Several methods are available in the literature, most of which are based on a particular metric or on statistical models, either parametric or nonparametric. We present two methods to calculate a proximity which have the property of being invariant under monotonic transformations. These methods require at most the notion of ordering. We provide open-source software in the form of an R package. The software is available at: https://r-forge.r-project.org/projects/rrp/ See also: Porro G., Iacus S.M. (2008). Invariant and metric free proximities for data matching: an R package. Journal of Statistical Software, vol. 25 (11), p. 1-22, ISSN: 1548-766

How Sentiment Analysis of Social Networks Can Improve Our Knowledge of Citizens' Policy Preferences. An Application to Italy and France

The growing usage of social media by a wider audience of citizens sharply increases the possibility of investigating the web as a device to explore and track policy preferences. In the present paper we apply a recently proposed method to three different scenarios, analyzing on one side the online popularity of Italian political leaders throughout 2011, and on the other the voting intention of French internet users in both the 2012 Presidential ballot and the subsequent Legislative election. Although internet users are not necessarily representative of the whole population of a country's citizens, our analysis shows a remarkable ability of social media to forecast electoral results, as well as a noteworthy correlation between social media and traditional mass survey results. We also illustrate that the predictive ability of social media analysis strengthens as the number of citizens expressing their opinion online increases, provided they act consistently on it (i.e., apart from high abstention rates).

Cem: Stata Module to Perform Coarsened Exact Matching

RePEc: Research Papers in Economics, 2008

Il lavoro interinale in Italia: uno sguardo all'offerta

RePEc: Research Papers in Economics, 2002

Examination of the administrative archive of one of the major temporary-work agencies makes it possible to observe the specific features of the supply of temporary labour, which have remained in the shadows in the studies carried out so far on temporary work in Italy. The data examined cover the take-off phase of temporary-work agencies (1998-2000). An econometric model is used to analyse the impact of supply-side characteristics (demographic and human capital) on the probability of obtaining a temporary work placement. By highlighting the relevance of some variables related to human capital and professional experience, the study brings out the role of the agency in promoting the employability of candidates for temporary work. Some data on wage dynamics, while far from constituting a complete analysis of the pay aspects of the temporary-work phenomenon, nevertheless confirm the insights of the econometric analysis.

Formazione e percorsi lavorativi dei laureati dell'Università degli Studi di Milano (IIa edizione: laureati 1999)

RePEc: Research Papers in Economics, 2004

Don't Ask, Just Listen … Using Social Networks to Measure Subjective Well-Being

Significance, May 24, 2022

Extracting Subjective Well-Being from Textual Data

Chapman and Hall/CRC eBooks, Jun 18, 2021

Measuring Social Well Being in The Big Data Era: Asking or Listening?

arXiv (Cornell University), Dec 22, 2015

The literature on well being measurement seems to suggest that "asking" for a self-evaluation is the only way to estimate a complete and reliable measure of well being. At the same time, "not asking" is the only way to avoid biased evaluations due to self-reporting. Here we propose a method for estimating the welfare perception of a community by simply "listening" to the conversations on Social Network Sites. The Social Well Being Index (SWBI) and its components are derived through an innovative technique of supervised sentiment analysis called iSA, which scales to any language and to big data. As its main methodological advantages, this approach can estimate several aspects of social well being directly from self-declared perceptions, instead of approximating it through objective (but partial) quantitative variables like GDP; moreover, self-perceptions of welfare are spontaneous and not obtained as answers to explicit questions, which have been shown to bias the result. As an application, we evaluate the SWBI in Italy over the period 2012-2015 through the analysis of more than 143 million tweets.
