Susana Rubin - Academia.edu

Papers by Susana Rubin

Small area estimation for business surveys

Proceedings of the Amer. Statist. Assoc. Section on …, 2006

In business surveys, data typically are skewed and the standard approach for small area estimation (SAE) based on linear mixed models leads to inefficient estimates. In this paper, we discuss SAE techniques for skewed data that are linear following a suitable ...

The Pseudo-EBLUP estimator for a weighted average with an application to the Canadian Survey of Employment, Payrolls and Hours

Journal of Survey Statistics and Methodology, 2016

The pseudo–Empirical Best Linear Unbiased Predictor (pseudo-EBLUP) was previously developed for simple means under the basic nested error regression model with constant error variances. In this paper, we extend the estimator to the pseudo-EBLUP of weighted means under the one-fold nested error regression model with heteroscedastic errors. The extended pseudo-EBLUP estimator takes into account the survey weights and the economic weights that make up the weighted mean. We obtain a second-order approximation to the mean squared error (MSE) of the extended pseudo-EBLUP estimator and an unbiased estimator of the MSE also up to the second order. We illustrate the methodology using a synthetic population based on a sample from the Canadian Survey of Employment, Payrolls and Hours (SEPH): we compare the extended pseudo-EBLUP with other model-based and direct cross-sectional domain estimators in terms of design-based MSE, and compare the model-based MSE estimates with the Monte Carlo design-based MSE.

Confidence Bands for Quantile-Quantile Plots

Statistics & Risk Modeling, 1986


The proportional hazards model for survey data from independent and clustered super-populations

Journal of Multivariate Analysis, 2011

Some issues in the analysis of complex survey data

We are concerned about inference on a parameter of a stochastic model with an estimator using data from a complex sample. Classical sampling theory concerns

Proceedings of the Survey Methods Section SMALL AREA ESTIMATION TO STUDY THE IMPACT OF GLOBALIZATION

We study the feasibility of producing relevant statistics to measure the impact of globalization in the Canadian economy. In this project we require means of key economic variables for the wholesale industry at the level of “Trade Group” by “Province” by “Globalization Indicators”. We consider small area estimation using area level models with random effects and use penalized splines in order to accommodate departures from linearity. We propose a new bootstrap method to estimate the mean squared errors of the small area estimators. We illustrate the methodology with data from a particular trade group.

On fitting the proportional hazards model to data from complex surveys

We use the weighted sample partial likelihood score (SPLS) function to fit the proportional hazards regression model to survey data with complex sampling designs. The sample maximum partial likelihood estimator is the solution of the sample partial likelihood score function. Many authors applied this method to fit survival survey data. Binder (1992) dealt with inference on the descriptive census population parameter, that is, design-based inference on the maximum partial likelihood estimate that could be calculated had a census been taken on the finite population. Lin (2000) gave a formal justification of Binder’s method under the super-population approach and dealt with inference on the model parameter. Neither Binder nor Lin provided conditions for the respective asymptotic results to hold. Rubin-Bleuer (2003) uses Lin’s (2000) setup of the super-population approach and develops counting process methodology for a joint design-model space to obtain, under stated sufficient model a...

Proceedings of the Survey Methods Section AN APPROXIMATION OF THE PARTIAL LIKELIHOOD SCORE IN A JOINT DESIGN-MODEL SPACE

An approximation to the Sample Partial Likelihood Score (SPLS) in design probability was first proposed by Binder (1992) and then considered by Lin (2000), in order to derive asymptotic theory for the parameters of the proportional hazards regression model. However, they did not provide conditions under which this property held. In this paper I use Lin’s (2000) setup of the super-population approach and develop counting process methodology for a joint design-model space, to obtain, under sufficient model and design conditions, a rigorous proof of the approximation to the SPLS proposed by Binder (1992).

Proceedings of the Survey Methods Section EVALUATION OF SMALL DOMAIN ESTIMATORS FOR THE SURVEY OF EMPLOYMENT PAYROLL AND HOURS

The Survey of Employment, Payroll and Hours provides monthly estimates of payroll, employment, paid hours and earnings. The proposed new design uses survey and payroll deduction (administrative) data to produce generalized regression (GREG) estimators for average weekly earnings, which are approximately unbiased and have a controlled CV for pre-determined strata. As in most surveys, there are many domains of interest with small sample size for which the GREG estimators might have unacceptably large measures of error. In this paper, we extend the Pseudo-EBLUP method of You and Rao (2002) to the case of unequal error variances, and we study the relative performance of several cross-sectional small domain estimators using real and synthetic populations.

Small Area Estimation to Study the Impact of Globalization

We study the feasibility of producing relevant statistics to measure the impact of globalization in the Canadian economy. In this project we require means of key economic variables for the wholesale industry at the level of “Trade Group” by “Province” by “Globalization Indicators”. We consider small area estimation using area level models with random effects and use penalized splines in order to accommodate departures from linearity. We propose a new bootstrap method to estimate the mean squared errors of the small area estimators. We illustrate the methodology with data from a particular trade group.

A Probabilistic Set-Up for Model and Design-Based Inference

Evaluation of Small Domain Estimators for the Survey of Employment Payroll and Hours

The Survey of Employment, Payroll and Hours provides monthly estimates of payroll, employment, paid hours and earnings. The proposed new design uses survey and payroll deduction (administrative) data to produce generalized regression (GREG) estimators for average weekly earnings, which are approximately unbiased and have a controlled CV for pre-determined strata. As in most surveys, there are many domains of interest with small sample size for which the GREG estimators might have unacceptably large measures of error. In this paper, we extend the Pseudo-EBLUP method of You and Rao (2002) to the case of unequal error variances, and we study the relative performance of several cross-sectional small domain estimators using real and synthetic populations.

Some Issues in the Analysis of Complex Survey Data

We are concerned about inference on a parameter of a stochastic model with an estimator using data from a complex sample. Classical sampling theory concerns inferences for finite population parameters. Hájek (1960), Krewski and Rao (1981), Binder (1983) and others studied and obtained results on the asymptotic properties of the sample estimator under simple random sampling and some complex designs. On the other hand, Hartley and Sielken (1975), Fuller (1975), Francisco and Fuller (1991) and others studied the properties of the sample estimator with respect to a model parameter, sometimes called a superpopulation parameter. They obtained asymptotic results for regression sample estimators using data from certain complex sampling designs. Underlying their setups, there was the notion of a "superpopulation" defined on a probability space (Ω, F, P), and the finite population was considered a realization of it for an outcome ω ∈ Ω. The observed sample would be the second phase...

On the role of energy and L2-methods in potential theory

An Approximation of the Partial Likelihood Score in a Joint Design-Model Space

Confidence bands for quantile-quantile plots and testing for normality

On the two-phase framework for joint model and design-based inference

We establish a mathematical framework that formally validates the two-phase "superpopulation viewpoint" proposed by Hartley and Sielken (1975), by defining a product probability space which includes both the design space and the model space. We develop a general methodology that combines finite population sampling theory and classical theory of infinite population sampling to account for the underlying processes that produce the data. Key results in this article are: the sample estimator and the model statistic are asymptotically independent; if a sequence converges in design law, it also converges in the law of the product space; and the distribution theory of the sample estimating equation estimator around a super-population parameter. We also study the interplay between dependence and independence of random variables when viewed in the design space, the product space and the model space, and apply it to show formally that under a "simple random sample without replacement" design, we can "ignore" the design and work in the realm of the model space, but that under "simple random sample with replacement" we cannot ignore the design.

Some Issues in the Estimation of Income Dynamics

An Approximation of the Partial Likelihood Score in a Joint Design-Model Space

An approximation to the Sample Partial Likelihood Score (SPLS) in design probability was first proposed by Binder (1992) and then considered by Lin (2000), in order to derive asymptotic theory for the parameters of the proportional hazards regression model. However, they did not provide conditions under which this property held. In this paper I use Lin's (2000) set up of

