Jackknife Research Papers - Academia.edu
Abstract. One of the major problems facing data modelling in the social domain is multicollinearity. Multicollinearity can have a significant impact on the quality and stability of the fitted regression model. The classical regression technique based on the Least Squares estimate is highly sensitive to multicollinearity. In such settings, Partial Least Squares Regression (PLSR) is a useful and flexible tool for statistical model building; however, PLSR yields only point estimates. This paper constructs interval estimates for PLSR regression parameters by applying the Jackknife technique to poverty data. A SAS macro programme is developed to obtain the Jackknife interval estimator for PLSR.
The scattered literature on Spearman's footrule and Gini's gamma is surveyed. The following topics are covered: finite-sample moments and asymptotic distribution under independence; large-sample distribution under arbitrary alternatives; asymptotic relative efficiency for testing independence; consistent asymptotic variance estimation through the jackknife; multivariate generalisations and uses. Complementary results and an extensive bibliography are provided, along with several original illustrations.
In this paper, the problem of variance estimation is considered when the missing data have been imputed with the ratio method of imputation. Two alternative variance estimators are proposed. The relative efficiencies of the proposed estimators are compared, through an empirical study, with the existing Jackknife estimator proposed in [Variance estimation under two-phase sampling with application to imputation for missing data. Biometrika 82, 453-460]. The empirical investigation shows that the proposed estimators fare better than that estimator.
Evaluation of the growth of Callinectes sapidus (Decapoda: Portunidae) using length-based methods in Tamaulipas, Mexico. The blue crab (Callinectes sapidus) fishery is one of the major fisheries of the state of Tamaulipas, Mexico, in volume and selling price as well as in employment generation, but there is little information on the biological characteristics of the species. The aim of this study was to evaluate the growth parameters of the blue crab and to establish the most appropriate method. We estimated the length frequency of 17 814 crabs from the commercial catch at thirteen locations, including four coastal lagoons: El Barril, Madre, Morales and San Andrés, all in the state of Tamaulipas. Growth parameters were evaluated using the indirect methods ELEFAN, PROJMAT and SLCA in combination with the jackknife technique to establish the uncertainty of the estimates inherent in each method. The growth parameters L∞ and k were combined, for purposes of comparison, into the growth index phi prime (Φ'). Carapace length varied between 60 and 205 mm, with a mode of 110 mm. The values of the growth parameters varied according to the method used. Using SLCA, L∞ varied between 259 and 260 mm and k ranged between 0.749 and 0.750/year; with PROJMAT, L∞ took values between 205 and 260 mm and k fluctuated between 0.550 and 0.740/year; and with ELEFAN, L∞ ranged between 156 and 215 mm and k varied between 0.479 and 0.848/year. Jackknife estimates detected no variability in Φ' between locations but significant differences between methods. The ranges of Φ' estimated by SLCA and PROJMAT (4.70 to 4.71 and 4.66 to 4.70, respectively) were within the range reported in the literature (4.201-4.798), while ELEFAN gave significantly lower values (3.87 to 4.27). The SLCA and PROJMAT methods, in combination with the jackknife technique, proved to be the most suitable for estimating the growth parameters of C. sapidus. Rev. Biol. Trop. 64 (2): 821-836. Epub 2016 June 01.
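The abstract above does not spell out how Φ' is computed. A minimal sketch, assuming the usual definition of the growth performance index, Φ' = log10(k) + 2·log10(L∞), with L∞ in mm and k per year (the function name `phi_prime` is only illustrative):

```python
import math


def phi_prime(l_inf_mm: float, k_per_year: float) -> float:
    """Growth performance index, assumed form: log10(k) + 2 * log10(L_inf)."""
    if l_inf_mm <= 0 or k_per_year <= 0:
        raise ValueError("L_inf and k must be positive")
    return math.log10(k_per_year) + 2.0 * math.log10(l_inf_mm)


# Lower SLCA bounds quoted in the abstract (L_inf = 259 mm, k = 0.749/year)
print(f"{phi_prime(259, 0.749):.2f}")  # 4.70
```

Under this assumption the SLCA parameter values quoted in the abstract reproduce the reported Φ' of about 4.70, which suggests the index was computed with lengths in millimetres.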
We propose and validate a new sampling method to assess the presence, abundance and distribution of macrophytes in circular-shaped lakes according to the requirements of the Water Framework Directive (WFD2000/60/EC). The results of the macrophyte survey, and in particular of macrophyte diversity, obtained using this method are also discussed.
The efficiency of four nonparametric species richness estimators (first-order Jackknife, second-order Jackknife, Chao2 and Bootstrap) was tested using simulated quadrat sampling of two field data sets (a sandy 'Dune' and adjacent 'Swale') in high diversity shrublands (kwongan) in south-western Australia. The data sets each comprised > 100 perennial plant species and > 10 000 individuals, and the explicit (x-y coordinate) location of every individual. We applied two simulated sampling strategies to these data sets based on sampling quadrats of unit sizes 1/400th and 1/100th of total plot area. For each site and sampling strategy we obtained 250 independent sample curves, of 250 quadrats each, and compared the estimators' performances by using three indices of bias and precision: MRE (mean relative error), MSRE (mean squared relative error) and OVER (percentage overestimation). The analysis presented here is unique in providing sample estimates derived from a complete, field-based population census for a high diversity plant community. In general the true reference value was approached faster for a comparable area sampled for the smaller quadrat size and for the swale field data set, which was characterized by smaller plant size and higher plant density. Nevertheless, at least 15-30% of the total area needed to be sampled before reasonable estimates of S_t (total species richness) were obtained. In most field surveys, typically less than 1% of the total study domain is likely to be sampled, and at this sampling intensity underestimation is a problem. Results showed that the second-order Jackknife approached the actual value of S_t more quickly than the other estimators. All four estimators were better than S_obs (observed number of species). However, the behaviour of the tested estimators was not as good as expected, and even with large sample size (number of quadrats sampled) all of them failed to provide reliable estimates. First- and second-order Jackknives were positively biased whereas Chao2 and Bootstrap were negatively biased. The observed limitations in the estimators' performance suggest that there is still scope for new tools to be developed by statisticians to assist in the estimation of species richness from sample data, especially in communities with high species richness.
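For readers unfamiliar with the four estimators compared above, the following sketch computes them from a quadrat-by-species incidence matrix using their commonly cited incidence-based formulas. The abstract does not give the formulas, so the exact variants used in the study are an assumption, and the function name `richness_estimates` is hypothetical.

```python
import numpy as np


def richness_estimates(incidence):
    """Richness estimators from a 0/1 matrix (rows = quadrats, columns = species).

    Assumed textbook forms; requires at least two quadrats.
    """
    present = np.asarray(incidence) > 0
    m = present.shape[0]                      # number of quadrats
    if m < 2:
        raise ValueError("need at least two quadrats")
    freq = present.sum(axis=0)                # quadrats occupied by each species
    freq = freq[freq > 0]                     # keep observed species only
    s_obs = int(freq.size)
    q1 = int((freq == 1).sum())               # species found in exactly one quadrat
    q2 = int((freq == 2).sum())               # species found in exactly two quadrats

    jack1 = s_obs + q1 * (m - 1) / m
    jack2 = s_obs + q1 * (2 * m - 3) / m - q2 * (m - 2) ** 2 / (m * (m - 1))
    if q2 > 0:
        chao2 = s_obs + q1 ** 2 / (2 * q2)
    else:
        # Fallback when no species occurs in exactly two quadrats (assumed variant)
        chao2 = s_obs + q1 * (q1 - 1) / 2
    boot = s_obs + float(((1 - freq / m) ** m).sum())
    return {"S_obs": s_obs, "Jack1": jack1, "Jack2": jack2,
            "Chao2": chao2, "Bootstrap": boot}


# Toy data: 4 quadrats, 4 species occupying 4, 1, 2 and 1 quadrats
toy = [[1, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 0, 0, 1],
       [1, 0, 0, 0]]
print(richness_estimates(toy))
```

All four estimators add a correction to the observed count that depends on how many species are rare in the sample, which is why they behave differently at low sampling intensity.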
The cabbage aphid, Brevicoryne brassicae L. (Hemiptera: Aphididae), is a key pest of oilseed rape, Brassica napus L., and causes economic damage. Life table parameters are commonly used to compare insect fitness on different varieties. Effects of four varieties of oilseed rape ('Zarfam', 'Licord', 'Hyola 401', and 'SLM046') on biological aspects and fecundity life table parameters of B. brassicae were studied under laboratory conditions. There was no significant difference in the length of the prereproductive, reproductive, and postreproductive periods, or in the fecundity, of aphids developing on different varieties. The maximum length of the prereproductive, reproductive, and longevity periods of B. brassicae was observed on the Licord variety. The maximum fecundity was recorded on Hyola 401. The intrinsic rate of increase was estimated by using the Euler–Lotka equation and was compared with estimates from Wyatt and White's equation. The intrinsic rate of increase of B. brassicae (based on the jackknife method) was 0.298, 0.294, 0.311, and 0.289 on Zarfam, Licord, Hyola 401, and SLM046, respectively. Feeding on SLM046 and Licord reduced the reproductive capacity of the cabbage aphid compared with the other varieties studied. However, statistical analysis of the Wyatt and White output showed that the calculated r_m values were higher than those from the Euler–Lotka equation and that there was no significant difference in r_m between varieties. Aphid population growth 20 d after plant infestation showed no significant difference in the number of aphids produced, but a significant difference in the number of winged aphids.
This paper gives a relatively simple, well behaved solution to the problem of many instruments in heteroskedastic data. Such settings are common in microeconometric applications where many instruments are used to improve efficiency and allowance for heteroskedasticity is generally important. The solution is a Fuller (1977) like estimator and standard errors that are robust to heteroskedasticity and many instruments. We show that the estimator has finite moments and high asymptotic efficiency in a range of cases. The standard errors are easy to compute, being like White's (1982), with additional terms that account for many instruments. They are consistent under standard, many instrument, and many weak instrument asymptotics. Based on a series of Monte Carlo experiments, we find that the estimators perform as well as LIML or Fuller (1977) under homoskedasticity, and have much lower bias and dispersion under heteroskedasticity, in nearly all cases considered.
We propose a jackknife minimum distance estimator designed to reduce the finite-sample bias of the optimal minimum distance estimator. Monte Carlo results indicate that our jackknife minimum distance estimator is a promising alternative to existing minimum distance procedures.
The computer program IDENTIX estimates relatedness in natural populations using multilocus genotypic data. Queller & Goodnight's (1989) and Lynch & Ritland's (1999) estimators of pairwise relatedness are implemented, as well as the identity index of Mathieu et al. (1990). Estimates of the confidence intervals around these pairwise values are also provided. The null hypothesis of no relatedness (multilocus genotypes are independent draws from a panmictic population) is tested using a permutation method that compares the observed distribution of the moments of pairwise relatedness coefficients to that expected in an unstructured population.
The class of life distributions that are new better than used in convex ordering (NBUC) is considered. A probabilistic characterization is introduced to measure the degree of NBUC-ness. A nonparametric procedure is also developed to test exponentiality against the strict NBUC property; the theory of U-statistics and the jackknife is used to establish the asymptotic normality of the test statistic. Furthermore, Edgeworth expansion and the bootstrap are employed to improve the accuracy of the approximation. Some numerical simulations of the power are presented to demonstrate the proposed procedure.
A composite approach to point transect sampling is adopted in which the properties of the abundance estimator stem from the scheme performed to select points on the study area as well as from the modelled probabilities of detection. A consistent and asymptotically normal estimator of abundance is derived under some assumptions about the detection function, while a multivariate generalization of this result is provided in multi-species surveys to estimate the species abundance vector. Jackknife estimators of diversity indices are also derived as functions of species abundance estimators. An application is considered for comparing the ecological diversity of avian communities in short rotation forestry versus traditional crops.
Theoretical constraints on economic model parameters often are in the form of inequality restrictions. For example, many theoretical results are in the form of monotonicity or nonnegativity restrictions. Inequality constraints can truncate sampling distributions of parameter estimators, so that asymptotic normality no longer is possible. Sampling theoretic asymptotic inference is thereby greatly complicated or compromised. We use numerical methods to investigate the resulting sampling properties of inequality-constrained estimators produced by popular methods of imposing inequality constraints, with particular emphasis on the method of squaring, which is the most widely used method in the applied literature on estimating integrable neoclassical systems of demand equations. See Barnett and Binner (2004).
Systematists expect their hypotheses to be asymptotically precise. As the number of phylogenetically informative characters for a set of taxa increases, the relationships implied should stabilize on some topology. If true, this increasing stability should clearly manifest itself if an index of congruence is plotted against the accumulating number of characters. Continuous jackknife function (CJF) analysis is a new graphical method that portrays the extent to which available data converge on a specified phylogenetic hypothesis, the reference tree. The method removes characters with increasing probability, analyzes the rarefied data matrices phylogenetically, and scores the clades shared between each of the resulting trees and the reference tree. As more characters are removed, the number of shared clades must decrease, but the rate of decrease will depend on how decisively the data support the reference tree. Curves for stable phylogenies are clearly asymptotic with nearly 100% congruence for a substantial part of the curve. Less stable phylogenies lose congruent nodes quickly as characters are excluded, resulting in a more linear or even a sigmoidal relationship. Curves can be interpreted as predictors of whether the addition of new data of the same type is likely to alter the hypothesis under test. Continuous jackknife function analysis makes statistical assumptions about the collection of character data. To the extent that CJF curves are sensitive to violations of unbiased character collection, they will be misleading as predictors. Convergence of data on a reference tree does not guarantee historical accuracy, but it does predict that the accumulation of further data under the sampling model will not lead to rapid changes in the hypothesis.
A method for validation of the reference set in Soft Independent Modelling of Class Analogies (SIMCA) is proposed. The reference set is used to build the SIMCA model and the remaining samples are fitted to this model. Thus, it is important that the reference set is representative of the reference class. In this work it is suggested that the reference set can be validated by the jackknife procedure. The jackknife estimate of standard error for the reference set is determined by successively leaving one sample out. It is proposed that the standard error should be minimised for an optimal reference set. Minimisation of the standard error should be balanced against the loss of variation span for the reference set, to avoid too narrow a reference class. The reference sets are optimised by changing the composition of the reference set.
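The leave-one-out standard error mentioned above can be sketched generically as follows. This is not the SIMCA-specific procedure of the paper: `statistic` stands for any scalar function of the reference set (an assumption made for illustration), and the standard delete-one jackknife formula is used.

```python
import numpy as np


def jackknife_se(data, statistic):
    """Delete-one jackknife standard error of a scalar statistic.

    data: array whose first axis indexes samples.
    statistic: callable mapping such an array to a float (hypothetical placeholder).
    """
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    if n < 2:
        raise ValueError("need at least two samples")
    # Recompute the statistic n times, each time leaving one sample out
    loo = np.array([statistic(np.delete(data, i, axis=0)) for i in range(n)])
    return float(np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2)))


values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(jackknife_se(values, np.mean))
```

For the sample mean this reduces to the familiar s/√n, which is a convenient sanity check before applying the function to a more complicated statistic.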
The precise knowledge of the number and nature of the species belonging to a fossil assemblage as well as of the structure of each species (e.g., age, sex) is of great importance in paleontology. Mixture analysis based on the method of maximum likelihood is a modern statistical technique that concerns the problem of samples consisting of several components, the composition of which is not known. Nonparametric bootstrap and jackknife techniques are used to calculate a confidence interval for each estimated parameter (prior probability, mean, standard deviation) of each group. The bootstrap method is also used to evaluate mathematically how many groups are present in a sample. Experimental density smoothing using the kernel method appears to be a better solution than the use of histograms for the estimation of a distribution. This paper presents some basic concepts and procedures and discusses some preliminary results concerning sex ratios and mortality profile assessments using bones and tooth metric data of small (Ovis antiqua) and large (Bos primigenius) bovines from European Pleistocene sites.
In this paper, we extended a parallel system survival model based on the bivariate exponential to incorporate a time-varying covariate. We calculated the bias, standard error and RMSE of the parameter estimates of this model at different censoring levels using simulated data. We then compared the difference in the total error when a fixed covariate model was used instead of the true time-varying covariate model. Following that, we studied three methods of constructing confidence intervals for such models, and conclusions were drawn based on the results of the coverage probability study. Finally, the results obtained by fitting the model to the diabetic retinopathy study data were analysed.
Resampling methods offer effective estimates of parameters and their asymptotic distributions. In this study, the bootstrap method is recommended as an alternative to the classical and jackknife (leave-one-out) test statistics for evaluating the significance of the Pearson correlation coefficient, by applying the bootstrap to the simple linear regression model. In the application, the model parameters, standard errors, Pearson correlation coefficients, bias and 95% confidence intervals from the bootstrap and jackknife methods are estimated using a real data set, and the results are interpreted. As a result, the test statistic obtained by the bootstrap method is proposed as an alternative to the classical and jackknife test statistics.
Several jackknife estimators of a relative risk in a single 2 × 2 contingency table and of a common relative risk in a 2 × 2 × K contingency table are presented. The estimators are based on the maximum likelihood estimator in a single table and on an estimator proposed by Tarone (1981) for stratified samples, respectively. For the stratified case, a sampling scheme is assumed where the number of observations within each table tends to infinity but the number of tables remains fixed. The asymptotic properties of the above estimators are derived. Especially, we present two general results which under certain regularity conditions yield consistency and asymptotic normality of every jackknife estimator of a bunch of functions of binomial probabilities.
Several techniques for resampling dependent data have already been proposed. In this paper we use missing values techniques to modify the moving blocks jackknife and bootstrap. More specifically, we consider the blocks of deleted observations in the blockwise jackknife as missing data which are recovered by missing values estimates incorporating the observation dependence structure. Thus, we estimate the variance of a statistic as a weighted sample variance of the statistic evaluated in a “complete” series. Consistency of the variance and the distribution estimators of the sample mean are established. Also, we apply the missing values approach to the blockwise bootstrap by including some missing observations among two consecutive blocks and we demonstrate the consistency of the variance and the distribution estimators of the sample mean. Finally, we present the results of an extensive Monte Carlo study to evaluate the performance of these methods for finite sample sizes, showing tha...
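As background for the abstract above, here is a sketch of the plain moving blocks jackknife variance estimator for the sample mean, without the missing-values modification the paper proposes. The scaling factor (n − l)² / (n·l) is the commonly used one for overlapping blocks of length l and is an assumption here, as is the function name.

```python
import numpy as np


def moving_block_jackknife_var_mean(x, block_len):
    """Moving blocks jackknife estimate of Var(sample mean) for a time series.

    Each of the n - l + 1 overlapping blocks of length l is deleted in turn and
    the mean of the remaining n - l observations is recomputed.
    """
    x = np.asarray(x, dtype=float)
    n, l = x.size, int(block_len)
    if not 1 <= l < n:
        raise ValueError("block_len must satisfy 1 <= block_len < len(x)")
    csum = np.concatenate(([0.0], np.cumsum(x)))
    block_sums = csum[l:] - csum[:-l]              # sums of all overlapping blocks
    deleted_means = (x.sum() - block_sums) / (n - l)
    centered = deleted_means - deleted_means.mean()
    # Assumed scaling: (n - l)^2 / (n * l) times the mean squared deviation
    return float((n - l) ** 2 / (n * l) * np.mean(centered ** 2))


series = [1.0, 2.0, 3.0, 4.0, 5.0]
print(moving_block_jackknife_var_mean(series, block_len=1))
```

With block length 1 the estimator collapses to the sum of squared deviations divided by n², close to the usual s²/n; longer blocks let it pick up serial dependence.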
Theoretical constraints on economic-model parameters often are in the form of inequality restrictions. For example, many theoretical results are in the form of monotonicity or nonnegativity restrictions. Inequality constraints can truncate sampling distributions of parameter estimators, so that asymptotic normality no longer is possible. Sampling theoretic asymptotic inference is thereby greatly complicated or compromised. We use numerical methods to investigate the resulting sampling properties of inequality constrained estimators produced by popular methods of imposing inequality constraints. In particular, we investigate the possible bias in the asymptotic standard errors of estimators of inequality constrained estimators, when the constraint is imposed by the popular method of squaring. That approach is known to violate a regularity condition in the available asymptotic proofs regarding the unconstrained estimator, since the sign of the unconstrained estimator, prior to squaring...
Usually, genetic correlations are estimated from breeding designs in the laboratory or greenhouse. However, estimates of the genetic correlation for natural populations are lacking, mostly because pedigrees of wild individuals are rarely known. Recently Lynch (1999) proposed a formula to estimate the genetic correlation in the absence of pedigree data. This method has been shown to be particularly accurate given a large sample size and a minimum (20%) proportion of relatives. The use of the bootstrap has been proposed to estimate standard errors associated with genetic correlations, but the reliability of such a method was not tested. We tested the bootstrap and showed that the jackknife can provide valid estimates of the genetic correlation calculated with the Lynch formula. The occurrence of undefined estimates, combined with the high number of replicates involved in the bootstrap, means there is a high probability of obtaining an upward-biased, incomplete bootstrap, even when there is a high fraction of related pairs in a sample. It is easier to obtain complete jackknife estimates for which all the pseudovalues are defined. We therefore recommend the use of the jackknife to estimate the genetic correlation with the Lynch formula. Provided data can be collected for more than two individuals at each location, we propose a group sampling method that produces low standard errors associated with the jackknife, even when there is a low fraction of relatives in a sample.
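The pseudovalues referred to above can be illustrated with a short sketch. It uses the standard definition, pseudovalue_i = n·θ̂ − (n − 1)·θ̂_(−i), and reports whether the jackknife is "complete" in the sense that every pseudovalue is finite. The Lynch formula itself is not reproduced; `statistic` is a hypothetical placeholder for it.

```python
import numpy as np


def jackknife_pseudovalues(data, statistic):
    """Pseudovalues n * theta_hat - (n - 1) * theta_hat_(-i), one per sample."""
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    full = statistic(data)
    loo = np.array([statistic(np.delete(data, i, axis=0)) for i in range(n)])
    return n * full - (n - 1) * loo


def jackknife_summary(data, statistic):
    """Return (estimate, standard error, complete) from the pseudovalues."""
    pv = jackknife_pseudovalues(data, statistic)
    complete = bool(np.all(np.isfinite(pv)))   # False if any pseudovalue is undefined
    estimate = float(pv.mean())
    se = float(pv.std(ddof=1) / np.sqrt(pv.size))
    return estimate, se, complete


values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(jackknife_summary(values, np.mean))
```

When the statistic is the sample mean the pseudovalues are the observations themselves, so the summary returns the mean and its usual standard error. For a ratio-type statistic, a single undefined leave-one-out value makes the summary incomplete, which is the situation the abstract describes.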
A small area is an area whose sample is too small for direct estimation. Because the number of surveyed units is limited, direct estimation cannot produce good parameter estimates. An indirect estimation method, empirical Bayes, is therefore used to obtain better estimates. This study compares the mean squared error of the direct estimation method and the empirical Bayes method to find the better method for small areas. The jackknife is used to obtain the mean squared error of the empirical Bayes estimator. The results show that the empirical Bayes method gives better parameter estimates: it produces a smaller mean squared error than direct estimation in small areas.
- by Bagus Oka
Paleontology can provide a deep-time dimension to observations about recent reactions of small mammals to climate change. Obtaining this perspective for voles (Microtus), a common and important constituent of North American mammal communities, has been difficult because species identification based on their dental remains is problematic. Here I demonstrate that geometric morphometrics and discriminant analyses can use commonly fossilized dental features to identify the 5 extant species of Microtus in California: M. californicus (California vole), M. longicaudus (long-tailed vole), M. montanus (montane vole), M. oregoni (Oregon vole), and M. townsendii (Townsend's vole). Analyses of landmarks on the lower 1st molar (m1) provide more accurate identification than those of the 3rd upper molar (M3), and it is important to use jackknife misidentification metrics to assess the precision of discriminant analyses. Addition of semilandmark curves on m1 does not improve accuracy. The utility of these techniques is demonstrated by identifying Microtus specimens from 2 California fossil localities, Pacheco 2 and Prune Avenue, which provides the first evidence for extralimital presence of M. longicaudus at both localities. The presence of M. longicaudus at these low-elevation sites indicates that pronounced geographic range shifts in this species that have been observed in California over the last 100 years also occurred during previous climate changes. Eventually it might be possible to ascertain whether current range shifts are exceeding those that typified responses to past climate changes.
In this article, I discuss the main approaches to resampling variance estimation in complex survey data: balanced repeated replication, the jackknife, and the bootstrap. Balanced repeated replication and the jackknife are implemented in the Stata svy suite. The bootstrap for complex survey data is implemented by the bsweights command. I describe this command and provide working examples.
Abstract. In this article, I discuss the main approaches to resampling variance estimation in complex survey data: balanced repeated replication, the jackknife, and the bootstrap. Balanced repeated replication and the jackknife are... more
Abstract. In this article, I discuss the main approaches to resampling variance estimation in complex survey data: balanced repeated replication, the jackknife, and the bootstrap. Balanced repeated replication and the jackknife are implemented in the Stata svy suite. The bootstrap for complex survey data is implemented by the bsweights command. I describe this command and provide working examples.
The scattered literature on Spearman's footrule and Gini's gamma is surveyed. The following topics are covered: finite-sample moments and asymptotic distribution under independence; large-sample... more
The scattered literature on Spearman's footrule and Gini's gamma is surveyed. The following topics are covered: finite-sample moments and asymptotic distribution under independence; large-sample distribution under arbitrary alternatives; asymptotic relative efficiency for testing independence; consistent asymptotic variance estimation through the jackknife; multivariate generalisations and uses. Complementary results and an extensive bibliography are provided, along with several original illustrations.
Estimating the richness of a community accurately despite differences in sampling effort is key to monitoring highly diverse ecosystems. We compiled a worldwide multitaxa database, comprising 185 communities, to study the relationship between the percentage of species represented by one individual (singletons) and the intensity of sampling (number of individuals divided by the number of species sampled). The database was used to empirically adjust a correction factor that improves the performance of nonparametric estimators under conditions of low sampling effort. The correction factor was tested on seven estimators (Chao1, Chao2, Jack1, Jack2, ACE, ICE and Bootstrap). It reduced the bias of all estimators tested under conditions of undersampling, while converging to the original uncorrected values at higher intensities. Our findings led us to recommend the threshold of 20 individuals/species, or less than 21% of singletons, as a minimu...
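For orientation, two of the uncorrected estimators named above, together with the two sampling diagnostics the abstract defines, can be sketched in Python. The correction factor itself is not given in the abstract and is not reproduced here; the function names and input layouts are assumptions, and the Chao1 shown is the commonly used bias-corrected form.

```python
def chao1(abundances):
    """Bias-corrected Chao1 from per-species individual counts."""
    counts = [a for a in abundances if a > 0]
    s_obs = len(counts)
    f1 = sum(1 for a in counts if a == 1)  # singletons
    f2 = sum(1 for a in counts if a == 2)  # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

def jack1(incidence):
    """First-order jackknife from a samples-by-species 0/1 matrix."""
    m = len(incidence)  # number of sampling units
    n_species = len(incidence[0])
    occurrences = [sum(row[j] for row in incidence) for j in range(n_species)]
    s_obs = sum(1 for o in occurrences if o > 0)
    q1 = sum(1 for o in occurrences if o == 1)  # species found in one unit only
    return s_obs + q1 * (m - 1) / m

def sampling_intensity(abundances):
    """Individuals per observed species, as defined in the abstract."""
    counts = [a for a in abundances if a > 0]
    return sum(counts) / len(counts)

def singleton_share(abundances):
    """Fraction of observed species represented by exactly one individual."""
    counts = [a for a in abundances if a > 0]
    return sum(1 for a in counts if a == 1) / len(counts)
```

Computing `sampling_intensity` and `singleton_share` first gives a quick check of whether a sample falls below the intensity and singleton levels the authors discuss.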
Although attention has been given to obtaining reliable standard errors for the plugin estimator of the Gini index, all standard errors suggested until now are either complicated or quite unreliable. An approximation is derived for the estimator by which it is expressed as a sum of IID random variables. This approximation allows us to develop a reliable standard error that is simple to compute. A simple but effective bias correction is also derived. The quality of inference based on the approximation is checked in a number of simulation experiments, and is found to be very good unless the tail of the underlying distribution is heavy. Bootstrap methods are presented which alleviate this problem except in cases in which the variance is very large or fails to exist. Similar methods can be used to find reliable standard errors of other indices which are not simply linear functionals of the distribution function, such as Sen's poverty index and its modification known as the Sen-Shorrocks-Thon index.
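As a point of comparison, the plug-in Gini estimator can be paired with a delete-one jackknife standard error and bias correction. The Python sketch below uses the jackknife as a stand-in; it is not the IID-sum approximation derived in the paper, and the function names are hypothetical.

```python
import math

def gini(values):
    """Plug-in Gini index of non-negative values with a positive sum."""
    y = sorted(values)
    n = len(y)
    total = sum(y)
    # Rank-weighted sum over the ordered sample (ranks start at 1).
    ranked = sum(rank * v for rank, v in enumerate(y, start=1))
    return 2.0 * ranked / (n * total) - (n + 1) / n

def jackknife_gini(values):
    """Return (estimate, bias-corrected estimate, jackknife standard error).

    `values` must be a list, and every leave-one-out subsample
    must itself have a positive sum.
    """
    n = len(values)
    g_hat = gini(values)
    loo = [gini(values[:i] + values[i + 1:]) for i in range(n)]
    g_bar = sum(loo) / n
    bias_corrected = n * g_hat - (n - 1) * g_bar
    se = math.sqrt((n - 1) / n * sum((g - g_bar) ** 2 for g in loo))
    return g_hat, bias_corrected, se
```

Recomputing the index for every deleted observation costs roughly n sorts, so this naive version suits small samples; like the methods in the abstract, it should not be trusted when the underlying distribution has a heavy tail.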
The fixed effects estimator of panel models can be severely biased because of the well-known incidental parameter problem. It is shown that such bias can be reduced as T grows with n by using an analytical bias correction or by using a panel jackknife. We describe both of these approaches. We consider asymptotics where n and T grow at the same rate as an approximation that allows us to compare bias properties. Under these asymptotics the bias-corrected estimators are centered at the truth, whereas the fixed effects estimator is not. This asymptotic theory shows the bias reduction given by the analytical or jackknife correction.
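The way a delete-one-period panel jackknife removes the leading bias term can be written out in a few lines. This is a sketch under the assumption that, for large n, the fixed effects estimator has bias B/T plus smaller-order terms; the symbols B, \theta_0 and \hat\theta_{(t)} (the estimate computed without period t) are notation introduced here, not taken from the abstract.

```latex
\begin{aligned}
\hat\theta &\approx \theta_0 + \frac{B}{T},
\qquad
\hat\theta_{(t)} \approx \theta_0 + \frac{B}{T-1}, \\
\tilde\theta &= T\,\hat\theta - (T-1)\,\frac{1}{T}\sum_{t=1}^{T}\hat\theta_{(t)}
\approx \bigl(T\theta_0 + B\bigr) - \bigl((T-1)\theta_0 + B\bigr) = \theta_0 .
\end{aligned}
```

The two B terms cancel exactly, so whatever bias remains in \tilde\theta comes from the higher-order terms that the expansion ignores.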
Data from ripening experiments of herring carried out at three Nordic fishery research institutions in the period 1992–1995 were collected and analyzed by multivariate analysis. The experiments were carried out at different times, with different stocks as raw material, using different types of treatments and analyzed in different laboratories. The question considered here is whether these data can be assumed to be one homogeneous set of data pertaining to ripening of salted herring or whether data from different labs, stocks, etc. must be considered independently. This is of importance for further research into ripening processes with these and similar data. It is shown in this paper that all data can be considered as one homogeneous data set. This is verified using resampling where latent structures are compared between different sample sets. This is done indirectly by testing regression models, that have been developed on one sample set, on other sample sets. It is also done directly by monitoring the deviation in latent structure observed between different sample sets. No formal statistical test is developed for whether samples can be assumed to stem from the same population. Although this can easily be envisioned, it was exactly the need for a more intuitive and visual test that prompted this work, developing different exploration tools that visually make it clear how well the data can be assumed to derive from the same population. Subsequently analyzing the data as one homogeneous group provides new information about factors that govern the ripening of salted herring and can be used in new strategic research as well as in industrial practice.
A Monte Carlo study is used to examine the size and power of t tests formed using a variety of estimation procedures appropriate in the context of heteroskedasticity when there are no replicated observations. There are three main results: (1) the ordinary least squares estimator is quite robust with respect to inference; (2) an estimated generalized least squares estimator, formed using a possibly erroneous assumption that the functional form of the heteroskedasticity is multiplicative, has the highest power among the estimators considered, but too large a size; and (3) the advantages of the jackknife do not appear until the degree of heteroskedasticity is unrealistically large.
A verifiable condition for a symmetric statistic to be a two-sample U-statistic is given. As an illustration, we characterize which linear rank statistics with two-sample regression constants are U-statistics. We also show that invariance under jackknifing characterizes a two-sample U-statistic.