Larry Toothaker - Academia.edu
Papers by Larry Toothaker
Educational and Psychological Measurement, Oct 1, 1974
Researchers are often in a dilemma as to whether parametric or nonparametric procedures should be cited when assumptions of the parametric methods are thought to be violated. Therefore, the Kruskal-Wallis test and the ANOVA F-test were empirically compared in terms of probability of a Type I error and power under various patterns of mean differences in combination with patterns of variance inequality and patterns of sample size inequality. The Kruskal-Wallis test was found to be competitive with the ANOVA F-test in terms of alpha but not for power. Power of the Kruskal-Wallis test was grossly affected in all but one situation for nonstepwise mean differences when sample sizes and variances were negatively related and when small levels of significance were utilized. The ANOVA F-test, however, was found to be generally robust for the types of specified mean differences.
British Journal of Mathematical and Statistical Psychology, Nov 1, 1975
The non-parametric Jonckheere procedure for testing ordered alternatives in the k-sample case is described. The empirical Type I error rate and power are estimated for small samples under several conditions, including sampling from either uniform, normal or exponential populations with unequal variances. A large sample approximation is also studied. The estimated Type I error rate is generally within sampling bounds for the exact test; the large sample approximation is quite adequate when alpha is 0.05 or greater. Power values appear to be not only a function of location differences but also of population kurtosis. Unequal variances and unequal sample sizes have some effect on both the probability of a Type I error and the power for the specific situations studied.
Journal of Educational Statistics, 1983
Several methods have been proposed for the analysis of data from single-subject research settings. This research focuses on the modifications of ANOVA-based tests proposed by Shine and Bower, a procedure that precedes the ANOVA F test by preliminary testing of within-phase lag-one serial correlation, and the one-way ANOVA as presented by Gentile, Roden and Klein. Monte Carlo simulation is used to investigate these tests with respect to robustness and power. Each test was analyzed under various patterns of serial correlation, various patterns of phase and trial means, normal and exponential distributions, and equal and unequal phase variances. The findings indicate that the probability of a Type I error for these ANOVA-based tests is seriously inflated by nonzero serial correlation. These tests, therefore, cannot be recommended for use with data that have nonzero serial correlation.
Journal of Educational Statistics, Sep 1, 1994
The ANOVA F and several nonparametric competitors for two-way designs were compared for empirical α and power. Simulation of 2 × 2, 2 × 4, and 4 × 4 designs was done with cell sizes of 5 and 10 when sampling from normal, exponential, and mixed normal distributions. Conservatism of both α and power in the presence of other nonnull effects was seen in the tests due to Puri and Sen (1985) and, to a lesser degree, in the rank transform tests (Conover & Iman, 1981). Tests by McSweeney (1967) and Hettmansperger (1984) had liberal α for some designs and distributions, especially for small n. The ANOVA F suffers from conservative α and power for the mixed normal distribution, but it is generally recommended.
Educational and Psychological Measurement, Dec 1, 1974
The present research compares the ANOVA F-test, the Kruskal-Wallis test, and the normal scores test in terms of empirical alpha and empirical power with samples from the normal distribution and two exponential distributions. Empirical evidence supports the use of the ANOVA F-test even under violation of assumptions when testing hypotheses about means. If the researcher is willing to test hypotheses about medians, the Kruskal-Wallis test was found to be competitive with the F-test. However, in the cases investigated, the normal scores test was not consistently better than the F-test or the Kruskal-Wallis test and could not be recommended on the basis of this research.
Journal of the American Statistical Association, Mar 1, 1978
The Bartlett-Kendall test was investigated to determine the optimum: (1) subsample size (M) for equal nj where nj is evenly divisible; (2) strategy for equal nj where nj is not evenly divisible by number of observations per subsamples, K, K > 1; (3) strategy for unequal nj including two modifications. Results for (1) indicated that a choice of M which
SAGE Publications, Inc. eBooks, Apr 30, 2012
Journal of Educational Statistics, 1980
Extensions of the Kruskal-Wallis procedure for a factorial design are reviewed and researched under various degrees and kinds of nonnullity. It was found that the distributions of these test statistics are a function of effects other than those being tested except under the completely null situation, and their use is discouraged.
SAGE Publications, Inc. eBooks, 1993
Various cases of unequal variances and unequal sizes from a normal and a skewed population were used to empirically obtain the probability of a Type I error and the power for the permutation t-test as compared to Student's t-test and the Mann-Whitney U-test. Results showed differences for different sample sizes, variance ratios, population sampled, and size of mean of the population. The power of the permutation t-test is very close to or greater than that of Student's t-test for both populations, and the power is large if the larger variance accompanied the large mean for the skewed population. (Author) An Empirical Investigation of the Effect of Unequal Variances on the Permutation t-test*
Journal of the American Statistical Association, Sep 1, 1975
The harmonic mean and Kramer [13] unequal nk forms of the Tukey multiple comparison test were compared for observed Type I error and correct decision rates. Sensitivity was evaluated under the restriction that the analysis of variance F-test be significant at α = 0.05, under numerous parametric specifications which represent behavioral research data. The data indicate that both procedures are adversely affected by combining unequal variances with unequal sample sizes and have the same sensitivity for detecting real mean differences.
Chapman and Hall/CRC eBooks, Mar 25, 2015
The Frontier between Knowledge and Ignorance Introduction The Context for Statistics: Science and Research Definition of Statistics The Big Picture: Populations, Samples, and Variables Generalizing from the Sample to the Population Experimental Research Blinding and Randomized Block Design Nonexperimental Research Quasi-Experimental Research Inferences and Kinds of Validity Describing Distributions with Statistics: Middle, Spread, and Skewness Introduction Measures of Location Measures of Spread or Variability Measure of Skewness or Departure from Symmetry Exploring Data Visually Introduction Why Graph Our Data? Pie Charts and Bar Graphs Two Kinds of Dot Plots Scatterplots Histograms Time Plots (Line Graphs) Boxplots Graphs Can Be Misleading Beyond These Graphs Relative Location and Normal Distributions Introduction Standardizing Scores Computing a z Score in a Sample Computing a z Score in a Population Comparing z Scores for Different Variables A Different Kind of Standard Score Distributions and Proportions Areas under the Standard Normal Curve Bivariate Correlation Introduction Pearson's Correlation Coefficient Verbal Definition of Pearson's r Judging the Strength of a Correlation What Most Introductory Statistics Texts Say about Correlation Pearson's r Measures Linear Relationships Only Correlations Can Be Influenced by Outliers Correlations and Restriction of Range Combining Groups of Scores Can Affect Correlations Missing Data Are Omitted from Correlations Pearson's r Does Not Specify Which Variable Is the Predictor Probability and Risk Introduction Relative Frequency of Occurrence Conditional Probability Special Names for Certain Conditional Probabilities Statistics Often Accompanying Sensitivity and Specificity Two Other Probabilities: "And" and "Or" Risk and Relative Risk Other Statistics Associated with Probability Sampling Distributions and
Estimation Introduction Quantifying Variability from Sample to Sample Kinds of Distributions Why We Need Sampling Distributions Comparing Three Distributions: What We Know So Far Central Limit Theorem Unbiased Estimators Standardizing the Sample Mean Interval Estimation Calculating a Confidence Interval Estimate of mu Hypothesis Testing and Interval Estimation Introduction Testable Guesses The Rat Shipment Story Overview of Hypothesis Testing Two Competing Statements about What May Be True Writing Statistical Hypotheses Directional and Nondirectional Alternative Hypotheses Choosing a Small Probability as a Standard Compute the Test Statistic and a Certain Probability Decision Rules When H1 Predicts a Direction Decision Rules When H1 Is Nondirectional Assumptions Testing Hypotheses with Confidence Intervals: Nondirectional H1 Testing Hypotheses with Confidence Intervals: Directional H1 Types of Errors and Power Introduction Possible Errors in Hypothesis Testing Probability of a Type I Error Probability of Correctly Retaining the Null Hypothesis Type I Errors and Confidence Intervals Probability of a Type II Error and Power Factors Influencing Power: Effect Size Factors Influencing Power: Sample Size Factors Influencing Power: Directional Alternative Hypotheses Factors Influencing Power: Significance Level Factors Influencing Power: Variability Factors Influencing Power: Relation to Confidence Intervals One-Sample Tests and Estimates Introduction One-Sample t Test Distribution for Critical Values and p Values Critical Values for the One-Sample t Test Completing the Sleep Quality Example Assumptions Confidence Interval for mu Using One-Sample t Critical Value Graphing Confidence Intervals and Sample Means Two-Sample Tests and Estimates Introduction Pairs of Scores and the Paired t Test Two Other Ways of Getting Pairs of Scores Fun Fact Associated with Paired Means Paired t Hypotheses When Direction Is Not Predicted Paired t Hypotheses When Direction Is Predicted 
Formula for the Paired t Test Confidence Interval for the Difference in Paired Means Comparing Means of Two Independent Groups Independent t Hypotheses When Direction Is Not Predicted Independent t Hypotheses When Direction Is Predicted Formula for the Independent-Samples t Test Assumptions Confidence Intervals for a Difference in Independent Means Limitations on Using the t Statistics in This Chapter Tests and Estimates for Two or More Samples Introduction Going beyond the Independent-Samples t Test Variance between Groups and Within Groups One-Way ANOVA F Test: Logic and Hypotheses Computing the One-Way ANOVA F Test Critical Values and Decision Rules Numeric Example of a One-Way ANOVA F Test Testing the Null Hypothesis Assumptions and Robustness How to Tell Which Group Is Best Multiple Comparison Procedures and Hypotheses Many Statistics Possible for Multiple Comparisons Confidence Intervals in a One-Way ANOVA Design Tests and Estimates for Bivariate Linear Relationships Introduction Hypothesizing about a Correlation Testing a Null Hypothesis about a Correlation Assumptions of Pearson's r Using a…
The area investigated in the present study is the comparison of the permutation t-test with Student's t-test and the Mann-Whitney U-test. The comparison was made for small samples for three distributions, including a normal distribution, a uniform distribution, and a skewed distribution. The properties of each test compared were the probability of a Type I error and the power against a location-shift alternative hypothesis. The present research indicates that the permutation t-test is an acceptable statistical procedure for the two-sample problem for the normal and uniform populations and suggests that it might be more desirable than the traditional Student's t-test when sample sizes are proportional to ... This Technical Report is a doctoral dissertation reporting research supported by the Wisconsin Research and Development Center for Cognitive Learning. Since it has been approved by a University Examining Committee, it has not been reviewed by the Center. It is published by the Center as a record of some of the Center's activities and as a service to the student. The bound original is in The University of Wisconsin Memorial Library.
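As a rough illustration of the kind of procedure studied in the abstract above, the following is a minimal sketch of an exact two-sample permutation test for small samples. It is not the dissertation's own method: the function name `permutation_test` is hypothetical, and the sketch uses the absolute difference in group means as the test statistic, on the assumption that, for a fixed pooled sample, this orders the regroupings the same way as the absolute pooled-variance t statistic.

```python
from itertools import combinations


def permutation_test(x, y):
    """Exact two-sided permutation p-value for a difference in means.

    Hypothetical helper for illustration only. Every way of assigning
    len(x) of the pooled observations to the first group is enumerated,
    so this is practical only for small samples.
    """
    pooled = list(x) + list(y)
    n = len(pooled)
    n_x = len(x)
    n_y = n - n_x
    total = sum(pooled)

    def abs_mean_diff(sum_x):
        # Absolute difference between the two group means, given the
        # sum of the observations assigned to the first group.
        return abs(sum_x / n_x - (total - sum_x) / n_y)

    observed = abs_mean_diff(sum(x))
    extreme = 0
    n_splits = 0
    for idx in combinations(range(n), n_x):
        n_splits += 1
        sum_x = sum(pooled[i] for i in idx)
        # Small tolerance guards against floating-point ties.
        if abs_mean_diff(sum_x) >= observed - 1e-12:
            extreme += 1
    return extreme / n_splits


if __name__ == "__main__":
    # Only 2 of the 20 possible regroupings are as extreme as the data.
    print(permutation_test([1, 2, 3], [4, 5, 6]))  # prints 0.1
```

Repeating such a test over many simulated samples and counting rejections at a chosen alpha gives an empirical Type I error rate (under the null) or empirical power (under a location shift), which is the general style of comparison the abstract describes.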
Psychological Bulletin, Jul 1, 1973
Petrinovich and Hardyck's general conclusions that the Tukey and Scheffé multiple comparison statistics are not powerful tests are challenged. Data collected corroborate the correspondence between the sensitivity of the analysis of variance F test and the contrast that compares the maximum difference in a set of K means.
The Journal of the Operational Research Society, 1994
Multiple Regression: Testing and Interpreting Interactions. Leona S. Aiken, Stephen G. West.