Gregory Francis | Purdue University
Papers by Gregory Francis
PLOS ONE, Apr 4, 2024
Nitric oxide (NO) is involved in a variety of biological functions including blood vessel dilation and neurotransmitter release. In animals, NO has been demonstrated to affect multiple behavioral outcomes, such as memory performance and arousal, whereas this link is less explored in humans. NO is created in the paranasal sinuses and studies show that humming releases paranasal NO to the nasal tract and that NO can then cross the blood-brain barrier. Akin to animal models, we hypothesized that this NO may traverse into the brain and positively affect information processing. In contrast to our hypothesis, an articulatory suppression memory paradigm and a speeded detection task found deleterious effects of humming while performing the task. Likewise, we found no effect of humming on emotional processing of photos. In a fourth experiment, participants hummed before each trial in a speeded detection task, but we again found no effect on response time. In conclusion, either nasal NO does not travel to the brain, or NO in the brain does not have the expected impact on cognitive performance and emotional processing in humans. It remains possible that NO influences other cognitive processes not tested for here.
Learning Materials in Biosciences, 2019
What You Will Learn in This Chapter
In Chap. 3, we examined how to compare the means of two groups. In this chapter, we will examine how to compare means of more than two groups.
6.1 One-Way Independent Measures ANOVA
Suppose we wish to examine the effects of geographic region on tree heights. We might sample trees from near the equator, from the 49th parallel, and from the 60th parallel. We want to know whether the mean tree heights from all three regions are the same (Fig. 6.1). Since we have three regions we cannot use the t-test because the t-test only works if we are comparing two groups.
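As a rough illustration of the comparison described in this chapter excerpt, the sketch below computes the F statistic of a one-way independent-measures ANOVA for three groups. The function name `one_way_anova_f` and the tree heights are hypothetical and are not taken from the book; only the standard between-groups and within-groups sums of squares are used.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way independent-measures ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical tree heights in metres (made-up numbers for illustration only).
equator = [20.0, 22.0, 21.0, 23.0]
parallel_49 = [18.0, 17.0, 19.0, 18.0]
parallel_60 = [12.0, 13.0, 11.0, 12.0]

f_stat, df_b, df_w = one_way_anova_f([equator, parallel_49, parallel_60])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F suggests that at least one regional mean differs from the others; turning F into a p value requires the F distribution, which this sketch leaves out.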
Journal of Vision, Sep 27, 2021
Perception, 2019
In crowding, perception of a target deteriorates in the presence of nearby flankers. In the traditional feedforward framework of vision, only elements within Bouma’s window interfere with the target and adding more elements always leads to stronger crowding. Crowding is usually studied with sparse displays that involve only a few flankers, as too many flankers lead to a combinatorial explosion of display configurations. To deal with these challenges, Van der Burg et al. (2017) proposed a paradigm to measure crowding in dense displays using genetic algorithms. In their study, displays were selected and combined over several generations to maximize human performance. Van der Burg et al. found that only the target’s nearest neighbours affect performance in their displays. Here, we used the same paradigm, but the displays were selected according to the performance of visual crowding models. We found that all models based on the traditional framework of vision tested so far produce results in which all elements within Bouma’s window affect performance in dense displays, contrary to human behavior. The only model that explains the results of Van der Burg et al. has a dedicated grouping process. We conclude that a grouping stage is crucial to understand visual processing.
Frontiers in Psychology, Sep 22, 2016
His primary research interest is to develop and test neural network models of visual perception.
Psychonomic Bulletin & Review, Jan 19, 2021
It is common for conclusions of empirical studies to depend on multiple significant outcomes. This practice may seem reasonable, but it has some unintended effects. In particular, the compound Type I error rate for multiple studies (the likelihood of concluding that an effect exists when it does not) can be much lower than that of the individual studies. This in itself is not a problem since a low Type I error rate is desirable. However, there is also an accompanying drop in power, meaning that the probability of finding support for a true effect is low. Currently, there is no standard statistical method for dealing with the hyperconservative error rate and accompanying low power that results from investigations requiring multiple significant outcomes. Here, we propose a novel solution to this problem: We show that it is sometimes appropriate to reverse the logic of the classic Bonferroni correction and increase the significance criterion in order to maintain an intended compound Type I error rate across multiple tests. This reverse Bonferroni approach dramatically improves statistical power and encourages careful planning of statistical analyses prior to data collection. To avoid adding to the list of questionable research practices that seem to contaminate some psychological research, we suggest that reverse Bonferroni be restricted to situations where authors pre-register their analysis plans.
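The arithmetic behind this abstract can be sketched under one simplifying assumption that the abstract itself does not state: the tests are independent and a conclusion requires all of them to be significant. In that case the compound Type I error rate is the product of the individual rates, and the per-test criterion that preserves an intended compound rate is its k-th root. The function names are hypothetical, and the procedure in the paper may differ in its details.

```python
def compound_type1_rate(alpha, k):
    """Compound Type I error rate when k independent tests must ALL be significant."""
    return alpha ** k


def reverse_bonferroni_criterion(target_alpha, k):
    """Per-test criterion that keeps the compound rate at target_alpha.

    Assumes independent tests that must all reach significance.
    """
    return target_alpha ** (1.0 / k)


k = 2
print(compound_type1_rate(0.05, k))           # compound rate with the usual 0.05 per test
print(reverse_bonferroni_criterion(0.05, k))  # relaxed per-test criterion
```

With two tests, the usual 0.05 criterion gives a compound rate far below 0.05, whereas the relaxed criterion (the square root of 0.05) restores the intended compound rate.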
Attention, Perception, & Psychophysics, 2022
Twenty-five years of research has explored the object-based attention effect using the two-rectangles paradigm and closely related paradigms. While reading this literature, we noticed statistical attributes that are sometimes related to questionable research practices, which can undermine the reported conclusions. To quantify these attributes, we applied the Test for Excess Success (TES) individually to 37 articles that investigate various properties of object-based attention and comprise four or more experiments. A TES analysis estimates the probability that a direct replication of the experiments in a given article with the same sample sizes would have the same success (or better) as the original article. If the probability is low, then readers should be skeptical about the conclusions that are based on those experimental results. We find that 19 of the 37 analyzed articles (51%) seem too good to be true in that they have a replication probability below 0.1. In a new large sample ...
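A heavily simplified sketch of the logic behind a TES-style calculation follows. It assumes that the experiments are independent and that an estimated power (probability of success) is already available for each one; the power values below are hypothetical. A real TES analysis estimates those probabilities from the reported statistics and sample sizes, which this sketch does not attempt. The 0.1 criterion is the one mentioned in the abstract.

```python
from math import prod


def probability_all_succeed(powers):
    """Probability that every experiment succeeds, assuming independence."""
    return prod(powers)


def seems_too_good(powers, criterion=0.1):
    """Flag a set of experiments whose joint success probability is below the criterion."""
    return probability_all_succeed(powers) < criterion


# Hypothetical estimated powers for a four-experiment article.
powers = [0.5, 0.5, 0.5, 0.5]
print(probability_all_succeed(powers))
print(seems_too_good(powers))
```

Even when each experiment has a moderate chance of success, the chance that all of them succeed shrinks quickly as the number of experiments grows, which is why uniformly successful multi-experiment articles can warrant skepticism.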
Recent controversies have questioned the quality of scientific practice in the field of psychology, but these concerns are often based on anecdotes and seemingly isolated cases. To gain a broader perspective, this article applies an objective test for excess success to a large set of articles published in the journal Psychological Science between 2009 and 2012. When empirical studies succeed at a rate much higher than is appropriate for the estimated effects and sample sizes, readers should suspect that unsuccessful findings were suppressed, the experiments or analyses were improper, or that the theory does not properly account for the data. The analyses indicate problems for 82% (36 out of 44) of the articles in Psychological Science that have four or more experiments and could be analyzed.
Like other scientists, psychologists believe experimental replication to be the final arbiter for determining the validity of an empirical finding. Reports in psychology journals often attempt to prove the validity of a hypothesis or theory with multiple experiments that replicate a finding. Unfortunately, these efforts are sometimes misguided because in a field like experimental psychology, ever more successful replication does not necessarily ensure the validity of an empirical finding. When psychological experiments are analyzed with statistics, the rules of probability dictate that random samples should sometimes be selected that do not reject the null hypothesis, even if an effect is real. As a result, it is possible for a set of experiments to have too many successful replications. When there are too many successful replications for a given set of experiments, a skeptical scientist should be suspicious that null or negative findings have been suppressed, the experiments were r...
Van Lier and Vergeer (2008) took first prize in the 2008 Best Visual Illusion of the Year contest by demonstrating an effect in which afterimage colors spread across regions that were not colored in the inducing image. An experiment measuring the effect was recently published by van Lier, Vergeer, and Anstis (2009). The effect is extremely robust and can be experienced with the images in the stimulus column of Figures 1A and 1B. The inducing stimulus is the eight-pointed star whose points alternate spatially between red and cyan. The middle of the star is an achromatic gray color. Fixation of the inducing stimulus for about 1 sec can lead to an interesting afterimage that depends on the properties of the viewing surface. An eye movement from the inducing image to the middle of the first four-pointed outline star produces an afterimage percept of a faintly cyanish star. Significantly, the cyan color is not restricted to the star’s points, but spreads across the region that was an ach...
Francis (2012) concluded that the findings on wishful seeing described in Balcetis and Dunning (2010) appeared to contain publication bias. In a reply, Balcetis and Dunning (2012), henceforth B&D's response, raised a number of interesting observations and claims. We agree on some points and disagree on other points, and there are several issues that remain unclear. So that it can end on a positive note, this rebuttal will start with the disagreements and finish with the agreements.
Areas of disagreement
B&D's response claims Francis (2012) made a false-positive error
Francis (2012) concluded publication bias in Balcetis and Dunning (2010) because the estimated probability that all five experiments would reject the null hypothesis was less than 0.1. Given the use of a seemingly similar criterion for traditional hypothesis testing, it might appear that there would be a false positive rate of 0.1, but the actual false positive rate for the test is much lower.
Simons et al. (2006) described induced scene fading by using low-pass filtered photographs of various natural scenes. With the passage of time, fading gradually became stronger, and removal of scattered disks enhanced the fading effect drastically.
A fundamental characteristic of human visual perception is the ability to group together disparate elements in a scene and treat them as a single unit. The mechanisms by which humans create such groupings remain unknown, but grouping seems to play an important role in a wide variety of visual phenomena, and a good understanding of these mechanisms might provide guidance for how to improve machine vision algorithms. Here, we build on a proposal that some groupings are the result of connections in cortical area V2 that join disparate elements, thereby allowing them to be selected and segmented together. In previous instantiations of this proposal, connection formation was based on the anatomy (e.g., extent) of receptive fields, which made connection formation obligatory when the stimulus conditions stimulate the corresponding receptive fields. We now propose dynamic circuits that provide greater flexibility in the formation of connections and that allow for top-down control of percept...
Based on findings from six experiments, Dallas, Liu & Ubel (2019) concluded that placing calorie labels to the left of menu items influences consumers to choose lower calorie food options. Contrary to previously reported findings, they suggested that calorie labels do influence food choices, but only when placed to the left because they are in this case read first. If true, these findings have important implications for the design of menus and may help address the obesity pandemic. However, an analysis of the reported results indicates that they seem too good to be true. We show that if the effect sizes in Dallas et al. (2019) are representative of the populations, a replication of the six studies (with the same sample sizes) has a probability of only 0.014 of producing uniformly significant outcomes. Such a low success rate suggests that the original findings might be the result of questionable research practices or publication bias. We therefore caution readers and policy makers t...
Consciousness and Cognition, 2019
Recent studies suggest that the accuracy of perceptual judgments can be influenced by the perceived illusory size of a stimulus, with judgments being more accurate for increased illusory size. This phenomenon seems consistent with recent neuroscientific findings that representations in early visual areas reflect the perceived (illusory) size of stimuli rather than the physical size. We further explored this idea with the moon illusion, in which the moon appears larger when it is close to the horizon and smaller when it is higher in the sky. Participants (n = 230) adjusted the orientation of an image of the moon on a smartphone to match the perceived orientation of the moon in the sky. Contrary to previous studies that investigated accuracy and size illusions, we found slightly lower perceptual judgment accuracy when the moon appeared large (close to the horizon) compared to when it appeared small (high in the sky).
Collabra: Psychology, 2019
Dong, Huang, and Zhong (2015) report five successful experiments linking brightness perception with the feeling of hopelessness. They argue that a gloomy future is psychologically represented as darkness, not just metaphorically but as an actual perceptual bias. Based on multiple results, they conclude that people who feel hopeless perceive their environment as darker and therefore prefer brighter lighting than controls. Conversely, dim lighting caused participants to feel more hopeless. However, the experiments succeed at a rate much higher than predicted by the magnitude of the reported effects. Based on the reported statistics, the estimated probability of all five experiments being fully successful, if replicated with the same sample sizes, is less than 0.016. This low rate suggests that the original findings are (perhaps unintentionally) the result of questionable research practices or publication bias. Readers should therefore be skeptical about the original results and conclus...
Frontiers in Psychology, 2016
In response to concerns about the validity of empirical findings in psychology, some scientists use replication studies as a way to validate good science and to identify poor science. Such efforts are resource intensive and are sometimes controversial (with accusations of researcher incompetence) when a replication fails to show a previous result. An alternative approach is to examine the statistical properties of the reported literature to identify some cases of poor science. This review discusses some details of this process for prominent findings about racial bias, where a set of studies seems "too good to be true." This kind of analysis is based on the original studies, so it avoids criticism from the original authors about the validity of replication studies. The analysis is also much easier to perform than a new empirical study. A variation of the analysis can also be used to explore whether it makes sense to run a replication study. As demonstrated here, there are s...
Cognitive Research: Principles and Implications, 2016
In some circumstances, people interact with a virtual keyboard by triggering a binary switch to guide a moving cursor to target characters or items. Such switch keyboards are commonly used by patients with severely restricted motor capabilities. Typing with such systems enables patients to interact with colleagues, but it is slow and error prone. We develop a methodology that can automate an important part of the design process for optimally structured switch keyboards. We show how to optimize the design of simple switch keyboard systems in a way that minimizes the average entry time while satisfying an acceptable error rate. The first step is to model the user's ability to use a switch keyboard correctly for different cursor durations. Once the model is defined, our optimization approach assigns characters to locations on the keyboard, identifies an optimal cursor duration, and considers a variety of cursor paths. For our particular case, we show how to build a user model from ...
Behavior Research Methods, 2016
Recent reform efforts in psychological science have led to a plethora of choices for scientists to analyze their data. A scientist making an inference about their data must now decide whether to report a p value, summarize the data with a standardized effect size and its confidence interval, report a Bayes Factor, or use other model comparison methods. To make good choices among these options, it is necessary for researchers to understand the characteristics of the various statistics used by the different analysis frameworks. Toward that end, this paper makes two contributions. First, it shows that for the case of a two-sample t test with known sample sizes, many different summary statistics are mathematically equivalent in the sense that they are based on the very same information in the data set. When the sample sizes are known, the p value provides as much information about a data set as the confidence interval of Cohen's d or a JZS Bayes factor. Second, this equivalence means that different analysis methods differ only in their interpretation of the empirical data. At first glance, it might seem that mathematical equivalence of the statistics suggests that it does not matter much which statistic is reported, but the opposite is true because the appropriateness of a reported statistic is relative to the inference it promotes. Accordingly, scientists should choose an analysis method appropriate for their scientific investigation.
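One piece of the equivalence described in this abstract can be shown with a short sketch: for an independent two-sample t test with known sample sizes, the t statistic and Cohen's d determine each other through the standard relation d = t * sqrt(1/n1 + 1/n2). The function names are hypothetical, and the links to the p value, confidence interval, and JZS Bayes factor are not implemented here because they require distribution functions beyond this illustration.

```python
from math import sqrt


def cohens_d_from_t(t, n1, n2):
    """Cohen's d implied by an independent two-sample t statistic (pooled SD)."""
    return t * sqrt(1.0 / n1 + 1.0 / n2)


def t_from_cohens_d(d, n1, n2):
    """Invert the relation: recover t from Cohen's d and the sample sizes."""
    return d / sqrt(1.0 / n1 + 1.0 / n2)


# Hypothetical values for illustration only.
t_value, n1, n2 = 2.0, 8, 8
d_value = cohens_d_from_t(t_value, n1, n2)
print(d_value)
print(t_from_cohens_d(d_value, n1, n2))
```

Because the mapping is one-to-one once the sample sizes are fixed, reporting either statistic conveys the same information about the data; the choice between them concerns the inference being promoted rather than the information content.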
PloS one, Apr 4, 2024
Nitric oxide (NO) is involved in a variety of biological functions including blood vessel dilatio... more Nitric oxide (NO) is involved in a variety of biological functions including blood vessel dilation and neurotransmitter release. In animals, NO has been demonstrated to affect multiple behavioral outcomes, such as memory performance and arousal, whereas this link is less explored in humans. NO is created in the paranasal sinuses and studies show that humming releases paranasal NO to the nasal tract and that NO can then cross the blood brain barrier. Akin to animal models, we hypothesized that this NO may traverse into the brain and positively affect information processing. In contrast to our hypothesis, an articulatory suppression memory paradigm and a speeded detection task found deleterious effects of humming while performing the task. Likewise, we found no effect of humming on emotional processing of photos. In a fourth experiment, participants hummed before each trial in a speeded detection task, but we again found no effect on response time. In conclusion, either nasal NO does not travel to the brain, or NO in the brain does not have the expected impact on cognitive performance and emotional processing in humans. It remains possible that NO influences other cognitive processes not tested for here.
Learning materials in biosciences, 2019
What You Will Learn in This Chapter In Chap. 3, we examined how to compare the means of two group... more What You Will Learn in This Chapter In Chap. 3, we examined how to compare the means of two groups. In this chapter, we will examine how to compare means of more than two groups. 6.1 One-Way Independent Measures ANOVA Suppose we wish to examine the effects of geographic region on tree heights. We might sample trees from near the equator, from the 49th parallel, and from the 60th parallel. We want to know whether the mean tree heights from all three regions are the same (Fig. 6.1). Since we have three regions we cannot use the t-test because the t-test only works if we are comparing two groups.
Journal of Vision, Sep 27, 2021
Perception, 2019
In crowding, perception of a target deteriorates in the presence of nearby flankers. In the tradi... more In crowding, perception of a target deteriorates in the presence of nearby flankers. In the traditional feedforward framework of vision, only elements within Bouma’s window interfere with the target and adding more elements always leads to stronger crowding. Crowding is usually studied with sparse displays that involve only a few flankers, as too many flankers lead to a combinatorial explosion of display configurations. To deal with these challenges, Van der Burg et al. (2017) proposed a paradigm to measure crowding in dense displays using genetic algorithms. In their study, displays were selected and combined over several generations to maximize human performance. Van der Burg et al. found that only the target’s nearest neighbours affect performance in their displays. Here, we used the same paradigm, but the displays were selected according to the performance of visual crowding models. We found that all models based on the traditional framework of vision tested so far produce results in which all elements within Bouma’s window affect performance in dense displays, contrary to human behavior. The only model that explains the results of Van der Burg et al. has a dedicated grouping process. We conclude that a grouping stage is crucial to understand visual processing
Frontiers in Psychology, Sep 22, 2016
His primary research interest is to develop and test neural network models of visual perception.
Psychonomic Bulletin & Review, Jan 19, 2021
It is common for conclusions of empirical studies to depend on multiple significant outcomes. Thi... more It is common for conclusions of empirical studies to depend on multiple significant outcomes. This practice may seem reasonable, but it has some unintended effects. In particular, the compound Type I error rate for multiple studies (the likelihood of concluding that an effect exists when it does not) can be much lower than that of the individual studies. This in itself is not a problem since a low Type I error rate is desirable. However, there is also an accompanying drop in power, meaning that the probability of finding support for a true effect is low. Currently, there is no standard statistical method for dealing with the hyperconservative error rate and accompanying low power that results from investigations requiring multiple significant outcomes. Here, we propose a novel solution to this problem: We show that it is sometimes appropriate to reverse the logic of the classic Bonferroni correction and increase the significance criterion in order to maintain an intended compound Type I error rate across multiple tests. This reverse Bonferroni approach dramatically improves statistical power and encourages careful planning of statistical analyses prior to data collection. To avoid adding to the list of questionable research practices that seem to contaminate some psychological research, we suggest that reverse Bonferroni be restricted to situations where authors pre-register their analysis plans.
Attention, Perception, & Psychophysics, 2022
Twenty-five years of research has explored the object-based attention effect using the two-rectan... more Twenty-five years of research has explored the object-based attention effect using the two-rectangles paradigm and closely related paradigms. While reading this literature, we noticed statistical attributes that are sometimes related to questionable research practices, which can undermine the reported conclusions. To quantify these attributes, we applied the Test for Excess Success (TES) individually to 37 articles that investigate various properties of object-based attention and comprise four or more experiments. A TES analysis estimates the probability that a direct replication of the experiments in a given article with the same sample sizes would have the same success (or better) as the original article. If the probability is low, then readers should be skeptical about the conclusions that are based on those experimental results. We find that 19 of the 37 analyzed articles (51%) seem too good to be true in that they have a replication probability below 0.1. In a new large sample ...
Recent controversies have questioned the quality of scientific practice in the field of psycholog... more Recent controversies have questioned the quality of scientific practice in the field of psychology, but these concerns are often based on anecdotes and seemingly isolated cases. To gain a broader perspective, this article applies an objective test for excess success to a large set of articles published in the journal Psychological Science between 2009-2012. When empirical studies succeed at a rate much higher than is appropriate for the estimated effects and sample sizes, readers should suspect that unsuccessful findings were suppressed, the experiments or analyses were improper, or that the theory does not properly account for the data. The analyses conclude problems for 82 % (36 out of 44) of the articles in Psychological Science that have four or more experiments and could be analyzed.
Psychonomic Bulletin & Review, 2021
It is common for conclusions of empirical studies to depend on multiple significant outcomes. Thi... more It is common for conclusions of empirical studies to depend on multiple significant outcomes. This practice may seem reasonable, but it has some unintended effects. In particular, the compound Type I error rate for multiple studies (the likelihood of concluding that an effect exists when it does not) can be much lower than that of the individual studies. This in itself is not a problem since a low Type I error rate is desirable. However, there is also an accompanying drop in power, meaning that the probability of finding support for a true effect is low. Currently, there is no standard statistical method for dealing with the hyperconservative error rate and accompanying low power that results from investigations requiring multiple significant outcomes. Here, we propose a novel solution to this problem: We show that it is sometimes appropriate to reverse the logic of the classic Bonferroni correction and increase the significance criterion in order to maintain an intended compound Type I error rate across multiple tests. This reverse Bonferroni approach dramatically improves statistical power and encourages careful planning of statistical analyses prior to data collection. To avoid adding to the list of questionable research practices that seem to contaminate some psychological research, we suggest that reverse Bonferroni be restricted to situations where authors pre-register their analysis plans.
Like other scientists, psychologists believe experimental replication to be the final arbiter for... more Like other scientists, psychologists believe experimental replication to be the final arbiter for determining the validity of an empirical finding. Reports in psychology journals often attempt to prove the validity of a hypothesis or theory with multiple experiments that replicate a finding. Unfortunately, these efforts are sometimes misguided because in a field like experimental psychology, ever more successful replication does not necessarily ensure the validity of an empirical finding. When psychological experiments are analyzed with statistics, the rules of probability dictate that random samples should sometimes be selected that do not reject the null hypothesis, even if an effect is real. As a result, it is possible for a set of experiments to have too many successful replications. When there are too many successful replications for a given set of experiments, a skeptical scientist should be suspicious that null or negative findings have been suppressed, the experiments were r...
Van Lier and Vergeer (2008) took first prize in the 2008 Best Visual Illusion of the Year contest... more Van Lier and Vergeer (2008) took first prize in the 2008 Best Visual Illusion of the Year contest by demonstrating an effect in which afterimage colors spread across regions that were not colored in the inducing image. An experiment measuring the effect was recently published by van Lier, Vergeer, and Anstis (2009). The effect is extremely robust and can be experienced with the images in the stimulus column of Figures 1A and 1B. The inducing stimulus is the eight-pointed star whose points alternate spatially between red and cyan. The middle of the star is an achromatic gray color. Fixation of the inducing stimulus for about 1 sec can lead to an interesting afterimage that depends on the properties of the viewing surface. An eye movement from the inducing image to the middle of the first four-pointed outline star produces an afterimage percept of a faintly cyanish star. Significantly, the cyan color is not restricted to the star’s points, but spreads across the region that was an ach...
Francis (2012) concluded that the findings on wishful seeing described in Balcetis and Dunning (2... more Francis (2012) concluded that the findings on wishful seeing described in Balcetis and Dunning (2010) appeared to contain publication bias. In a reply, Balcetis and Dunning (2012), henceforth B&D's response, raised a number of interesting observations and claims. We agree on some points and disagree on other points, and there are several issues that remain unclear. So that it can end on a positive note, this rebuttal will start with the disagreements and finish with the agreements. Areas of disagreement B&D's response claims Francis (2012) made a false-positive error Francis (2012) concluded publication bias in Balcetis and Dunning (2010) because the estimated probability that all five experiments would reject the null hypothesis was less than 0.1. Given the use of a seemingly similar criterion for traditional hypothesis testing, it might appear that there would be a false positive rate of 0.1, but the actual false positive rate for the test is much lower.
Simons et al. (2006) described induced scene fading by using low-pass filtered photographs of various natural scenes. With the passage of time, fading gradually became stronger, and removal of scattered disks enhanced the fading effect drastically.
A fundamental characteristic of human visual perception is the ability to group together disparate elements in a scene and treat them as a single unit. The mechanisms by which humans create such groupings remain unknown, but grouping seems to play an important role in a wide variety of visual phenomena, and a good understanding of these mechanisms might provide guidance for how to improve machine vision algorithms. Here, we build on a proposal that some groupings are the result of connections in cortical area V2 that join disparate elements, thereby allowing them to be selected and segmented together. In previous instantiations of this proposal, connection formation was based on the anatomy (e.g., extent) of receptive fields, which made connection formation obligatory when the stimulus conditions stimulate the corresponding receptive fields. We now propose dynamic circuits that provide greater flexibility in the formation of connections and that allow for top-down control of percept...
Based on findings from six experiments, Dallas, Liu & Ubel (2019) concluded that placing calorie labels to the left of menu items influences consumers to choose lower calorie food options. Contrary to previously reported findings, they suggested that calorie labels do influence food choices, but only when placed to the left because they are in this case read first. If true, these findings have important implications for the design of menus and may help address the obesity pandemic. However, an analysis of the reported results indicates that they seem too good to be true. We show that if the effect sizes in Dallas et al. (2019) are representative of the populations, a replication of the six studies (with the same sample sizes) has a probability of only 0.014 of producing uniformly significant outcomes. Such a low success rate suggests that the original findings might be the result of questionable research practices or publication bias. We therefore caution readers and policy makers t...
Consciousness and Cognition, 2019
Recent studies suggest that the accuracy of perceptual judgments can be influenced by the perceived illusory size of a stimulus, with judgments being more accurate for increased illusory size. This phenomenon seems consistent with recent neuroscientific findings that representations in early visual areas reflect the perceived (illusory) size of stimuli rather than the physical size. We further explored this idea with the moon illusion, in which the moon appears larger when it is close to the horizon and smaller when it is higher in the sky. Participants (n = 230) adjusted the orientation of an image of the moon on a smartphone to match the perceived orientation of the moon in the sky. Contrary to previous studies that investigated accuracy and size illusions, we found slightly lower perceptual judgment accuracy when the moon appeared large (close to the horizon) compared to when it appeared small (high in the sky).
Collabra: Psychology, 2019
Dong, Huang, and Zhong (2015) report five successful experiments linking brightness perception with the feeling of hopelessness. They argue that a gloomy future is psychologically represented as darkness, not just metaphorically but as an actual perceptual bias. Based on multiple results, they conclude that people who feel hopeless perceive their environment as darker and therefore prefer brighter lighting than controls. Conversely, dim lighting caused participants to feel more hopeless. However, the experiments succeed at a rate much higher than predicted by the magnitude of the reported effects. Based on the reported statistics, the estimated probability of all five experiments being fully successful, if replicated with the same sample sizes, is less than 0.016. This low rate suggests that the original findings are (perhaps unintentionally) the result of questionable research practices or publication bias. Readers should therefore be skeptical about the original results and conclus...
Frontiers in psychology, 2016
In response to concerns about the validity of empirical findings in psychology, some scientists use replication studies as a way to validate good science and to identify poor science. Such efforts are resource intensive and are sometimes controversial (with accusations of researcher incompetence) when a replication fails to show a previous result. An alternative approach is to examine the statistical properties of the reported literature to identify some cases of poor science. This review discusses some details of this process for prominent findings about racial bias, where a set of studies seems "too good to be true." This kind of analysis is based on the original studies, so it avoids criticism from the original authors about the validity of replication studies. The analysis is also much easier to perform than a new empirical study. A variation of the analysis can also be used to explore whether it makes sense to run a replication study. As demonstrated here, there are s...
Cognitive research: principles and implications, 2016
In some circumstances, people interact with a virtual keyboard by triggering a binary switch to guide a moving cursor to target characters or items. Such switch keyboards are commonly used by patients with severely restricted motor capabilities. Typing with such systems enables patients to interact with colleagues, but it is slow and error prone. We develop a methodology that can automate an important part of the design process for optimally structured switch keyboards. We show how to optimize the design of simple switch keyboard systems in a way that minimizes the average entry time while satisfying an acceptable error rate. The first step is to model the user's ability to use a switch keyboard correctly for different cursor durations. Once the model is defined, our optimization approach assigns characters to locations on the keyboard, identifies an optimal cursor duration, and considers a variety of cursor paths. For our particular case, we show how to build a user model from ...
Behavior Research Methods, 2016
Recent reform efforts in psychological science have led to a plethora of choices for scientists to analyze their data. A scientist making an inference about their data must now decide whether to report a p value, summarize the data with a standardized effect size and its confidence interval, report a Bayes Factor, or use other model comparison methods. To make good choices among these options, it is necessary for researchers to understand the characteristics of the various statistics used by the different analysis frameworks. Toward that end, this paper makes two contributions. First, it shows that for the case of a two-sample t test with known sample sizes, many different summary statistics are mathematically equivalent in the sense that they are based on the very same information in the data set. When the sample sizes are known, the p value provides as much information about a data set as the confidence interval of Cohen's d or a JZS Bayes factor. Second, this equivalence means that different analysis methods differ only in their interpretation of the empirical data. At first glance, it might seem that mathematical equivalence of the statistics suggests that it does not matter much which statistic is reported, but the opposite is true because the appropriateness of a reported statistic is relative to the inference it promotes. Accordingly, scientists should choose an analysis method appropriate for their scientific investigation. A
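The equivalence described in this abstract can be partly illustrated with a minimal sketch for an independent two-sample t test with known sample sizes: a two-sided p value determines |t|, and t determines Cohen's d. This is an illustration under stated assumptions, not the paper's own code. The function names are hypothetical, the p value is obtained by simple numerical integration of the t density, and the confidence interval of d and the JZS Bayes factor are not covered.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p value via Simpson's rule on the density from 0 to |t|."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def abs_t_from_p(p, df, upper=50.0):
    """Invert two_sided_p by bisection; only |t| is recoverable from p."""
    lower = 0.0
    for _ in range(100):
        mid = (lower + upper) / 2
        if two_sided_p(mid, df) > p:
            lower = mid
        else:
            upper = mid
    return (lower + upper) / 2


def cohens_d_from_t(t, n1, n2):
    """Cohen's d (pooled SD) implied by an independent two-sample t value."""
    return t * math.sqrt(1 / n1 + 1 / n2)


# Hypothetical example: t = 2.5 with 20 observations per group.
n1, n2 = 20, 20
df = n1 + n2 - 2
t_value = 2.5
p_value = two_sided_p(t_value, df)
t_recovered = abs_t_from_p(p_value, df)
d_value = cohens_d_from_t(t_recovered, n1, n2)
print(f"p = {p_value:.4f}, recovered |t| = {t_recovered:.4f}, d = {d_value:.4f}")
```

Because each quantity can be computed from any of the others once the sample sizes are fixed, they carry the same information about the data; what differs is the inference each one is used to support.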