Experimental Basis of Special Relativity

By Tom Roberts and Siegmar Schleif, 2007.
HTML/CSS coding and copyediting: John Dlugosz, 2007.

There has been a renaissance in tests of special relativity (SR), in part because considerations of quantum gravity imply that SR may well be violated at appropriate scales (very small distance, very high energy).

  1. Introduction
  2. Early Experiments (Pre 1905)
    Roentgen, Eichenwald, Wilson, Rayleigh, Arago, Fizeau, Hoek, Bradley, Airy.
  3. Tests of Einstein's Two Postulates
  4. Tests of Time Dilation and Transverse Doppler Effect
  5. Tests of the Twin Paradox
  6. Tests of Relativistic Dynamics
  7. Tests of Length Contraction
    • Magnetic Force
  8. Recent Tests of CPT and Lorentz Invariance
    • Many Ingenious and Precise experiments of different types
  9. Other Experiments
  10. Experiments that Apparently are NOT Consistent with SR/GR
  11. Acknowledgments

1. Introduction

Physics is an experimental science, and as such the experimental basis for any physical theory is extremely important. The relationship between theory and experiments in modern science is a multi-edged sword:

  1. It is required that the theory not be refuted by any undisputed experiment within the theory's domain of applicability.
  2. It is expected that the theory be confirmed by a number of experiments that:
    • cover a significant fraction of the theory's domain of applicability.
    • examine a significant fraction of the theory's predictions.

At present, special relativity (SR) meets all of these requirements and expectations. Literally hundreds of experiments have tested SR, with an enormous range and diversity, and the agreement between theory and experiment is excellent. There is a lot of redundancy in these experimental tests. Also, many indirect tests of SR are not included here. This list of experiments is by no means complete!

Other than their sheer numbers, the most striking thing about these experimental tests of SR is their remarkable breadth and diversity. An important aspect of SR is its universality—it applies to all known physical phenomena and not just to the electromagnetic phenomena it was originally invented to explain. In these experiments you will find tests using electromagnetic and nuclear measurements (including both strong and weak interactions). Gravitational tests are the province of general relativity, and are not considered here.

There are several useful surveys of the experimental basis of SR:

Zhang's book is especially comprehensive. The LivingReviews article goes into considerable detail relating current theoretical ideas to experimental tests.

Textbooks with good summaries of the experimental basis of relativity are:

Albert Einstein introduced the world to special relativity in his seminal 1905 paper: A. Einstein, "Zur Elektrodynamik bewegter Körper", Ann. d. Physik, 17, 1905 ("On the Electrodynamics of Moving Bodies"). It is available in several forms:

Note, however, that SR is not perfect (in agreement with every experiment), and there are some experiments that are in disagreement with its predictions. See Experiments that Apparently are not Consistent with SR where some of these experiments are referenced and discussed. It is clear that most, if not all, of these experiments have difficulties that are unrelated to SR. Note also that few if any standard references or textbooks even mention the possibility that some experiments might be inconsistent with SR, and there are also aspects of publication bias in the literature. That being said, as of this writing there are no reproducible and generally accepted experiments that are inconsistent with SR, within its domain of applicability.

Technically, the basis of SR is Lorentz invariance, and many recent articles phrase it that way. This is closely related to the CPT theorem, and many of the recent experiments apply to both Lorentz invariance and CPT. Recently there have been conferences on Lorentz invariance and CPT violations:

Much of the renewed interest in testing SR has come from considerations of quantum gravity, which imply that at a suitable scale (very small distance, very high energy) SR might well be violated. Here are some review articles (ordered less to more technical):

Domain of Applicability

The domain of applicability of a physical theory is the set of physical situations in which the theory is valid. For SR, this is measurements of distance, time, momentum, energy, etc. in inertial frames. Calculus can be used to apply SR in accelerated systems. A more technical definition is that SR is valid only in flat Lorentz manifolds topologically equivalent to R^4.

In particular, any experiment in which the effects of gravitation are important is outside the domain of SR. Because SR is the local limit of general relativity, it is possible to compute how large an error is made when one applies SR to a situation that is approximately but not exactly inertial, such as the common case of experimental apparatus supported against gravity on Earth's surface. In many cases (e.g. most optical and elementary-particle experiments on the rotating Earth's surface) these errors are vastly smaller than the experimental resolution, and SR can be accurately applied.

Test Theories of SR

A test theory of SR is a generalization of the Lorentz transforms of SR using additional parameters. One can then analyze experiments using the test theory (rather than SR itself) and fit the parameters of the test theory to the experimental results. If the fitted parameter values differ significantly from the values corresponding to SR, then the experiment is inconsistent with SR. But more normally, such fits can show how well a given experiment confirms or disagrees with SR, and what the experimental accuracy is for doing so. This gives a general and tractable method of analysis which can be common to multiple experiments.

Different test theories differ in their assumptions about what form the transform equations could reasonably take. There are at present four test theories of SR:

Zhang discusses their interrelationships and presents a unified test theory encompassing the other three, but with a better and more interpretable parametrisation. His discussion implies that there will be no more test theories of SR that are not reducible to one of the first three.

Robertson showed that one can unambiguously deduce the Lorentz transform of SR to an accuracy of ~0.1% from the following three experiments: Michelson and Morley, Kennedy and Thorndike, Ives and Stilwell. Zhang showed that modern experiments determine the Lorentz transforms to within a few parts per million.

These test theories can also be used to examine potential alternative theories to SR; such alternative theories predict particular values of the parameters of the test theory, which can easily be compared to values determined by experiments analyzed with the test theory. The existing experiments put rather strong experimental constraints on any alternative theory.

In particular, Zhang showed that these experimental limits essentially require that any theory based upon the existence of an aether be experimentally indistinguishable from SR, and have an aether frame which is unobservable (the only alternative is for a theory to "live in the error bars" of the experiments, which is quite difficult given the high accuracies achieved by many of these experiments). Note also that some of the parameters in these test theories are not determined at all by SR (or by experiments)—this means that many different theories, characterized by different values of such parameters, are equivalent to SR in that they are experimentally indistinguishable from SR (though they differ from SR in other aspects).

In addition there is the "standard model extension (SME)" of Colladay and Kostelecký that extends the standard model of particle physics with various plausible Lorentz-violating terms. This is in the context of quantum field theory, and is well beyond the scope of this article (which is limited to SR and its immediate consequences). The goal of many of the recent tests is to put limits on the many parameters of this test theory. Colladay and Kostelecký, Phys. Rev. D 55 (1997) pg 6760 (arxiv:hep-ph/9703464), and Phys. Rev. D 58, 116002 (1998) (arxiv:hep-ph/9809521).

Optical Extinction

Many measurements of the speed of light involve the passage of the light through some material medium. This can invalidate some conclusions of the measurement due to the extinction theorem of Ewald and Oseen. This theorem states that the speed of light will approach the speed c/n relative to the medium (n is its index of refraction), and it also determines how long a path length is required for that approach. The distance required depends strongly on the index of refraction of the medium and the wavelength: for visible light and optical glass it is less than a micron, for air it is about a millimetre, and for the intergalactic medium it is several light years. So even astronomical observations over vast distances in the "vacuum" of outer space are not immune from the effects of this theorem. Note this theorem is based purely on classical electrodynamics, and for gamma rays detected as individual particles it does not apply; it is also not clear how it would apply to theories other than SR and classical electrodynamics. See for instance: J.G. Fox, Am. J. Phys. 30, pg 297 (1962), JOSA 57, pg 967 (1967), and AJP 33, pg 1 (1964). An elementary discussion is given in Ballenegger and Weber, AJP 67, pg 599 (1999). The standard reference for this is Born and Wolf, Principles of Optics, and the original paper is Oseen, Ann. der Physik 48, pg 1, 1915.

2. Early Experiments (Pre 1905)

The special theory of relativity (SR) was invented in 1905 by Einstein to explain several experimental results. Since then it has been found to explain a wide range of experimental results. SR is not a mathematical game or just a hypothesis. SR is a physical theory that has been well tested many times.

A detailed account of the early history of SR is given in: Arthur I. Miller, Albert Einstein's Special Theory of Relativity: Emergence and early interpretation, Addison Wesley, 1981, ISBN 0-201-04680-6.

When A. Einstein wrote his famous paper: "The Electrodynamics of Moving Bodies" in 1905, he already had experimental support for his new theory:

...Examples of this sort, together with the unsuccessful attempts to discover any motion of Earth relative to the "light medium", suggest that the phenomena of electrodynamics as well as of mechanics possess no properties corresponding to the idea of absolute rest. They suggest rather that, as has already been shown to the first order of small quantities, the same laws of electrodynamics and optics will be valid for all frames of reference for which the equations of mechanics hold good...

What was the experimental support for this claim? There were several experiments concerning the electrodynamics of moving bodies that are not very well known today; but Einstein knew of numerous examples. Many of these experiments were reviewed in H.A. Lorentz's important paper "On the influence of the earth's motion on luminiferous phenomena", Versl. Kon. Akad. Wettensh. Amsterdam, 297 (1886). Lorentz showed that Stokes' theory of light, which assumed complete dragging of the aether at Earth's surface and decreasing to zero dragging far away, had severe problems with aberration and the results of Arago and Airy.

3. Tests of Einstein's two Postulates

  1. The laws by which the states of physical systems undergo change are not affected, whether these changes of state be referred to the one or the other of two systems of coordinates in uniform translatory motion.
  2. Any ray of light moves in the "stationary" system of coordinates with determined speed c, whether the ray be emitted by a stationary or by a moving body.

Einstein, Ann. d. Physik 17 (1905); translated by Perrett and Jeffery; reprinted in: Einstein, Lorentz, Weyl, Minkowski, The Principle of Relativity, Dover 1952.

"Stationary" was defined in the first paragraph of this section:

Let us take a system of coordinates in which the equations of newtonian mechanics hold good. In order to render our presentation more precise and to distinguish this system of coordinates verbally from others that will be introduced hereafter, we call it the "stationary system".

Ibid.

It is clear that the word "stationary" is used merely as a label, and implies no "absolute" aspects at all.

[Editor's note: The phrase "stationary system" here denotes an inertial frame. By a "system of coordinates", Einstein ultimately means an inertial frame. A distinction should be made between frames (which embody physics) and coordinates (which embody maths). Because Einstein allocated distinct coordinates to distinct frames, he used the terms interchangeably, and they are still used interchangeably by most physicists. But it's important to realise that they are different entities. This is also true when doing calculations in, say, projectile motion, where we must include several frames and coordinate systems in one calculation. There, it's well known that interchanging "frame" and "coordinates" leads to great confusion.]

3.1 Round-Trip Tests of Light-Speed Isotropy

The speed of light is said to be isotropic if it has the same value when measured in any/every direction.

The Michelson–Morley Experiment (the MMX)

The Michelson–Morley experiment (MMX) was intended to measure Earth's velocity relative to the "luminiferous aether" which was at the time presumed to carry electromagnetic phenomena. The failure of it and the other early experiments to actually observe Earth's motion through the aether became significant in promoting the acceptance of Einstein's theory of special relativity, as it was appreciated from early on that Einstein's approach (via symmetry) was more elegant and parsimonious of assumptions than were other approaches (e.g. those of Maxwell, Hertz, Stokes, Fresnel, Lorentz, Ritz, and Abraham).

The following table comes from R.S. Shankland et al., Rev. Mod. Phys. 27 no. 2, pg 167–178 (1955), which includes references to each experiment (resolution and the limit on v_aether are from the original sources). The expected fringe shift is what would be expected for a rigid aether at rest with respect to the Sun, given Earth's orbital speed (~30 km/s).

Name                                    Year     Arm length  Fringe shift        Experimental  Upper limit
                                                 (metres)    expected  measured  resolution    on v_aether
                                                                                 (see note)
Michelson                               1881     1.2         0.04      0.02      -             -
Michelson + Morley                      1887     11.0        0.4       <0.01     -             8 km/s
Morley + Miller                         1902–04  32.2        1.13      0.015     -             -
Miller                                  1921     32.0        1.12      0.08      -             -
Miller                                  1923–24  32.0        1.12      0.03      -             -
Miller (Sunlight)                       1924     32.0        1.12      0.014     -             -
Tomascheck (Starlight)                  1924     8.6         0.3       0.02      -             -
Miller                                  1925–26  32.0        1.12      0.088     -             -
Miller (re-analysis in 2006, see note)  1925–29  32.0        1.12      0.000     0.015         6 km/s
Kennedy (Mt Wilson)                     1926     2.0         0.07      0.002     -             -
Illingworth                             1927     2.0         0.07      0.0002    0.0006        1 km/s
Piccard + Stahel (Mt Rigi)              1927     2.8         0.13      0.006     -             -
Michelson et al.                        1929     25.9        0.9       0.01      -             -
Joos                                    1930     21.0        0.75      0.002     -             -

Note: before about 1950 it was common to not perform a detailed error analysis, and to not report error bars or resolutions.

Note: the re-analysis of Miller's 1925–29 results is: T.J. Roberts, "An Explanation of Dayton Miller's Anomalous 'Ether Drift' Results", arXiv:physics/0608238. There is more discussion of this below.
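The "expected" column of the table can be roughly reproduced from the arm length alone. The following is a minimal sketch, assuming the textbook rigid-aether prediction for the fringe shift on a 90° rotation, 2Lv²/(λc²), with L taken as the tabulated arm length. The wavelength of 570 nm is an assumption chosen for illustration (it is not stated in the table), and the function name is hypothetical.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def expected_fringe_shift(arm_length_m, wavelength_m=5.7e-7, v_km_s=30.0):
    """Fringe shift predicted by a rigid aether for a 90-degree rotation.

    arm_length_m : arm length in metres, as tabulated
    wavelength_m : ASSUMED wavelength of the light (not given in the table)
    v_km_s       : speed through the aether, here Earth's orbital speed
    """
    beta = v_km_s / C_KM_S
    return 2.0 * arm_length_m * beta**2 / wavelength_m


# Michelson (1881), Michelson + Morley (1887), and the Miller apparatus
for arm in (1.2, 11.0, 32.0):
    print(arm, round(expected_fringe_shift(arm), 3))
```

With this assumed wavelength the results land close to the tabulated 0.04, 0.4 and 1.12; rows that differ more noticeably presumably used a different wavelength or light source.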

See also: Brillet and Hall.

The Kennedy–Thorndike Experiment

See also: Hils and Hall.

Modern Laser / Maser Tests of Light-Speed Isotropy

Other Experiments

3.2 One-Way Tests of Light-Speed Isotropy

Note that while these experiments clearly use a one-way light path and find isotropy, they are inherently unable to rule out a large class of theories in which the one-way speed of light is anisotropic. These theories share the property that the round-trip speed of light is isotropic in any inertial frame, but the one-way speed is isotropic only in an aether frame. In all of these theories the effects of slow clock transport exactly offset the effects of the anisotropic one-way speed of light (in any inertial frame), and all are experimentally indistinguishable from SR. All of these theories predict null results for these experiments. See Test Theories above, especially Zhang (in which these theories are called "Edwards frames").

3.3 Tests of Light Speed from Moving Sources

If the light emitted from a source moving with velocity v toward the observer has a speed c + kv in the observer's frame, then these experiments place a limit on k. Many but not all of these experiments are subject to criticism due to Optical Extinction.

Experiments Using Cosmological Sources

Experiments Using Terrestrial Sources

3.4 Measurements of the Speed of Light, and Other Limits on it

In 1983 the international standard for the metre was redefined in terms of the definition of the second and a defined value for the speed of light. The defined value was chosen to be as consistent as possible with the earlier metrological definitions of the metre and the second. Since then it is not possible to measure the speed of light using the current metrological standards, but one can still measure any anisotropy in its speed, or use an earlier definition of the metre if necessary.

Limits on Velocity Variations with Frequency

Limits on the Photon Mass

See also the Particle Data Group's summary on "Gauge and Higgs Bosons". As of July 2007, their reported limit on the photon mass is 6 × 10^−17 eV/c^2.

3.5 Tests of the Principle of Relativity and Lorentz Invariance

Einstein's first postulate, the principle of relativity (PoR), essentially states that the laws of physics do not vary for different inertial frames. Most if not all of the tests of his second postulate (the isotropy experiments above) could also be placed in this section, as could those in the following section on the isotropy of space.

The Trouton–Noble Experiment

Other Experiments

3.6 Tests of the Isotropy of Space

See also Brillet and Hall.

Recent High-Resolution Tests using Cavities

4. Tests of Time Dilation and Transverse Doppler Effect

The Doppler effect is the observed variation in frequency of a source when it is observed by a detector that is moving relative to the source. This effect is most pronounced when the source is moving directly toward or away from the detector, and in pre-relativity physics its value was zero for transverse motion (motion perpendicular to the source–detector line). In SR there is a non-zero Doppler effect for transverse motion, due to the relative time dilation of the source as seen by the detector. Measurements of Doppler shifts for sources moving with velocities approaching c can test the validity of SR's prediction for such observations, which differs significantly from classical predictions; the experiments support SR and are in complete disagreement with non-relativistic predictions.
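The difference between the two predictions can be made concrete with a short sketch. It assumes the standard SR expression f_obs/f_src = 1/(γ(1 − β cos θ)), where θ is the angle between the source velocity and the line of sight to the detector, measured in the detector's frame. The "classical" comparison simply drops the factor γ, which is an assumed simplification (a source moving through a medium in which the detector is at rest). The function names are hypothetical.

```python
import math


def relativistic_doppler_ratio(beta, theta):
    """f_observed / f_source in SR for source speed beta = v/c.

    theta is the angle between the source velocity and the direction from
    source to detector, measured in the detector frame (0 = approaching).
    """
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return 1.0 / (gamma * (1.0 - beta * math.cos(theta)))


def classical_doppler_ratio(beta, theta):
    """Same geometry without time dilation (ASSUMED pre-relativity model)."""
    return 1.0 / (1.0 - beta * math.cos(theta))


beta = 0.6
transverse = math.pi / 2
print(classical_doppler_ratio(beta, transverse))     # no shift classically
print(relativistic_doppler_ratio(beta, transverse))  # shifted by 1/gamma in SR
```

At θ = 90° the classical ratio is 1 while the SR ratio is 1/γ, so a transverse measurement isolates the time-dilation factor.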

Review Article

The Ives and Stilwell Experiment

See also Mandelberg and Witten.

Measurements of Particle Lifetimes

Doppler Shift Measurements

5. Tests of the "Twin Paradox"

The "twin paradox" occurs when two clocks are synchronized, separated, and rejoined. If one clock remains in an inertial frame, then the other must be accelerated sometime during its journey, and it displays less elapsed proper time than the inertial clock. This is a paradox only in that it appears to be inconsistent but is not.
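As a minimal sketch of the arithmetic, the elapsed proper time of a clock can be summed over segments of constant speed as Δτ = Δt·√(1 − β²), with Δt measured in the inertial frame of the stay-at-home clock. This idealisation assumes instantaneous turnarounds and that acceleration itself has no effect on the tick rate (the clock postulate); the function name and the numbers are illustrative only.

```python
import math


def elapsed_proper_time(segments):
    """Proper time along a path made of constant-speed segments.

    segments: iterable of (coordinate_duration, beta) pairs, where
    coordinate_duration is measured in the inertial frame of the
    stay-at-home clock and beta = v/c during that segment.
    """
    return sum(dt * math.sqrt(1.0 - beta * beta) for dt, beta in segments)


# Illustrative trip: 5 years out and 5 years back at 0.8c (inertial-frame time).
stay_at_home = elapsed_proper_time([(10.0, 0.0)])
traveller = elapsed_proper_time([(5.0, 0.8), (5.0, 0.8)])
print(stay_at_home, traveller)
```

For these assumed values the travelling clock accumulates six years against ten for the inertial clock.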

The Clock Postulate

The clock postulate states that the tick rate of a clock when measured in an inertial frame depends only upon its speed in that frame, and is independent of its acceleration or higher derivatives. The experiment of Bailey et al. referenced above stored muons in a magnetic storage ring and measured their lifetime. While being stored in the ring they were subject to a proper acceleration of approximately 10^18 g (1 g = 9.8 m/s^2). The observed agreement between the lifetime of the stored muons with that of constant-velocity muons with the same energy partly confirms the clock postulate for accelerations of that magnitude. We must say "partly" here, because these accelerations were centripetal: that is, they were perpendicular to the muons' velocity, and contained no component parallel to that velocity. Thus, using this restricted type of acceleration only partly tested the clock postulate.
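The order of magnitude of that acceleration can be checked with a short sketch. The Lorentz factor (about 29.3), ring radius (about 7 m) and muon rest lifetime (about 2.197 μs) used here are assumed, commonly quoted values that do not appear in the text above; the proper acceleration for uniform circular motion is taken as γ²v²/r.

```python
C = 299_792_458.0   # speed of light, m/s
G_EARTH = 9.8       # m/s^2, as defined in the text


def dilated_lifetime(rest_lifetime_s, gamma):
    """Lifetime measured in the laboratory frame for a given Lorentz factor."""
    return gamma * rest_lifetime_s


def proper_acceleration_in_g(gamma, radius_m):
    """Proper acceleration for uniform circular motion, gamma^2 v^2 / r, in g."""
    beta_squared = 1.0 - 1.0 / gamma**2
    return gamma**2 * beta_squared * C**2 / radius_m / G_EARTH


# ASSUMED illustrative parameters, not taken from the text:
GAMMA = 29.3            # stored-muon Lorentz factor
RADIUS_M = 7.0          # storage-ring radius
TAU_REST_S = 2.197e-6   # muon lifetime at rest

print(proper_acceleration_in_g(GAMMA, RADIUS_M))
print(dilated_lifetime(TAU_REST_S, GAMMA))
```

Under these assumptions the acceleration comes out near 10^18 g, consistent with the figure quoted above, and the laboratory lifetime is a few tens of microseconds.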

6. Tests of Relativistic Dynamics

Dynamics is the study of how energy and momentum conservation laws constrain and affect physical interactions. The two predictions of SR in this regard are that massive objects will have a limiting speed of c (the speed of light), and that their "relativistic mass" will increase with their speed. This latter property implies that the newtonian equations for conservation of energy and momentum will be violated by enormous factors for objects with velocities approaching c, and that the corresponding formulas of SR must be used. This has become so obvious in particle experiments that few experiments test the SR equations, and virtually all particle experiments rely upon SR in their analysis. The exceptions are primarily early experiments measuring energy as a function of speed for electrons and protons.
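The limiting speed is easy to see numerically. This minimal sketch compares the speed implied by a given kinetic energy T under SR, where γ = 1 + T/(mc²), with the Newtonian value from T = ½mv². The function names are hypothetical, and the electron rest energy of 0.511 MeV is a standard value supplied here for illustration.

```python
import math


def beta_relativistic(kinetic_energy, rest_energy):
    """Speed v/c from SR: gamma = 1 + T / (m c^2)."""
    gamma = 1.0 + kinetic_energy / rest_energy
    return math.sqrt(1.0 - 1.0 / gamma**2)


def beta_newtonian(kinetic_energy, rest_energy):
    """Speed v/c from the newtonian T = (1/2) m v^2."""
    return math.sqrt(2.0 * kinetic_energy / rest_energy)


ELECTRON_REST_ENERGY_MEV = 0.511  # standard value, not from the text

for t_mev in (0.001, 0.511, 10.0, 1000.0):
    print(t_mev,
          beta_relativistic(t_mev, ELECTRON_REST_ENERGY_MEV),
          beta_newtonian(t_mev, ELECTRON_REST_ENERGY_MEV))
```

The two agree at low energy, but the Newtonian speed exceeds c once T is above half the rest energy, while the SR speed approaches c from below at any energy.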

Note that the nomenclature has changed over the past century, and current literature focusses more on rest mass than relativistic mass because rest mass is an invariant property of an object. In this article, use of the word "mass" means rest mass. See also this FAQ page.

Elastic Scattering

Experiments that Show the Limiting Speed c

Electron Relativistic Mass Variations

In the early 20th century there was an alternative theory by Abraham that is now little known, because these experiments rejected it in favor of SR. A critical review of the experimental evidence concerning the Lorentz model compared to the Abraham model was given in: Farago and Jannossy, Il Nuovo Cim. Vol5, No 6, pg 1411 (1957).

Proton Relativistic Mass Variations

Calorimetric Test of Special Relativity

7. Tests of Length Contraction

At this time there are no direct tests of length contraction, as measuring the length of a moving object to the precision required has not been feasible. There is, however, a demonstration that it occurs:

A current-carrying wire is observed to be electrically neutral in its rest frame, and a nearby charged particle at rest in that frame is unaffected by the current. A nearby charged particle that is moving parallel to the wire, however, is subject to a magnetic force that is related to its speed relative to the wire. If one considers the situation in the rest frame of a charge moving with the drift velocity of the electrons in the wire, the force is purely electrostatic due to the different length contractions of the positive and negative charges in the wire (the former are fixed relative to the wire, while the latter are mobile with drift velocities of a few mm per second). This approach gives the correct quantitative value of the magnetic force in the wire frame. This is discussed in more detail in: Purcell, Electricity and Magnetism. It is rather remarkable that relativistic effects for such a tiny speed explain the enormous magnetic forces we observe.
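The quantitative agreement can be checked numerically. The sketch below is an illustration under stated assumptions, not Purcell's own presentation: an infinitely long straight wire, a test charge moving exactly at the electron drift speed, and the transverse force transforming as F_wire = F_rest/γ. The line charge density, drift speed and distance are made-up illustrative values. Because γ − 1 is about 10^−23 at drift speeds, ordinary floating point cannot resolve it, so the sketch uses Python's decimal module with 60 digits.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # gamma - 1 ~ 1e-23 is invisible to 64-bit floats

C = Decimal(299792458)                         # speed of light, m/s
PI = Decimal("3.14159265358979323846264338327950288419716939937510")
EPS0 = Decimal("8.8541878128e-12")             # vacuum permittivity, F/m
MU0 = 1 / (EPS0 * C * C)                       # defined so mu0 * eps0 * c^2 = 1


def gamma(speed):
    """Lorentz factor for a speed given as a Decimal in m/s."""
    return 1 / (1 - (speed / C) ** 2).sqrt()


def magnetic_force_wire_frame(charge, line_density, drift_speed, distance):
    """Wire frame: F = q v B with B = mu0 I / (2 pi r), I = line_density * drift_speed.

    The test charge is assumed to move at the electron drift speed.
    """
    current = line_density * drift_speed
    b_field = MU0 * current / (2 * PI * distance)
    return charge * drift_speed * b_field


def electric_force_comoving_frame(charge, line_density, drift_speed, distance):
    """Rest frame of the test charge (and of the drifting electrons)."""
    g = gamma(drift_speed)
    ions = line_density * g        # ions now move: contracted spacing, higher density
    electrons = line_density / g   # electrons now at rest: lower density
    net_density = ions - electrons
    e_field = net_density / (2 * PI * EPS0 * distance)
    return charge * e_field


# ASSUMED illustrative values (not from the text)
q = Decimal("1.602176634e-19")   # test charge, C
lam = Decimal("1.0e4")           # mobile charge per unit length, C/m
u = Decimal("1.0e-3")            # drift speed, m/s
r = Decimal("0.01")              # distance from the wire, m

f_wire = magnetic_force_wire_frame(q, lam, u, r)
f_rest = electric_force_comoving_frame(q, lam, u, r)
print(f_wire)
print(f_rest / gamma(u))         # transformed back to the wire frame
```

The two printed values agree to the working precision, which is the sense in which the electrostatic force from unequal length contractions reproduces the magnetic force.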

8. Recent Tests of CPT and Lorentz Invariance

The CPT theorem is a general property of quantum field theories that states (loosely) that any system must behave the same if one applies the CPT transform to it: invert all charges (C, charge conjugation), invert all spatial axes (P, parity inversion), and invert the direction of time (T, time inversion). While one cannot actually do any of that in the real world, one can perform experiments in which particles are replaced by antiparticles (C), one looks at situations in which left and right are interchanged (P), and the particles travel along similar paths but in opposite directions and have opposite spin polarizations (T).

Lorentz Invariance is the technical term for the statement that SR is valid. Any violation of CPT invariance implies a violation of Lorentz invariance; theories without Lorentz invariance need not have CPT invariance.

Cavity Experiments:

Particle-Based Experiments:

Clock-comparison experiments:

Astrophysical tests:

Vacuum Cerenkov radiation:

9. Other Experiments

The Fizeau Experiment

Fizeau measured the speed of light in moving media, most notably moving water. Fresnel proposed a "drag coefficient" that putatively described how strongly a moving material medium "dragged" the aether. SR predicts no aether but does predict that the speed of light in a moving medium differs from the speed in the medium at rest, by an amount consistent (to within experimental resolutions) with these experiments and with the Fresnel drag coefficient.
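A short sketch shows why the SR result matches the Fresnel coefficient. It assumes light travelling parallel to the flow, applies relativistic velocity addition to c/n and the flow speed v, and compares the result with the Fresnel form c/n + v(1 − 1/n²). Dispersion is ignored, and the refractive index and flow speed are illustrative values rather than Fizeau's actual parameters.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def light_speed_in_moving_medium(n, v):
    """Lab-frame speed of light along the flow, from SR velocity addition."""
    u = C / n                       # speed relative to the medium
    return (u + v) / (1.0 + u * v / C**2)


def fresnel_prediction(n, v):
    """Speed predicted using Fresnel's drag coefficient (1 - 1/n^2)."""
    return C / n + v * (1.0 - 1.0 / n**2)


N_WATER = 1.333   # ASSUMED refractive index of water
FLOW = 7.0        # ASSUMED flow speed in m/s

exact = light_speed_in_moving_medium(N_WATER, FLOW)
approx = fresnel_prediction(N_WATER, FLOW)
print(exact - C / N_WATER, approx - C / N_WATER)
```

The two speed changes differ only at relative order v/(nc), far below the resolution of any such experiment.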

The Sagnac Experiment

Sagnac constructed a ring interferometer and measured its fringe shifts as it was rotated. Contrary to some uninformed claims, this experiment can be fully analyzed using SR, and the results are consistent with SR.

The Michelson and Gale Experiment

g−2 Experiments as a Test of Special Relativity

The value g is the gyromagnetic ratio (g-factor) of a particle, and is exactly 2 for a point-like spin-1/2 particle described by the Dirac equation. So g−2 measures the anomalous magnetic moment of the particle, and can be used (via QED) as a test of SR.

The Global Positioning System (GPS)

While not really an experiment, and not really any sort of test of SR, the GPS is an interesting and useful system in which relativity plays an important part. In particular it has become the best and most economical method of highly accurate time transfer around the globe.

Lunar Laser Ranging

Cosmic Microwave Background Radiation (CMBR)

The CMBR is a diffuse and almost isotropic microwave radiation that apparently suffuses all of space. It is generally thought to be a relic of the big bang. While not really a test of SR, CMBR measurements may be of interest to some readers—there is a unique locally inertial frame near Earth in which its dipole moment is zero; this frame moves with speed ~370 km/s relative to the Sun.

The Constancy of Physical Constants

The Neutrality of Molecules

10. Experiments that Apparently are NOT Consistent with SR/GR

It is clear that most if not all of these experiments have difficulties that are unrelated to SR. In some cases the anomalous experiment has been carefully repeated and been shown to be in error (e.g. Miller, Kantor, Munera); in others the experimental result is so outrageous that any serious attempt to reproduce it is unlikely (e.g. Esclangon); in still other cases there are great uncertainties and/or unknowns involved (e.g. Marinov, Silvertooth, Munera, Cahill, Mirabel), and some are based on major conceptual errors (e.g. Marinov, Thimm, Silvertooth). In any case, at present no reproducible and generally accepted experiment is inconsistent with SR, within its domain of applicability. In the case of some anomalous experiments there is an aspect of this being a self-fulfilling prophecy (being inconsistent with SR may be considered to be an indication that the experiment is not acceptable). Note also that few if any standard references or textbooks even mention the possibility that some experiments might be inconsistent with SR, and there are also aspects of publication bias in the literature—many of these papers appear in obscure journals. Many of these papers exhibit various levels of incompetence, which explains their authors' difficulty in being published in mainstream peer-reviewed physics journals; the presence of major peer-reviewed journals here shows it is not impossible for a competently performed anomalous experiment to get published in them.

There is a common thread among most of these experiments: the experimenters make no attempt to measure and quantify the systematic effects that could affect or mimic the signal they claim to observe. And none of them perform a comprehensive error analysis, which is necessary for any experiment to be believable today—especially ones that purport to overturn the foundations of modern physics. For Esclangon and Miller this is understandable, as during their lifetimes the use of error bars and quantitative error analyses was not the norm; the modern authors have no such excuse. In several cases (Esclangon, Miller, Marinov, Torr and Kolen, Cahill) it is possible to perform an error analysis which shows that the experiment is not inconsistent with SR after all.

Another common thread among many of these experiments is the claim of "agreement with Miller's result" (Kantor, Marinov, Silvertooth, Torr and Kolen, Munera, Cahill). Miller was the first to claim to have measured the "absolute motion of the Earth", and his result has achieved a sort of "cult status" among people who doubt the validity of SR. The paper referenced below in the discussion of Miller's results shows conclusively that his result is wrong, and explains why in detail. So claims of "agreement with Miller" generate doubts about the validity of experiments making such claims (how likely is it that a valid result would "agree" with a demonstrably bogus result?).

A key point is: if one is performing an experiment and claiming that it completely overthrows the foundations of modern physics, one must make it bulletproof or it will not be believed or accepted. At a minimum this means that a comprehensive error analysis must be included, direct measurements of important systematic errors must be performed, and whatever "signal" is found must be statistically significant. None of these experiments come anywhere close to making a convincing case that they are valid and refute SR. This is based on a basic and elementary analysis of the experimenters' technique, not on the mere fact that they disagree with the predictions of SR. Most of these experiments are shown to be invalid (or at least not inconsistent with SR) by a simple application of the elementary error analysis or other arguments relating to error bars, showing how important that is to the believability of a result. The authors merely found patterns:

Amateurs look for patterns, professionals look at error bars.

All that being said, I repeat: as of this writing there are no reproducible and generally accepted experiments that are inconsistent with SR, within its domain of applicability.

Elementary Error Analysis of an Average

When multiple measurements of a single quantity are made, their mean provides the best estimate for the actual value of the quantity being measured. But this value is not perfect, and there is still uncertainty in the estimate. A histogram of the original measurements can provide an error estimate for the mean: the best estimate for the error bar on the mean comes from the r.m.s. deviation of the histogram (i.e. its standard deviation σ). If the original values are all statistically independent, and there are N of them, then the best estimate of the error bar on the mean is σ/√N (this comes from the central limit theorem of statistics). That is a lower bound for the error bar on the mean. But if the original measurements are not truly independent, such as when some systematic effect is present, then the error bar on the mean will be larger. For a purely systematic error, the error bar on the mean is σ (independent of N), because one does not know which of the original measurements is correct. This is not necessarily an upper bound on the error bar because additional errors could be present, such as calibration errors of the instrumentation.

It is a fact of arithmetic that when averaging data one will obtain an answer, but the above error analysis is required to know whether or not it is significant. As a rule of thumb, a signal that is 5σ (or more) from zero is difficult or impossible to ignore; a "signal" that is less than 3σ from zero is unconvincing at best. The challenge is usually in determining what the σ actually is; but for an average, σ/√N gives a lower bound that is indisputable.
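The procedure just described can be sketched in a few lines. This is a minimal illustration, assuming statistically independent measurements and using the sample standard deviation as the estimate of σ; the function names and the data values are hypothetical.

```python
import math
import statistics


def mean_with_error(values):
    """Return (mean, sigma, error_on_mean) for independent measurements.

    sigma is the sample standard deviation; sigma / sqrt(N) is the
    lower bound on the error bar of the mean discussed above.
    """
    n = len(values)
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)
    return mean, sigma, sigma / math.sqrt(n)


def significance(values):
    """How many error bars the mean lies from zero."""
    mean, _, error = mean_with_error(values)
    return abs(mean) / error


# Hypothetical readings of a quantity whose true value might be zero
readings = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 0.02, -0.03]
mean, sigma, error = mean_with_error(readings)
print(mean, sigma, error, significance(readings))
```

By the rule of thumb above, a significance below 3 is unconvincing at best, and any systematic effect makes the true error bar larger than the one computed here.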

Usually the averaging of data is unwarranted, and in most cases one can apply an analysis to the entire data sequence: one should normally fit a parameterized theoretical expression to such a data sequence. So if an experiment measures a series of fringe positions as the apparatus is rotated, the theoretical fringe position should be parameterized as a function of orientation, and the parameters fit to the entire measurement sequence. A parameterization of backgrounds and/or systematic errors should be included. Such a fit inherently provides error bars on the resulting parameter values. This is vastly better than averaging the data taken at each orientation and looking for patterns in the averages, for three reasons: such averaging introduces artifacts, averaging cannot distinguish between orientation dependence and systematic drifts, and the fit inherently accounts for correlations in the parameters that averaging ignores. See arXiv:physics/0608238 for examples of both the artifacts introduced by averaging (Section III), and an analysis performed without such averaging (Section IV).
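A fit of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis used in any of the papers cited: the model (an offset, a linear drift in time, and a cos 2θ / sin 2θ orientation dependence) and the names `design_row` and `fit_linear_model` are hypothetical choices made for the example. The model is linear in its parameters, so ordinary least squares applies, and the parameter error bars are taken from the residual scatter.

```python
import math

def design_row(step, theta):
    """Model terms for one reading (assumed form, for illustration only):
    offset, linear drift with measurement number, and a signal that
    repeats twice per turn of the apparatus."""
    return [1.0, float(step), math.cos(2 * theta), math.sin(2 * theta)]

def fit_linear_model(rows, y):
    """Least-squares fit of y[i] ~ sum_j params[j] * rows[i][j].

    Returns (params, errors); errors are 1-sigma parameter error bars
    estimated from the scatter of the residuals.
    """
    n, k = len(rows), len(rows[0])
    if n <= k:
        raise ValueError("need more measurements than parameters")
    # Normal equations (A^T A) p = A^T y, solved by Gauss-Jordan
    # elimination on the augmented matrix [A^T A | A^T y | identity].
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    aug = [ata[i] + [aty[i]] + [1.0 if i == j else 0.0 for j in range(k)]
           for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    params = [aug[i][k] for i in range(k)]
    inverse = [aug[i][k + 1:] for i in range(k)]  # (A^T A)^-1
    residuals = [yi - sum(pj * rj for pj, rj in zip(params, r))
                 for r, yi in zip(rows, y)]
    s2 = sum(e * e for e in residuals) / (n - k)  # residual variance
    errors = [math.sqrt(s2 * inverse[i][i]) for i in range(k)]
    return params, errors
```

Building one row per reading with `design_row(step, theta)` and passing the rows and the fringe readings to `fit_linear_model` yields the drift and the orientation terms simultaneously, each with its own error bar. Because the drift is a separate parameter, a steady drift is not mistaken for an orientation dependence, which is the failure mode of averaging by orientation.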

Error bars have become such an important part of modern experimental physics that it is not uncommon to make multiple measurements of a quantity, or to split one sequence of measurements into multiple smaller sequences, specifically so the error bar on the result can be estimated.

Note that the word "error" here is standard terminology, and is used in the sense of "uncertainty" rather than "mistake". For well-designed experiments, care is taken to minimize the backgrounds and systematic errors, and major systematic errors are measured; then a comprehensive error analysis is performed and used to quantify the resolutions and significance of the results. For most experiments in this section the authors simply did not do this. Prior to 1950 or so that was common and accepted practice; today it is not acceptable at all.

Experimenter's Bias

Experimenter's bias is a phenomenon caused by the inability of human participants in an experiment to remain completely objective, in which the human experimenter directly influences the experiment's outcome based upon his or her personal desires or expectations. It is most commonly a concern in medical and sociological experiments, in which "single-blind" and "double-blind" protocols are usually required. But some physical experiments in which a human observer is required to round measurements off can also be subject to it. In the experiments here, the conditions for this are the combination of a signal smaller than the actual measurement resolution, and an over-averaging of the data used to extract the "signal" from the measurements.

In principle, if a measurement has a resolution of R, then if the experimenter averages N independent measurements the average will have a resolution of R/√N (this is just an application of the error analysis above). This is an important experimental technique used to reduce the impact of randomness on an experiment's outcome. But note that this requires that the measurements be statistically independent, and there are several reasons why that may not be true. If they are not independent, then the average may not actually be a better measurement but may merely reflect the correlations among the individual measurements.
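How much correlation costs can be made concrete with a short Python sketch. It assumes a simple model that is not part of the original text: every pair of measurements shares the same correlation coefficient rho, in which case the variance of the average is R²(1 + (N − 1)·rho)/N. The function name is hypothetical.

```python
import math

def resolution_of_average(r, n, rho=0.0):
    """Resolution of the average of n measurements, each of resolution r.

    Assumption: every pair of measurements has the same correlation
    coefficient rho (0 = independent, 1 = fully correlated).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1 in this sketch")
    return r * math.sqrt((1.0 + (n - 1) * rho) / n)
```

With rho = 0 this reduces to R/√N, and with rho = 1 it stays at R no matter how many measurements are averaged. For any rho > 0 the resolution of the average levels off near R·√rho as N grows, so averaging more data stops helping.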

The most common cause of non-independence is systematic errors (errors affecting all measurements equally, causing the different measurements to be highly correlated, so the average is no better than any single measurement). But another cause can be due to the inability of a human observer to round off measurements in a truly random manner. If an experiment is searching for a sidereal variation of some measurement, and if the measurement is rounded off by a human who knows the sidereal angle of the measurement, and if hundreds of measurements are averaged to extract a "signal" that is smaller than the apparatus's actual resolution, then it should be clear that this "signal" can come from the non-random round-off, and not from the apparatus itself. In such cases a single-blind experimental protocol is required; if the human observer does not know the sidereal angle of the measurements, then even though the round-off is not random, it cannot introduce a spurious sidereal variation.
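The round-off mechanism can be demonstrated with a toy model in Python. Everything in it is hypothetical and chosen for illustration: readings are integers in hundredths of a scale division, the apparatus has no orientation dependence at all (the same readings occur at both orientations), and an observer who knows the orientation lets the expected sign of the effect decide the ambiguous cases.

```python
def round_reading(x, expected_sign=0):
    """Round a reading (hundredths of a division) to a whole division.

    expected_sign = 0 models a blind observer. +1 or -1 models an observer
    who knows which way the hoped-for signal should go and lets that
    decide the ambiguous cases (fraction between 0.40 and 0.60).
    """
    whole, frac = divmod(x, 100)
    if expected_sign != 0 and 40 <= frac <= 60:
        round_up = expected_sign > 0
    else:
        round_up = frac >= 50  # fixed rule, the same at every orientation
    return 100 * (whole + (1 if round_up else 0))

def apparent_signal(readings, blind):
    """Difference of the mean rounded reading between two orientations."""
    means = {}
    for sign in (+1, -1):
        rounded = [round_reading(x, 0 if blind else sign) for x in readings]
        means[sign] = sum(rounded) / len(rounded)
    return means[+1] - means[-1]

# Identical true readings at both orientations: 5.00, 5.05, ..., 5.95.
READINGS = [500 + 5 * j for j in range(20)]
```

In this toy model `apparent_signal(READINGS, blind=False)` is 25.0, a spurious "signal" of a quarter of a division, well below the one-division resolution. `apparent_signal(READINGS, blind=True)` is 0.0, even though the blind observer's round-off rule is not random either.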

Note that modern electronic and/or computerized data acquisition techniques have greatly reduced the likelihood of such bias, but it can still be introduced by a poorly designed analysis technique. Experimenter's bias was not well recognized until the 1950s and 1960s, and then it was primarily in medical experiments and studies. Its effects on experiments in the physical sciences have not always been fully recognized. Several experiments referenced above were clearly affected by it.

Publication Bias

There are two very different aspects of publication bias:

  1. Unpopular or unexpected results may not be published because either the original experimenters or some journal referees have misgivings or reservations about the results, based on the results themselves and not any independent evaluation of experimental procedures or technique.
  2. Expected experimental results may not be published because either the original experimenters or some journal referees do not consider them interesting enough to merit publication.

In both cases the experimental record in the literature does not fully and accurately reflect the actual experiments that have been performed. Both of these effects clearly affect the literature on experimental tests of SR. This second aspect is one reason why this list of experiments is incomplete; there have probably been many hundreds of unpublished experiments that agree with SR.

Note that this does not include papers that are rejected for other reasons, such as: inappropriate subject or style, major internal inconsistencies, or downright incompetence on the part of authors or experimenters. Such rejections are not bias, they are the proper functioning of a peer-reviewed journal.

11. Acknowledgments

My interest in the experimental basis of SR has been piqued by many discussions in the newsgroup sci.physics.relativity about how well SR has or has not been confirmed or refuted. One effect of this is that I have assembled a rather large collection of papers on experimental tests of SR; this FAQ page is in some sense an index to this collection. Most of the descriptions above are my summaries direct from primary sources. In some cases the original paper was unavailable to me and I have relied on secondary sources (primarily the previous version of this FAQ page, and the books by Zhang, by Born, and by von Laue). — Tom Roberts