Rules of Probability in Quantum Mechanics$^{1}$
Leon Cohen$^{2}$
Received October 12, 1987
Abstract
We show that the quantum mechanical rules for manipulating probabilities follow naturally from standard probability theory. We do this by generalizing a result of Khinchin regarding characteristic functions. From standard probability theory we obtain the methods usually associated with quantum theory; that is, the operator method, eigenvalues, the Born rule, and the fact that only the eigenvalues of the operator have nonzero probability. We discuss the general question as to why quantum mechanics seemingly necessitates different methods than standard probability theory and argue that the quantum mechanical method is much richer in its ability to generate a wide variety of probability distributions which are inaccessible by way of standard probability theory.
1. INTRODUCTION
Although quantum mechanics is a probabilistic theory, it differs radically from standard probability theory in the manner in which it calculates and manipulates probabilities and expectation values. The aim of this paper is to consider whether the methods of quantum mechanics follow from standard probability theory and to understand which physical aspects necessitate the unique methods used in quantum mechanics. We will show that standard probability theory does contain the necessary apparatus to naturally include quantum mechanics. We do this by generalizing a theorem due to Khinchin and showing that the probabilistic rules of quantum mechanics follow from the methods of standard probability theory.
In quantum mechanics one generally manipulates state functions and calculates expectation values via the operator method, while in standard
$^{1}$ It is a pleasure to dedicate this paper to David Bohm in honor of his 70th birthday.
$^{2}$ Department of Physics and Astronomy, Hunter College of the City University, 695 Park Avenue, New York, New York 10021. This work is supported in part by The City University Research Award Program.
probability theory one deals directly with the probabilities and calculates expectation values by way of phase space integration. In addition the method of constructing probabilities for other variables is drastically different. For the sake of notation and reference we explicitly contrast the two methods.
Standard Probability Theory$^{3}$
(a) deals directly with the probability, $P(x)$, of the random variable $x$,
(b) calculates the expectation value of an observable represented by a function $\alpha(x)$ by
$$\langle\alpha(x)\rangle=\int \alpha(x)\, P(x)\, dx$$
(c) calculates the probability of another random variable $\alpha$, which is functionally related to $x$,
$$\alpha=\alpha(x)$$
by
$$P(\alpha)=\int \delta(\alpha-\alpha(x))\, P(x)\, dx$$
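To make the contrast concrete, the following is a minimal numerical sketch of rules (b) and (c), not part of the original paper; the exponential density and the map $\alpha(x)=x^{2}$ are arbitrary illustrative choices.

```python
import numpy as np

# Standard probability rules (b) and (c), for the illustrative (assumed)
# choices P(x) = exp(-x) on x >= 0 and observable alpha(x) = x**2.
x = np.linspace(0.0, 50.0, 500_001)
dx = x[1] - x[0]
P = np.exp(-x)

# Rule (b): <alpha(x)> = integral of alpha(x) P(x) dx; exactly 2 here.
print((x**2 * P).sum() * dx)                     # ~2.0

# Rule (c): P(alpha) = integral of delta(alpha - alpha(x)) P(x) dx.
# For the monotone map alpha = x**2 (x >= 0) this evaluates to
# P(sqrt(alpha)) / (2 sqrt(alpha)); we check it against sampled data.
rng = np.random.default_rng(0)
a = rng.exponential(size=1_000_000) ** 2
counts, edges = np.histogram(a, bins=100, range=(0.05, 9.0))
width = edges[1] - edges[0]
estimated = counts / (a.size * width)            # empirical density
mid = 0.5 * (edges[:-1] + edges[1:])
predicted = np.exp(-np.sqrt(mid)) / (2.0 * np.sqrt(mid))
print(np.abs(estimated - predicted).max())       # small, up to sampling noise
```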
Quantum Mechanics$^{(1)}$
(a) deals with state functions, $\psi(x)$, and the probabilities are given by
$$P(x)=|\psi(x)|^{2}$$
(b) obtains expectation values by associating an observable with an operator, $\mathbf{A}$, and calculating
$$\langle\mathbf{A}\rangle=\int \psi^{*}(x)\, \mathbf{A}\, \psi(x)\, dx$$
$^{3}$ Notation: All integrals go from $-\infty$ to $\infty$ unless otherwise stated. Boldface symbols signify operators and generally operate on functions of $x$. We frequently will use $P(x)$ and $P(\alpha)$ to denote the probability distributions of two different random variables $x$ and $\alpha$; properly they should be written as $P_{x}(x)$ and $P_{\alpha}(\alpha)$ to connote two different functions. However, for the sake of simplicity in notation we shall understand that probability distributions with different arguments are not the same functions. Although we shall argue in this paper that the peculiar rules of quantum mechanics do follow from standard probability theory, we shall continue to make the distinction between the two methods and continue to use the phraseology standard probability theory to contrast with the quantum rules. Also, $x$ does not necessarily denote position; in what follows it can be a random variable corresponding to an observable or operator in any representation.
(c) obtains the distribution for another variable by calculating
$$P(\alpha)=|\phi(\alpha)|^{2}$$
where
$$\phi(\alpha)=\int u_{\alpha}^{*}(x)\, \psi(x)\, dx$$
and $u_{\alpha}(x)$ are the eigenfunctions of the operator $\mathbf{A}$,
$$\mathbf{A} u_{\alpha}(x)=\alpha u_{\alpha}(x)$$
Furthermore, it is assumed that the only values which have nonzero probability are the eigenvalues.
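For comparison, here is the same kind of sketch for the quantum rules, again my addition rather than the paper's: the Gaussian wave packet and the choice $\mathbf{A}=(1/i)\,d/dx$ with $\hbar=1$ are assumptions, and an FFT stands in for the projections onto the plane-wave eigenfunctions.

```python
import numpy as np

# Quantum rules (a)-(c) for an assumed example: a Gaussian wave packet and
# the momentum operator A = (1/i) d/dx (hbar = 1).
N, L = 4096, 40.0
dx = L / N
x = (np.arange(N) - N // 2) * dx
psi = np.exp(-x**2 / 2 + 3j * x)                 # carries momentum ~3
psi /= np.sqrt((np.abs(psi)**2).sum() * dx)      # rule (a): P(x) = |psi|^2

# Rule (b): <A> = integral of psi* (1/i) dpsi/dx dx, by central differences.
dpsi = np.gradient(psi, dx)
print(((np.conj(psi) * dpsi / 1j).sum() * dx).real)      # ~3.0

# Rule (c): phi(alpha) = integral of u_alpha*(x) psi(x) dx with plane-wave
# eigenfunctions u_alpha = exp(i alpha x)/sqrt(2 pi); one FFT does all the
# projections, and P(alpha) = |phi(alpha)|^2.
alpha = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(N, d=dx))
phi = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(psi))) * dx / np.sqrt(2 * np.pi)
P_alpha = np.abs(phi)**2
print((alpha * P_alpha).sum() * (alpha[1] - alpha[0]))   # also ~3.0
```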
Since the basic quantities in both methods are expectation values and probabilities, why, then, the dramatic difference in the methods of calculation? Can both methods fit into one scheme? What do we gain by the operator method of quantum mechanics which may not be contained in standard probability theory? These are the questions we shall address.
2. CHARACTERISTIC FUNCTIONS AND STATE FUNCTIONS
(a) Characteristic Functions and a Fundamental Theorem
In standard probability theory the characteristic function is defined as
$$M(\theta)=\int e^{i\theta x}\, P(x)\, dx$$
and it has certain manipulative advantages over the probability distribution itself. For example, it can be used to calculate the moments
$$\left\langle x^{n}\right\rangle=\left.\frac{1}{i^{n}} \frac{\partial^{n} M(\theta)}{\partial \theta^{n}}\right|_{\theta=0}$$
Also, if the moments are known they can be used to construct the distribution, since from the moments we can calculate the characteristic function
$$M(\theta)=\sum_{n=0}^{\infty} \frac{(i\theta)^{n}}{n!}\left\langle x^{n}\right\rangle$$
and invert it to get the distribution$^{4}$
$$P(x)=\frac{1}{2\pi} \int M(\theta)\, e^{-i\theta x}\, d\theta$$
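A short numerical sketch of this round trip (my addition; the Gaussian with mean 1 and variance 4 is an arbitrary example): moments from derivatives of $M(\theta)$ at the origin, and recovery of $P(x)$ by inversion.

```python
import numpy as np

# Characteristic-function round trip for an assumed Gaussian example with
# mean mu = 1 and variance sig**2 = 4.
mu, sig = 1.0, 2.0
M = lambda t: np.exp(1j * mu * t - 0.5 * sig**2 * t**2)

# <x> = (1/i) M'(0) and <x^2> = (1/i)^2 M''(0), by central differences.
h = 1e-5
print(((M(h) - M(-h)) / (2j * h)).real)                  # ~1.0
print(((M(h) - 2 * M(0) + M(-h)) / (1j * h)**2).real)    # ~5.0 = sig^2 + mu^2

# P(x) = (1/2 pi) integral of M(theta) exp(-i theta x) dtheta, on a grid.
theta = np.linspace(-20.0, 20.0, 8001)
dth = theta[1] - theta[0]
x = np.linspace(-8.0, 10.0, 7)
Px = np.array([(M(theta) * np.exp(-1j * theta * xi)).sum() * dth
               for xi in x]).real / (2 * np.pi)
exact = np.exp(-(x - mu)**2 / (2 * sig**2)) / np.sqrt(2 * np.pi * sig**2)
print(np.abs(Px - exact).max())                          # small
```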
In general the characteristic function is a complex function; however, not every complex function is a characteristic function, since it must be the Fourier transform of some probability distribution. There have been many conditions derived to determine when a function is a characteristic function. One very important result is the following:$^{(3,4)}$ a complex-valued function, $M(\theta)$, is a characteristic function if and only if there exists the representation
$$M(\theta)=\int g^{*}(x)\, g(x+\theta)\, dx$$
for some function $g(x)$. If there is such a function it is to be normalized to one,
$$\int|g(x)|^{2}\, dx=1$$
To the best of my knowledge this theorem was first proved by Khinchin$^{(3)}$ some 50 years ago. Although it is a standard result, not much has been done with the interpretation or meaning of the function $g(x)$, and apparently a name has not been associated with these functions. For reasons which will become obvious we shall call them state functions.
What does this have to do with quantum mechanics? If we rewrite Eq. (13) as
$$\begin{aligned} M(\theta) & =\int g^{*}(x)\, e^{\theta(d/dx)}\, g(x)\, dx \\ & =\int g^{*}(x)\, e^{i\theta(1/i)(d/dx)}\, g(x)\, dx \end{aligned}$$
and now take $x$ to be position and $g(x)$ to be the wave function, we recognize the operator in the exponent to be proportional to the momentum operator. Since we are guaranteed that $M(\theta)$ is a proper characteristic function, let us calculate the distribution function related to it. Using Eqs. (12) and (13) we have
$$P(\alpha)=\frac{1}{2\pi} \iint g^{*}(x^{\prime})\, g(x^{\prime}+\theta)\, e^{-i\theta\alpha}\, dx^{\prime}\, d\theta$$
$^{4}$ There are cases where the moments do not determine a unique distribution.$^{(2)}$
and making the change of variables $x=x^{\prime}+\theta$, $dx=d\theta$, we have
$$\begin{aligned} P(\alpha) & =\frac{1}{2\pi} \iint g^{*}(x^{\prime})\, g(x)\, e^{-i(x-x^{\prime})\alpha}\, dx\, dx^{\prime} \\ & =\frac{1}{2\pi}\left|\int g(x)\, e^{-ix\alpha}\, dx\right|^{2} \end{aligned}$$
which is the quantum mechanical momentum distribution, with $\alpha$ being the momentum.
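This chain of identities is easy to verify on a grid. The sketch below is not in the original; the chirped Gaussian $g$ is an arbitrary choice. It computes the autocorrelation form of $M(\theta)$, inverts it, and compares with $(1/2\pi)|\int g(x)\, e^{-ix\alpha}\, dx|^{2}$.

```python
import numpy as np

# Check that the autocorrelation of a normalized g is a characteristic
# function whose distribution is (1/2 pi)|FT of g|^2. The chirped Gaussian
# g is an arbitrary example.
N, L = 2048, 40.0
dx = L / N
x = (np.arange(N) - N // 2) * dx
g = np.exp(-x**2 / 2 + 0.3j * x**2)
g /= np.sqrt((np.abs(g)**2).sum() * dx)

# M(theta) = integral of g*(x) g(x + theta) dx on the grid of shifts.
shifts = np.arange(-N // 2, N // 2)
theta = shifts * dx
M = np.array([(np.conj(g) * np.roll(g, -s)).sum() * dx for s in shifts])

# Invert: P(alpha) = (1/2 pi) integral of M(theta) e^{-i theta alpha} dtheta.
alpha = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(N, d=dx))
P1 = np.array([(M * np.exp(-1j * theta * a)).sum() * dx
               for a in alpha]).real / (2 * np.pi)

# Compare with (1/2 pi)|integral of g(x) e^{-i x alpha} dx|^2 from an FFT.
G = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(g))) * dx
P2 = np.abs(G)**2 / (2 * np.pi)
print(np.abs(P1 - P2).max())     # small; note P1 is also nonnegative
```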
Now, while there is a one-to-one relationship of the probability distribution to the characteristic function, that is not the case with the state functions. There are many state functions that yield the same probability distribution. To find the class of state functions which yields a given probability distribution we have from Eq. (19)
$$\frac{1}{\sqrt{2\pi}} \int g(x)\, e^{-i\alpha x}\, dx=\sqrt{P(\alpha)}\, e^{iS(\alpha)}$$
where $S(\alpha)$ is an arbitrary real function. Inverting, we obtain
$$g(x)=\frac{1}{\sqrt{2\pi}} \int \sqrt{P(\alpha)}\, e^{iS(\alpha)+ix\alpha}\, d\alpha$$
To generate one particular state function from another, we use
$$g_{2}(x)=\frac{1}{2\pi} \iint g_{1}(x^{\prime})\, e^{i[S_{2}(\alpha)-S_{1}(\alpha)+\alpha(x-x^{\prime})]}\, d\alpha\, dx^{\prime}$$
or
$$g_{2}(x)=\int g_{1}(x^{\prime})\, K(x-x^{\prime})\, dx^{\prime}$$
with
$$\begin{aligned} K(x) & =\frac{1}{2\pi} \int e^{iS_{2}(\alpha)-iS_{1}(\alpha)+i\alpha x}\, d\alpha \\ & =\frac{1}{2\pi} \int e^{iS(\alpha)+i\alpha x}\, d\alpha \end{aligned}$$
where we have set $S=S_{2}-S_{1}$, since both $S_{2}$ and $S_{1}$ are arbitrary. Any two $g$'s related by (23) will give the same probability distribution.
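A quick numerical illustration of this nonuniqueness, my addition: replace the Fourier phase of one state by an arbitrary real $S(\alpha)$ and transform back. The distribution of $\alpha$ is untouched while the state itself changes.

```python
import numpy as np

# Two state functions with the same P(alpha): keep |FT of g1| but install an
# arbitrary phase S(alpha) (a cubic here, purely an example choice).
N, L = 2048, 40.0
dx = L / N
x = (np.arange(N) - N // 2) * dx
g1 = np.exp(-x**2 / 2)
g1 /= np.sqrt((np.abs(g1)**2).sum() * dx)

alpha = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(N, d=dx))
G1 = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(g1))) * dx
S = 0.1 * alpha**3                                   # arbitrary real phase
G2 = np.abs(G1) * np.exp(1j * S)
g2 = np.fft.fftshift(np.fft.ifft(np.fft.ifftshift(G2))) / dx

print(np.abs(np.abs(G1)**2 - np.abs(G2)**2).max())   # 0: same P(alpha)
print(np.abs(np.abs(g1)**2 - np.abs(g2)**2).max())   # not 0: different state
```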
The fact that there are an infinite number of state functions for a given probability is of course second nature when dealing in the quantum mechanical context. Perhaps the nonuniqueness is the reason why this aspect of standard probability theory has not been developed further.
However, we will argue that it is this richness which is required to accommodate the variety of possibilities needed when other random variables are discussed.
(b) Generalization of Fundamental Theorem
We now generalize the above theorem: $M(\theta)$ is a characteristic function if and only if for any arbitrary Hermitian $\mathbf{A}$ there exists the representation
$$M(\theta)=\int g^{*}(x)\, e^{i\theta\mathbf{A}}\, g(x)\, dx$$
where again $g(x)$ must be normalized to one. To prove this theorem we follow closely the proof presented by Lukacs$^{(4)}$ for the previous theorem.
We prove sufficiency by showing that a positive distribution is obtained from Eq. (26). Take
$$g(x)=\int u_{\alpha}(x)\, h(\alpha)\, d\alpha, \qquad h(\alpha)=\int u_{\alpha}^{*}(x)\, g(x)\, dx$$
where $u_{\alpha}(x)$ are the eigenfunctions of the operator $\mathbf{A}$ with eigenvalues $\alpha$ and where we assume a continuous spectrum. Substituting into Eq. (26) we have
$$\begin{aligned} M(\theta) & =\iiint u_{\alpha^{\prime}}^{*}(x)\, h^{*}(\alpha^{\prime})\, e^{i\theta\mathbf{A}}\, u_{\alpha}(x)\, h(\alpha)\, dx\, d\alpha^{\prime}\, d\alpha \\ & =\iiint u_{\alpha^{\prime}}^{*}(x)\, h^{*}(\alpha^{\prime})\, e^{i\theta\alpha}\, u_{\alpha}(x)\, h(\alpha)\, dx\, d\alpha^{\prime}\, d\alpha \\ & =\iint h^{*}(\alpha^{\prime})\, e^{i\theta\alpha}\, \delta(\alpha-\alpha^{\prime})\, h(\alpha)\, d\alpha^{\prime}\, d\alpha \\ & =\int|h(\alpha)|^{2}\, e^{i\theta\alpha}\, d\alpha \end{aligned}$$
Therefore, since $M(\theta)$ is the Fourier transform of a positive function, it is a proper characteristic function. The probability is
$$P(\alpha)=|h(\alpha)|^{2}$$
Now suppose we have the probability distribution $P(\alpha)$; then by the definition of $M(\theta)$ we have that
$$M(\theta)=\int e^{i\theta\alpha}\, \sqrt{P(\alpha)}\, \sqrt{P(\alpha)}\, d\alpha$$
Expanding
$$\sqrt{P(\alpha)}=\int u_{\alpha}^{*}(x)\, g(x)\, dx=\int u_{\alpha}(x)\, g^{*}(x)\, dx$$
and substituting into Eq. (33),
$$\begin{aligned} M(\theta) & =\iiint u_{\alpha}^{*}(x^{\prime})\, g(x^{\prime})\, e^{i\theta\alpha}\, u_{\alpha}(x)\, g^{*}(x)\, dx\, dx^{\prime}\, d\alpha \\ & =\iiint u_{\alpha}^{*}(x^{\prime})\, g(x^{\prime})\left[e^{i\theta\mathbf{A}} u_{\alpha}(x)\right] g^{*}(x)\, dx\, dx^{\prime}\, d\alpha \\ & =\iint g(x^{\prime})\left[e^{i\theta\mathbf{A}} \delta(x-x^{\prime})\right] g^{*}(x)\, dx\, dx^{\prime} \\ & =\int g^{*}(x)\, e^{i\theta\mathbf{A}}\, g(x)\, dx \end{aligned}$$
which proves that Eq. (26) is a necessary condition.
As we have done before, we can find the most general $g(x)$ which satisfies Eq. (26) for a given probability distribution. From Eqs. (27) and (32)
$$h(\alpha)=\int u_{\alpha}^{*}(x)\, g(x)\, dx=e^{iS(\alpha)} \sqrt{P(\alpha)}$$
or
$$g(x)=\int \sqrt{P(\alpha)}\, e^{iS(\alpha)}\, u_{\alpha}(x)\, d\alpha$$
Analogous to Eq. (23) we also have
$$g_{2}(x)=\iint g_{1}(x^{\prime})\, e^{i[S_{2}(\alpha)-S_{1}(\alpha)]}\, u_{\alpha}^{*}(x^{\prime})\, u_{\alpha}(x)\, d\alpha\, dx^{\prime}$$
or
$$g_{2}(x)=\int g_{1}(x^{\prime})\, K(x, x^{\prime})\, dx^{\prime}$$
with
$$K(x, x^{\prime})=\int e^{i[S_{2}(\alpha)-S_{1}(\alpha)]}\, u_{\alpha}^{*}(x^{\prime})\, u_{\alpha}(x)\, d\alpha=\int e^{iS(\alpha)}\, u_{\alpha}^{*}(x^{\prime})\, u_{\alpha}(x)\, d\alpha$$
The case discussed in Section (a) is obtained by taking
$$u_{\alpha}(x)=\frac{1}{\sqrt{2\pi}}\, e^{ix\alpha}$$
3. QUANTUM RULES FROM STANDARD PROBABILITY THEORY
We emphasize that everything we have done above is from within standard probability theory. Now, suppose that for a given probability distribution we put its characteristic function in the form given by Eq. (26). Let us see what results if we use the standard methods for calculating expectation values, probabilities, etc.
(a) Expectation Values
Using Eq. (10) and (26) we have
$$\begin{aligned} \langle\mathbf{A}\rangle & =\left.\frac{1}{i} \frac{\partial M(\theta)}{\partial \theta}\right|_{\theta=0} \\ & =\left.\int g^{*}(x)\, \mathbf{A}\, e^{i\theta\mathbf{A}}\, g(x)\, dx\right|_{\theta=0} \\ & =\int g^{*}(x)\, \mathbf{A}\, g(x)\, dx \end{aligned}$$
which is the usual quantum mechanical rule derived totally within standard probability theory.
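In finite dimensions this is a one-line check; the sketch below, not in the original, uses a random Hermitian matrix for $\mathbf{A}$ and a random normalized vector for $g$, both arbitrary.

```python
import numpy as np

# Check <A> = (1/i) dM/dtheta at theta = 0, with M(theta) = g† e^{i theta A} g,
# for a random Hermitian A and random state g (illustrative choices).
rng = np.random.default_rng(1)
n = 6
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (B + B.conj().T) / 2                     # Hermitian
g = rng.normal(size=n) + 1j * rng.normal(size=n)
g /= np.linalg.norm(g)

w, U = np.linalg.eigh(A)                     # A = U diag(w) U†
M = lambda t: g.conj() @ (U @ (np.exp(1j * t * w) * (U.conj().T @ g)))

h = 1e-6
print(((M(h) - M(-h)) / (2j * h)).real)      # (1/i) M'(0)
print((g.conj() @ A @ g).real)               # g† A g: the same number
```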
(b) Measurable Quantities, Eigenvalues, Quantization and the Born Rule
Now we ask for the probability distribution. In a sense we have already done that in the proof in Section 2b. However, let us do it here for the discrete case. The probability distribution is given by Eq. (12) and (26)
$$P(\alpha)=\frac{1}{2\pi} \int M(\theta)\, e^{-i\theta\alpha}\, d\theta=\frac{1}{2\pi} \iint g^{*}(x)\, e^{i\theta\mathbf{A}}\, g(x)\, e^{-i\theta\alpha}\, dx\, d\theta$$
We expand $g(x)$ in terms of the eigenfunctions of $\mathbf{A}$, which we now take to be discrete,
$$g(x)=\sum_{n} c_{n} u_{n}(x), \qquad \mathbf{A} u_{n}(x)=\alpha_{n} u_{n}(x)$$
Substituting in Eq. (47) we have that
$$\begin{aligned} P(\alpha) & =\frac{1}{2\pi} \iint \sum_{n, m} c_{m}^{*}\, u_{m}^{*}(x)\, e^{i\theta\mathbf{A}}\, c_{n} u_{n}(x)\, e^{-i\theta\alpha}\, dx\, d\theta \\ & =\frac{1}{2\pi} \iint \sum_{n, m} c_{m}^{*}\, u_{m}^{*}(x)\, e^{i\theta\alpha_{n}}\, c_{n} u_{n}(x)\, e^{-i\theta\alpha}\, dx\, d\theta \\ & =\sum_{n}\left|c_{n}\right|^{2} \delta(\alpha-\alpha_{n}) \end{aligned}$$
Now what does this say? Clearly the only values which the random variable can take with nonzero probability are the eigenvalues and, furthermore, the probability for a particular one is $|c_{n}|^{2}$. This is precisely the Born rule.
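A finite-dimensional version of this calculation, my addition with a random Hermitian matrix and state as illustrative inputs, exhibits the Born weights $|c_{n}|^{2}$ directly and checks them against the operator moments, i.e., against the derivatives of $M(\theta)=\sum_{n}|c_{n}|^{2} e^{i\theta\alpha_{n}}$.

```python
import numpy as np

# Discrete Born rule: expand g in the eigenvectors of a Hermitian A; the
# weights |c_n|^2 are the only nonzero probabilities, at the eigenvalues.
rng = np.random.default_rng(2)
n = 5
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (B + B.conj().T) / 2
g = rng.normal(size=n) + 1j * rng.normal(size=n)
g /= np.linalg.norm(g)

w, U = np.linalg.eigh(A)              # eigenvalues alpha_n, eigenvectors u_n
c = U.conj().T @ g                    # c_n = <u_n, g>
p = np.abs(c)**2
print(p, p.sum())                     # Born weights, summing to 1

# Moments of the discrete distribution agree with g† A^k g.
for k in range(1, 4):
    print((p * w**k).sum(), (g.conj() @ np.linalg.matrix_power(A, k) @ g).real)
```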
4. NECESSITY FOR QUANTUM RULES
We have shown that any continuous probability distribution can be written in terms of an infinite number of $g$'s such that the usual quantum rules can be used to calculate expectation values. That is, for any probability distribution we can use Eq. (5) to calculate its expectation values rather than the standard way, Eq. (1). In addition, if we have a discrete set of events we can also find a $g$ (nonunique) from which expectation values can be calculated in the same manner as in the continuous case. Thus both the discrete and continuous cases can be put into the same formalism, and the discrete or continuous aspect of the random variable is carried in the operator and not in the probability or the $g$'s associated with the probability.
In addition, suppose we have a situation described by the probability distribution $P(x)$ of a random variable $x$, and suppose there is another physical quantity, $y$, with probability $P(y)$. In standard probability theory we would have to assume that there is a functional relation between $x$ and $y$
$$y=y(x)$$
and the two probabilities would be related by
$$P(y)\, dy=P(x)\, dx$$
If we know the distributions but do not know the functional relation between $x$ and $y$ we could obtain it by solving
$$\frac{dy}{dx}=\frac{P(x)}{P(y)}$$
What is the functional relation between $y$ and $x$ if we take
$$P(y)=\delta(y-y_{0})$$
where $y_{0}$ is a constant? Clearly there can not be any solution to
$$\delta(y-y_{0})\, dy=P(x)\, dx$$
if $P(x)$ is smooth. That is, by use of standard probability theory we can not have eigenstates. Similar considerations hold if we have an infinite number of discrete values, that is, if the probability distribution of $y$ is
$$P(y)=\sum_{i} p_{i}\, \delta(y-y_{i})$$
Again, no functional relationship exists. In standard probability theory, if a random variable is continuous and has a smooth distribution function, then a functional relationship cannot give a sharp or discrete distribution for another variable. However, there is absolutely no difficulty in obtaining a discrete distribution with the rules of quantum mechanics. Both discrete and continuous can be related to the same function, namely the $g$'s. From the original continuous distribution we first solve for the set of $g$'s and then ask which of these will produce the particular discrete distribution. Hence, the operator method is considerably richer in its ability to produce a wide variety of distributions starting from a particular one.
There is a further stumbling block in standard probability theory which needs expansion and which the quantum formulation allows. Suppose we have two physical quantities and for a certain physical arrangement, $A$, we have two probability distributions $P_{x}^{A}(x)$ and $P_{y}^{A}(y)$. By solving Eq. (56) we can obtain the functional relation between $x$ and $y$. Now suppose we have a different situation, $B$, and have different probability distributions for the same two physical quantities, $P_{x}^{B}(x)$ and $P_{y}^{B}(y)$, and again solve Eq. (56). In general, the two solutions will not be the same. Indeed, if we take $P_{x}^{A}(x)=P_{x}^{B}(x)$, then in standard probability theory we are forced to take $P_{y}^{A}(y)=P_{y}^{B}(y)$ if the relationship between $x$ and $y$ is a fixed functional one. That is, if we want total freedom with the probabilities we are forced into a relationship between $x$ and $y$ which is not universal but depends on the situation. However, in the formulation in terms of state functions, where the relationship between observables is a fixed operator relation, we have the freedom to have a variety of distributions for the observable $y$ for different physical situations even though the distribution for variable $x$ is the same. The reason for this is that even though $P_{x}^{A}(x)=P_{x}^{B}(x)$, the $g$'s for the two situations do not have to be the same and hence they can generate different
distributions for other random variables. That is, if $P_{x}^{A}(x)=P_{x}^{B}(x)$ then we do not have to equate $g_{x}^{A}(x)$ to $g_{x}^{B}(x)$, and hence we are not forced into equating $P_{y}^{A}(y)$ to $P_{y}^{B}(y)$.
5. TWO DIMENSIONS
We now generalize to two dimensions. The theorem of Section (2b) generalizes directly: $M(\theta, \tau)$ is a two-dimensional characteristic function if and only if there exists a function $g(x, y)$ such that, for two arbitrary commuting operators $\mathbf{A}$ and $\mathbf{B}$,
$$M(\theta, \tau)=\iint g^{*}(x, y)\, e^{i\theta\mathbf{A}+i\tau\mathbf{B}}\, g(x, y)\, dx\, dy$$
where we normalize $g$,
$$\iint|g(x, y)|^{2}\, dx\, dy=1$$
The fact that $\mathbf{A}$ and $\mathbf{B}$ have to commute is essential. A special case is
$$\mathbf{A}=\frac{1}{i} \frac{\partial}{\partial x}, \qquad \mathbf{B}=\frac{1}{i} \frac{\partial}{\partial y}$$
which gives
$$M(\theta, \tau)=\iint g^{*}(x, y)\, g(x+\theta, y+\tau)\, dx\, dy$$
To obtain the distribution associated with the above characteristic function, we expand $g(x, y)$ in terms of the eigenfunctions of $\mathbf{A}$ and $\mathbf{B}$
$$\mathbf{A} u_{\alpha}(x)=\alpha u_{\alpha}(x), \qquad \mathbf{B} v_{\beta}(y)=\beta v_{\beta}(y)$$
$$g(x, y)=\iint G(\alpha, \beta)\, u_{\alpha}(x)\, v_{\beta}(y)\, d\alpha\, d\beta$$
Substituting in Eq. (57) we get
$$M(\theta, \tau)=\iint|G(\alpha, \beta)|^{2}\, e^{i\alpha\theta+i\beta\tau}\, d\alpha\, d\beta$$
which shows that the distribution is
$$P(\alpha, \beta)=|G(\alpha, \beta)|^{2}$$
where
$$G(\alpha, \beta)=\iint g(x, y)\, u_{\alpha}^{*}(x)\, v_{\beta}^{*}(y)\, dx\, dy$$
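As a numerical sketch of the two-dimensional case (again my addition; the correlated Gaussian $g(x, y)$ is an arbitrary example), a two-dimensional FFT carries out the projection onto the joint plane-wave eigenfunctions of the special case above.

```python
import numpy as np

# Joint distribution P(alpha, beta) = |G(alpha, beta)|^2 for the commuting
# pair A = (1/i) d/dx, B = (1/i) d/dy, using an arbitrary example g(x, y).
N, L = 256, 20.0
d = L / N
x = (np.arange(N) - N // 2) * d
X, Y = np.meshgrid(x, x, indexing="ij")
g = np.exp(-(X**2 + Y**2) / 2 + 1j * X * Y / 4)   # correlated example state
g /= np.sqrt((np.abs(g)**2).sum() * d * d)

# G(alpha, beta) = double integral of g(x,y) u_alpha*(x) v_beta*(y) dx dy
# with plane-wave eigenfunctions; a 2-D FFT evaluates it on the whole grid.
G = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(g))) * d * d / (2 * np.pi)
P = np.abs(G)**2
k = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(N, d=d))
dk = k[1] - k[0]
print(P.sum() * dk * dk)   # ~1: a proper joint distribution for (alpha, beta)
```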
6. JOINT DISTRIBUTIONS FOR GIVEN MARGINALS
We now ask how one can relate two-dimensional joint distributions to the individual marginals and, in particular, how to find joint distributions which have prescribed one-dimensional marginals. It is generally accepted in quantum mechanics that this can not be done for noncommuting observables because of the uncertainty principle; we shall show otherwise. Let us first consider whether this problem can be approached from the general formulation given in terms of state functions. Suppose we want to get a joint distribution which satisfies the given marginals,
$$\int P(x, y)\, dy=P_{1}(x), \qquad \int P(x, y)\, dx=P_{2}(y)$$
Equivalently we can ask for the characteristic function $M(\theta, \tau)$ which satisfies
$$\begin{aligned} & M(\theta, 0)=M_{1}(\theta)=\int e^{i\theta x}\, P_{1}(x)\, dx \\ & M(0, \tau)=M_{2}(\tau)=\int e^{i\tau y}\, P_{2}(y)\, dy \end{aligned}$$
Now it is easy to construct such functions, for example
$$M(\theta, \tau)=M_{1}(\theta)\, M_{2}(\tau)\, f(\theta, \tau)$$
where $f(\theta, \tau)$ is any function such that
$$f(\theta, 0)=f(0, \tau)=1$$
In general $M$ will not be a proper characteristic function and hence the distribution will not be positive. However, the problem can be formulated in terms of state functions, which will guarantee a proper characteristic function. In particular, if we found a function $g(x, y)$ which satisfies
$$\begin{aligned} & \iint g^{*}(x, y)\, g(x+\theta, y)\, dx\, dy=M_{1}(\theta)=\int g_{1}^{*}(x)\, g_{1}(x+\theta)\, dx \\ & \iint g^{*}(x, y)\, g(x, y+\tau)\, dx\, dy=M_{2}(\tau)=\int g_{2}^{*}(y)\, g_{2}(y+\tau)\, dy \end{aligned}$$
for a given $g_{1}(x)$ and $g_{2}(y)$, then we would be assured that the probability distribution associated with it is proper and has the right marginal distributions. The distribution would be given by the Fourier transform of Eq. (63). Since our main point here is to just show that positive joint
distributions always exist we will explain the construction of them directly. We emphasize that xx and yy can be any random variables, and whether they are associated with commuting or noncommuting operators is immaterial. In the particular case of position and momentum we would take
$$P_{1}=|\psi(q)|^{2}, \qquad P_{2}=|\phi(p)|^{2}$$
where $\psi(q)$ and $\phi(p)$ are the position and momentum wave functions, respectively.
An infinite number of joint distributions satisfying the marginal conditions can be generated as follows.$^{(5)}$ Choose any positive function $\Omega(u, v)$ of the two variables $u, v$ such that$^{5}$
$$\int_{0}^{1} \Omega(u, v)\, dv=1, \qquad \int_{0}^{1} \Omega(u, v)\, du=1$$
and construct$^{6}$
$$P(x, y)=P_{1}(x)\, P_{2}(y)\, \Omega(u, v)$$
where for $u$ and $v$ we substitute
$$u(x)=\int_{-\infty}^{x} P_{1}(x^{\prime})\, dx^{\prime}, \qquad v(y)=\int_{-\infty}^{y} P_{2}(y^{\prime})\, dy^{\prime}$$
To show that the marginals are satisfied we integrate with respect to $y$
$$\begin{aligned} \int_{-\infty}^{\infty} P(x, y)\, dy & =P_{1}(x) \int_{-\infty}^{\infty} P_{2}(y)\, \Omega(u, v)\, dy \\ & =P_{1}(x) \int_{0}^{1} \Omega(u, v)\, dv=P_{1}(x) \end{aligned}$$
$^{5}$ Functions satisfying Eq. (77) can readily be constructed. Choose any function $h(u, v)$ and normalize it so that
$$\int_{0}^{1} \int_{0}^{1} h(u, v)\, du\, dv=1$$
and define
$$h_{1}(u)=\int_{0}^{1} h(u, v)\, dv, \qquad h_{2}(v)=\int_{0}^{1} h(u, v)\, du$$
Then, $\Omega(u, v)$ defined by
$$\Omega(u, v)=h(u, v)-h_{1}(u)\, h_{2}(v)+1$$
or by
$$\Omega(u, v)=h(u, v)-h_{1}(u)-h_{2}(v)+2$$
satisfies Eq. (77).
$^{6}$ There are a number of different ways to write$^{(5,8)}$ Eq. (78).
The last step follows since $dv=P_{2}(y)\, dy$. Similarly for integration with respect to $x$. This procedure for constructing joint distributions is a generalization of previously known specific cases.$^{(6)}$ Generalization to the multidimensional case has been given,$^{(7)}$ and it has been shown that the method yields all possible distributions.$^{(8,9)}$ We note that $\Omega(u, v)$ may be a functional of the marginals directly, that is, not only through $u$ and $v$. In addition we point out that if we start with a correlated function $P(x, y)$, additional ones can be constructed from
$$P_{\text{new}}(x, y)=k P(x, y)+(1-k)\, P_{1}(x)\, P_{2}(y)\, \Omega(u, v)$$
where $k$ is any constant between 0 and 1. Hence, for any two marginal distributions, whether they refer to commuting operators or not, one can always find an infinite number of positive joint distributions.
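The construction is straightforward to carry out numerically. In the sketch below, which is my addition, the marginals (a Gaussian $P_{1}$ and a Laplace $P_{2}$) and the positive function $h$ are arbitrary illustrative choices; $\Omega$ is built by the first recipe of the footnote.

```python
import numpy as np

# A positive joint distribution with prescribed marginals, built from
# Omega(u, v) = h(u, v) - h1(u) h2(v) + 1. Marginals and h are examples.
N = 2001
x = np.linspace(-8.0, 8.0, N); dx = x[1] - x[0]
y = np.linspace(-8.0, 8.0, N); dy = y[1] - y[0]
P1 = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)       # Gaussian marginal
P2 = 0.5 * np.exp(-np.abs(y))                     # Laplace marginal

u = np.cumsum(P1) * dx                            # u(x): cumulative of P1
v = np.cumsum(P2) * dy                            # v(y): cumulative of P2

# For this h, h1(u) = h2(v) = 1, so Omega = h; h >= 0.2 > 0 and its u- and
# v-integrals over [0, 1] are both 1, as required.
U, V = np.meshgrid(u, v, indexing="ij")
Omega = 1 + 0.8 * np.cos(2 * np.pi * U) * np.cos(2 * np.pi * V)

P = P1[:, None] * P2[None, :] * Omega             # joint distribution
print(P.min() >= 0)                               # True: genuinely positive
print(np.abs(P.sum(axis=1) * dy - P1).max())      # x-marginal recovered
print(np.abs(P.sum(axis=0) * dx - P2).max())      # y-marginal recovered
```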
It has been generally assumed that proper joint distributions for noncommuting operators can not exist in quantum mechanics. This has led to the view that we have to settle for joint distributions which are not positive. Many such quasidistributions have been used with considerable profit although they can not be interpreted as true probabilities. For the sake of contrast we briefly review them. The first such distribution proposed was that of Wigner$^{(10)}$
$$W(q, p)=\frac{1}{2\pi} \int \psi^{*}\left(q-\tfrac{1}{2}\tau\hbar\right) e^{-i\tau p}\, \psi\left(q+\tfrac{1}{2}\tau\hbar\right) d\tau$$
and has been generalized to arbitrary operators.$^{(11,12)}$ There are an infinite number of such quasidistributions, any one of which can be used to formulate quantum mechanics in a classical-type phase space. They can be generated from$^{(13)}$
$$P(q, p)=\frac{1}{4\pi^{2}} \iiint e^{-i\theta q-i\tau p+i\theta u}\, f(\theta, \tau)\, \psi^{*}\left(u-\tfrac{1}{2}\tau\hbar\right) \psi\left(u+\tfrac{1}{2}\tau\hbar\right) du\, d\tau\, d\theta$$
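For the Wigner special case, $f=1$, this is easy to evaluate on a grid. The following sketch is not from the paper; the two-Gaussian superposition with $\hbar=1$ is an arbitrary example, and it shows the characteristic failure: the distribution stays normalized but goes negative.

```python
import numpy as np

# Wigner quasidistribution of a superposition of two Gaussian packets
# (hbar = 1); the interference region makes W negative, so W is not a true
# probability density even though it is normalized.
N, L = 512, 30.0
dq = L / N
q = (np.arange(N) - N // 2) * dq
psi = np.exp(-(q - 3)**2 / 2) + np.exp(-(q + 3)**2 / 2)   # real example state
psi /= np.sqrt((psi**2).sum() * dq)

tau = (np.arange(N) - N // 2) * dq
W = np.empty((N, N))
for i, qi in enumerate(q):
    # corr(tau) = psi*(q - tau/2) psi(q + tau/2); off-grid points by interp.
    corr = np.interp(qi - tau / 2, q, psi) * np.interp(qi + tau / 2, q, psi)
    W[i] = (np.fft.fftshift(np.fft.fft(np.fft.ifftshift(corr))) * dq).real / (2 * np.pi)

p = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(N, d=dq))
print(W.min())                        # negative: a quasidistribution
print(W.sum() * dq * (p[1] - p[0]))   # ~1: still normalized
```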
To reproduce the quantum expectation values by way of phase space integration,
$$\int \psi^{*}(q)\, G(\mathbf{Q}, \mathbf{P})\, \psi(q)\, dq=\iint g(q, p)\, P(q, p)\, dq\, dp$$
we must have correspondence between quantum operators and classical functions. The relation between the two must be
$$G(\mathbf{Q}, \mathbf{P})=\iint \gamma(\theta, \tau)\, f(\theta, \tau)\, e^{i\theta\mathbf{Q}+i\tau\mathbf{P}}\, d\theta\, d\tau$$
where
$$\gamma(\theta, \tau)=\frac{1}{4\pi^{2}} \iint g(q, p)\, e^{-i\theta q-i\tau p}\, dq\, dp$$
or equivalently
$$G(\mathbf{Q}, \mathbf{P})=\frac{1}{4\pi^{2}} \iiiint g(q, p)\, f(\theta, \tau)\, e^{i\theta(\mathbf{Q}-q)+i\tau(\mathbf{P}-p)}\, d\theta\, d\tau\, dq\, dp$$
The unique feature of these quasidistributions is that they are bilinear in the state function. It is well known that such distributions can never be positive. This was always considered to be a reflection of the special properties of quantum mechanics and the uncertainty principle. This is not the case, since positive distributions do exist, and from the general form we have given in Section 6 we see that there is no reason for a joint distribution to be bilinear. In fact the general functional dependence of a joint distribution on the state function is not bilinear. Even for the simple case of the uncorrelated joint distribution the dependence is fourth order in the state function, since each marginal is bilinear in the state function.$^{(14)}$
As previously pointed out, there are an infinite number of joint distributions for given marginals. In quantum mechanics there is a further freedom in that the probabilities of position and momentum do not uniquely determine the wave function. This was pointed out by Reichenbach,$^{(15)}$ who attributes to Bargmann a method for constructing different wave functions which have the same absolute square of position and momentum. Vogt$^{(16)}$ and Altes$^{(17)}$ also give methods of constructing such functions. Consider a wave function $\psi_{1}(q)$ whose complex conjugate is given by
$$\psi_{1}^{*}(q)=\psi_{1}(-q)\, e^{i\eta(q)}$$
where $\eta(q)$ is some real function. Now define another wave function, $\psi_{2}(q)$, by
$$\psi_{2}(q)=\psi_{1}(q)\, e^{i\eta(q)}$$
It is straightforward to show that the momentum wave functions are related by
$$\phi_{2}(p)=\phi_{1}^{*}(p)$$
and hence $|\psi_{1}(q)|^{2}=|\psi_{2}(q)|^{2}$ and $|\phi_{1}(p)|^{2}=|\phi_{2}(p)|^{2}$. I am not aware of any general criteria for characterizing those wave functions which are uniquely determined by their probability distributions of position and momentum. For the method described above to construct proper joint distributions, uniqueness may be achieved by making $\Omega(u, v)$ a functional of the wave function.
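The construction is simple to verify numerically. In the sketch below, my addition, the even amplitude and the phase $\chi$ are arbitrary; for a state $\psi_{1}=A e^{i\chi}$ with $A$ even, the condition above holds with $\eta(q)=-[\chi(q)+\chi(-q)]$.

```python
import numpy as np

# Two wave functions, not proportional to each other, with identical
# |psi|^2 and |phi|^2. Amplitude and phase below are arbitrary examples
# satisfying psi1*(q) = psi1(-q) exp(i eta(q)) with eta = -(chi(q)+chi(-q)).
N, L = 4096, 40.0
dq = L / N
q = (np.arange(N) - N // 2) * dq
amp = np.exp(-q**2 / 2)                     # even amplitude
chi = q**2 + 0.5 * q**3                     # arbitrary phase
psi1 = amp * np.exp(1j * chi)
psi1 /= np.sqrt((np.abs(psi1)**2).sum() * dq)
eta = -2 * q**2                             # -(chi(q) + chi(-q)) for this chi
psi2 = psi1 * np.exp(1j * eta)

ft = lambda f: np.fft.fftshift(np.fft.fft(np.fft.ifftshift(f))) * dq
phi1, phi2 = ft(psi1), ft(psi2)
print(np.abs(np.abs(psi1)**2 - np.abs(psi2)**2).max())   # 0
print(np.abs(np.abs(phi1)**2 - np.abs(phi2)**2).max())   # ~0
print(np.abs((np.conj(psi1) * psi2).sum() * dq))         # < 1: distinct states
```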
7. CONCLUSION
We have shown that the seemingly peculiar rules for manipulating probabilities in quantum mechanics are derivable from standard probability theory. Specifically, we have shown that starting with any probability distribution, we can write its characteristic function in the form given by Eq. (26), that is, in terms of operators and what we have called state functions. Once that is done, the manipulative rules of quantum mechanics follow from the usual rules of probability theory. In the state function method there is not a unique relationship between the probability and the state function. This allows us, starting from a particular distribution, to generate distributions for other variables which are not accessible by way of the standard method.
REFERENCES
1. D. Bohm, Quantum Theory (Prentice-Hall, New York, 1951).
2. W. Feller, An Introduction to Probability Theory and its Applications, Vol. 2 (Wiley, New York, 1971).
3. A. Khinchin, Bull. Univ. Moscow 1 (1937).
4. E. Lukacs, Characteristic Functions (Charles Griffin, London, 1970).
5. L. Cohen and Y. I. Zaparovanny, J. Math. Phys. 21, 794 (1980); L. Cohen, in Frontiers of Nonequilibrium Statistical Physics, G. T. Moore and M. O. Scully, eds. (Plenum, New York, 1986).
6. See, for example, the following papers and the references therein: D. Morgenstern, Mitt. Math. Stat. 8, 234 (1956); E. J. Gumbel, Rev. Fac. Ci. Univ. Lisboa Ser. 2A Cl. Mat. 7, 179 (1959); D. J. G. Farlie, Biometrika 50, 499 (1963); J. S. Huang and S. Kotz, Biometrika 71, 633 (1981); S. Kotz and N. Johnson, C.R. Acad. Sc. Paris A285, 277 (1977); B. Schweizer and A. Sklar, Probabilistic Metric Spaces (North-Holland, New York, 1983); S. Cambanis, J. Multivariate Anal. 7, 551 (1977).
7. L. Cohen, J. Math. Phys. 25, 2402 (1984).
8. P. D. Finch and R. Groblicki, Found. Phys. 14, 549 (1984).
9. B. Schweizer and A. Sklar, Found. Phys. 16, 1061 (1986).
10. E. Wigner, Phys. Rev. 40, 749 (1932).
11. A. O. Barut, Phys. Rev. 108, 656 (1957).
12. M. Scully and L. Cohen, Found. Phys. 16, 295 (1986); in The Physics of Phase Space, Y. S. Kim and W. W. Zachary, eds. (Springer, New York, 1987).
13. L. Cohen, J. Math. Phys. 7, 781 (1966); 17, 1863 (1976).
14. L. Cohen, in New Techniques and Ideas in Quantum Measurement Theory, D. Greenberger, ed. (New York Academy of Sciences, New York, 1986).
15. H. Reichenbach, Philosophical Foundations of Quantum Mechanics (University of California Press, Berkeley, 1944).
16. A. Vogt, in Mathematical Foundations of Quantum Theory, A. R. Marlow, ed. (Academic Press, New York, 1978).
17. R. Altes, personal communication (Technical Memorandum 322, Orincon, April 1984).