Axioms 2015, 4, 1-29; doi:10.3390/axioms4010001
OPEN ACCESS
axioms
ISSN 2075-1680
www.mdpi.com/journal/axioms
Article
Positive-Operator Valued Measure (POVM) Quantization
Jean Pierre Gazeau 1,2, * and Barbara Heller 3
1 Astroparticules et Cosmologie (APC, UMR 7164), Université Paris 7-Paris Diderot, Sorbonne Paris
Cité, 75205 Paris, France
2 Centro Brasileiro de Pesquisas Físicas, 22290-180 - Rio de Janeiro, RJ, Brazil
3 Department of Applied Mathematics, Illinois Institute of Technology, Chicago, IL 60616, USA;
E-Mail:
[email protected]
* Authors to whom correspondence should be addressed; E-Mail:
[email protected];
Tel.: +331-5727-6049; Fax: +331-5727-6071.
Academic Editor: James D. Malley
Received: 3 September 2014 / Accepted: 18 December 2014 / Published: 25 December 2014
Abstract: We present a general formalism for giving a measure space paired with a separable
Hilbert space a quantum version based on a normalized positive operator-valued measure.
The latter is built from families of density operators labeled by points of the measure
space. We especially focus on various probabilistic aspects of these constructions. Simple or
more elaborate examples illustrate the procedure: circle, two-sphere, plane and half-plane.
Links with Positive-Operator Valued Measure (POVM) quantum measurement and quantum
statistical inference are sketched.
Keywords: POVM; quantization; covariance; density operators; quantum measurement
1. Introduction
In this paper, we propose a quantum analysis, generally non-commutative, of a measure
space based on a (normalized) positive-operator valued measure ((N)POVM) (in order not to
spoil the text with too many acronyms, we will keep “POVM” in our paper to designate a
normalized positive operator-valued measure) built from a density matrix or operator (in the
quantum mechanics terminology) acting on some separable Hilbert space. One key aspect of
the procedure is its probabilistic nature. Moreover, beyond the common mathematical language,
our approach has or might have some deep connection with quantum measurement based on
POVM [1], quantum probability (see, for instance, [2] and the references therein) or quantum statistical
inference (see, for instance, [3] and the references therein). In this respect, we recommend the clear and
concise introduction to the mathematics of quantum physics by Kuperberg [4].
Our work lies in the continuation of recent ones concerning what we named integral
quantization [5–9] and leading to applications shedding new light on the still problematic question of
the relation between classical and quantum worlds. The so-called coherent state (CS), or Berezin, or
Klauder, or anti-Wick, or Toeplitz quantizations are particular cases of those integral quantizations of
various measure sets.
Our conception of quantization rests upon a trivial observation. We notice that the formalism
of classical physics rests upon highly abstract mathematical models, mainly since the invention of
infinitesimal calculus, giving us the impression that improbable objects, like material phase space points,
are accessible to measurements. It is true that, to an excellent approximation, most physical
phenomena at our scale can be efficiently apprehended in that way. On the other hand, reasonably
realistic scientists know that such continuous models are highly idealistic and should be viewed as
such, whatever their powerful predictive qualities. Above all, we know that any attempt to maintain
our “classical” models together with our classical reading of them is not experimentally sustainable over
a wide range of phenomena. A quantization in a certain sense of our mathematical classical model
(Bohr–Sommerfeld, canonical Dirac, Feynman path integral, geometry, deformation, CS, etc. [10]) is
needed to account for observations and predictability. Usually, physicists or mathematicians have in
mind as a classical structure a phase space or symplectic one that matches Hamiltonian formalism. In
our mind, this represents a quite constraining restriction. With our approach, classical mathematical
models with minimal structure (like a measure) might also be amenable to their quantized versions in
our sense.
Now, we should answer the natural question “What is POVM quantization for?”. In quantum physics,
the answer is natural and experimentally justified. Some illuminating examples are given in our previous
works [5,9], where it has been shown that there is a world of quantizations leading to equivalent results
from a physical point of view [11]. Starting from general models, not necessarily endowed with some
physical flavor, it is interesting to provide a class of non-commutative, “fuzzy”, versions of them based
on normalized POVM and resultant classical probability distributions. The method can be particularly
relevant when we have to cope with geometries presenting singularities or with subsets of manifolds
determined by constraints [12].
In Section 2, we recall the minimal requirements that any quantization procedure should obey. A
normalized positive operator-valued measure associated with the triple measure space, Hilbert space
and density operator is presented in Section 3. The probabilistic content of the formalism is developed
in Section 4. In Section 5, we reverse the approach by asking whether quantum formalism can be
directly produced from classical probability theory. In Section 6, we examine the particular case where
density operators are rank one, i.e., coherent state projectors. This allows a better understanding of the
material introduced in the three previous sections. With Section 7, we enter the heart of the subject by
explaining in which manner POVM quantization transforms a classical object, function or distribution
into a linear operator in the companion Hilbert space. In Section 8, semi-classical aspects through lower
symbols are examined. Covariant POVM quantization based on unitary irreducible representations and
the relevant Schur’s lemma are described in Section 9. Then, we proceed with more or less elementary
illustrations of the method: unit circle (Section 10), unit two-sphere (Section 11), plane (Section 12) and,
finally, half-plane (Section 13). Some lines for future works and views about the links with quantum
measurements and statistical inference are discussed in Section 14. Some necessary material is given in
the two appendices.
2. Quantization: The Basics
First, on a minimal level, we understand the quantization of a set X and functions on it as a procedure
fulfilling three requirements: linearity, existence of identity and self-adjointness. More precisely,
quantization is:
(1) A linear map:
Q : C(X) → A(H) (1)
where C(X) is a vector space of complex-valued functions f (x) on a set X and A(H) is a vector
space (“vector space” is in a loose sense, since the linear superposition of two operators could have
a domain reduced to {0} in infinite-dimensional Hilbert space!) of linear operators:
Q(f ) ≡ Af (2)
in some complex Hilbert space H, such that:
(2) f = 1 is mapped to the identity operator I on H;
(3) A real function f is mapped to an (essentially) self-adjoint or, at least, symmetric
operator Af in H.
In a physical or a signal analysis context, one needs to add structure to X, such as a topology, a
manifold structure, a closure under algebraic operations, etc. Besides, one also has the freedom to
interpret the spectra of classical f ∈ C(X) or quantum Af ∈ A(H), so that they can be chosen
as observables (in the terminology used in physics). Finally, one may add the requirement of an
unambiguous classical limit of the quantum quantities, the limit operation being associated with a change
of scale.
3. POVM for a Measure Space
As announced in the Introduction, we start from a minimal set of objects:
(i) a measure space (X, B, ν) (or (X, ν) for short), where B is the σ-algebra of ν-measurable subsets,
(ii) a separable Hilbert space H,
(iii) an X-labeled family of positive semi-definite and unit trace operators (“density matrices or
operators”) on H,
\[ X \ni x \mapsto \rho(x) \in \mathcal{L}(\mathcal{H}), \quad \rho(x) \ge 0, \quad \mathrm{tr}(\rho(x)) = 1 \tag{3} \]
and resolving the identity I on H,
\[ \int_X \rho(x)\, d\nu(x) = I, \quad \text{in a weak sense.} \tag{4} \]
If X is equipped with a suitable topology, then the normalized positive operator-valued measure
(POVM) mρ on the corresponding σ-algebra Bρ (X) of Borel sets is defined through the following map:
\[ \mathcal{B}(X) \ni \Delta \mapsto m_\rho(\Delta) = \int_\Delta \rho(x)\, d\nu(x) \tag{5} \]
4. Probabilistic Density on Measure Space from POVM
There is a straightforward consequence of the identity Equation (4) in terms of probability distribution
on the original measure space (X, ν). Given $x_0 \in X$ and applying the corresponding density operator
$\rho(x_0)$ on each side of Equation (4) leads to:
\[ \rho(x_0) \int_X \rho(x)\, d\nu(x) = \rho(x_0) \tag{6} \]
Taking now the trace on each side gives:
\[ \int_X \mathrm{tr}(\rho(x_0)\,\rho(x))\, d\nu(x) = \mathrm{tr}(\rho(x_0)) = 1 \tag{7} \]
Hence, the Hilbertian formalism combined with the original measure ν produces the X-labeled family
of probability distributions:
\[ X \ni x_0,\, x \mapsto p_{x_0}(x) = \mathrm{tr}(\rho(x_0)\,\rho(x)) \tag{8} \]
on (X, ν). The nonnegative bounded function $p_{x_0}(x) \le 1$ measures in a certain sense the degree of
localization of x w.r.t. $x_0$, and vice versa, due to the symmetry $p_{x_0}(x) = p_x(x_0)$, on the measure space
(X, ν). If we consider the particular case where ρ(x) is a rank-one projector operator:
\[ \rho(x) = |x\rangle\langle x|, \quad \langle x|x\rangle = 1 \tag{9} \]
i.e., is a “pure coherent state” (see below), then:
\[ p_{x_0}(x) = |\langle x_0|x\rangle|^2 \tag{10} \]
Thus, we could be inclined to introduce the pseudo-distance (triangular inequality is not verified
in general):
\[ \delta(x, x') := \left[ -\ln \frac{\mathrm{tr}(\rho(x)\rho(x'))}{\sqrt{\mathrm{tr}((\rho(x))^2)\, \mathrm{tr}((\rho(x'))^2)}} \right]^{1/2} = \left[ -\ln \frac{p_x(x')}{\sqrt{p_x(x)\, p_{x'}(x')}} \right]^{1/2} \tag{11} \]
\[ = \delta(x', x) \in [0, \infty), \quad \delta(x, x) = 0 \tag{12} \]
Note that this quantity becomes infinite as $p_x(x') \to 0$. This limit corresponds to orthogonality of
vectors $|x\rangle$ and $|x'\rangle$ in the pure CS case.
Actually, from the fact that any density operator ρ is Hilbert–Schmidt, with norm
$\|\rho\| = \sqrt{\mathrm{tr}\,\rho\rho^\dagger} = \sqrt{\mathrm{tr}\,\rho^2}$, it is exact and could appear as more natural to introduce the
associated distance:
\[ d_{HS}(x, x') = \|\rho(x) - \rho(x')\| = \sqrt{\mathrm{tr}\,(\rho(x) - \rho(x'))^2} \tag{13} \]
In reality, this object forces any pair of points in X to be finitely separated, since we have:
\[ d_{HS}(x, x') = \sqrt{\mathrm{tr}\left((\rho(x))^2 + (\rho(x'))^2 - 2\rho(x)\rho(x')\right)} \le \sqrt{2}\, \sqrt{1 - \mathrm{tr}(\rho(x)\rho(x'))} \le \sqrt{2} \tag{14} \]
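The quantities in Equations (8) and (11)–(14) are easy to evaluate numerically. The following is a minimal sketch, assuming 2 × 2 density matrices parametrized by a Bloch vector; this parametrization, the sample states and all function names are illustrative choices, not part of the general construction.

```python
import math

def density_from_bloch(d):
    """2x2 density matrix (1/2)(I + d.sigma) for a Bloch vector d with |d| <= 1."""
    x, y, z = d
    return [[0.5 * (1 + z), 0.5 * complex(x, -y)],
            [0.5 * complex(x, y), 0.5 * (1 - z)]]

def trace_product(a, b):
    """tr(a b) for 2x2 Hermitian matrices (the result is real)."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2)).real

def pseudo_distance(a, b):
    """Pseudo-distance of Equation (11); infinite when tr(a b) = 0."""
    ratio = trace_product(a, b) / math.sqrt(trace_product(a, a) * trace_product(b, b))
    if ratio <= 0.0:
        return math.inf
    return math.sqrt(max(0.0, -math.log(ratio)))

def hs_distance(a, b):
    """Hilbert-Schmidt distance of Equation (13)."""
    diff = [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]
    return math.sqrt(max(0.0, trace_product(diff, diff)))

rho_a = density_from_bloch((0.0, 0.0, 1.0))   # pure state (illustrative choice)
rho_b = density_from_bloch((0.6, 0.0, 0.0))   # mixed state (illustrative choice)
overlap = trace_product(rho_a, rho_b)         # p_x(x') of Equation (8)
bound = math.sqrt(2.0) * math.sqrt(1.0 - overlap)  # middle term of Equation (14)
```

With these sample states, `hs_distance(rho_a, rho_b)` stays below `bound`, which itself stays below $\sqrt{2}$, while `pseudo_distance` diverges for two orthogonal pure states.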
In its general form, a density operator can be written as a statistical mixture of pure states:
\[ \rho(x) = \sum_i p_i(x)\, |\psi_i(x)\rangle\langle\psi_i(x)|, \quad \|\psi_i(x)\| = 1, \quad \sum_i p_i(x) = 1, \quad 0 \le p_i(x) \le 1 \tag{15} \]
Then, the corresponding probability distributions on (X, ν) read as:
\[ p_{x_0}(x) = \sum_{i,j} p_i(x_0)\, p_j(x)\, |\langle\psi_i(x_0)|\psi_j(x)\rangle|^2 \tag{16} \]
This can be viewed as the average of the random variable $|\langle\psi_i(x_0)|\psi_j(x)\rangle|^2 \in [0, 1]$ with discrete
probability distribution $(i, j) \mapsto p_i(x_0)\, p_j(x)$.
From the point of view of Bayesian statistical inference, we may treat X as the “parameter space of
interest”, ν as a probability measure a priori on X and then $p_{x_0}(x)$ as a probability density function on
X, a posteriori, given an “estimated” value $x_0$, where $x_0$ derives as a datum from some related random
device with probability density function family indexed by x ∈ X. Then, we would be interested in an
associated distance function on X to determine intervals of “x-distance” around the observed value $x_0$.
Note that for this “inferred” probability distribution on X, we have a POV measure, not an orthogonal
one. From the inference point of view, the inferred probability distribution in this context, in principle,
does not have a “frequency” or “ensemble” interpretation, similar to the case for a POV measure. It
is the “random experiment” with the probability density function family indexed by x ∈ X, which, in
principle, is repeatable and which would derive from a projector-valued (PV) measure. For example, see
Section 6.
5. Quantum World from Classical Probabilistic Distribution?
In the previous section, we derived from the “quantum” four-tuple $(X, \nu, \mathcal{H}, x \mapsto \rho(x))$ an X-indexed
family of “classical” probability distributions $p_{x_0}(x) = \mathrm{tr}(\rho(x_0)\,\rho(x))$. An interesting question then
arises: given such a classical family, is it possible to derive a quantum $x \mapsto \rho(x)$? If yes, is there
uniqueness? Can we loosely think of quantum formalism as a kind of “square root” of classical
probability formalism, as the quantum spin emerges from “square roots” (e.g., Dirac) of scalar wave
equations (e.g., Klein–Gordon)?
Let us attempt through a simple example to explore such possibilities. Let $X = \{x_1, x_2, \ldots, x_N\}$
be a finite set equipped with the measure:
\[ \int_X f(x)\, d\nu(x) := \sum_{i=1}^{N} \nu_i\, f(x_i), \quad \nu_i \ge 0 \tag{17} \]
A first observation has to be made concerning the existence of a family of N density matrices $\rho(x_i)$
acting on $\mathbb{C}^n$, i.e., Hermitian $n \times n$ matrices with unit trace, which resolve the identity w.r.t. this measure:
\[ \sum_{i=1}^{N} \nu_i\, \rho(x_i) = I \tag{18} \]
Taking the trace of each side of this equation yields the constraint on the set of weights $\nu_i$:
\[ \sum_{i=1}^{N} \nu_i = n \tag{19} \]
To simplify, we suppose that $\nu_i > 0$ for all i. In particular, if the measure is uniform, $\nu_i = \nu$
for all i, then $\nu = n/N$. Another point concerns the cardinality N of X versus the dimension n of H.
In its full generality, which means in the n-rank case, each $n \times n$ density matrix $\rho(x_i)$ is defined by
$n - 1 + 2 \times n(n-1)/2 = n^2 - 1$ real parameters. Moreover, in the present case, these N density
matrices are requested to satisfy the set of equations issued from Equation (18):
\[ \sum_{i=1}^{N} \nu_i\, \rho(x_i)_{ab} = \delta_{ab}, \quad 1 \le a \le b \le n \tag{20} \]
Due to Equation (19), they are not independent and represent $n^2 - 1$ real constraints. Moreover,
these constraints have to be supplemented by the (non-trivial) condition that, for all i, $\rho(x_i)$ is a positive
semi-definite matrix. This entails that we are left with a maximum of $N n^2 - N - n^2 + 1 = (N-1)(n^2-1)$
free parameters. Hence, as soon as n ≥ 2, free parameters exist as soon as N ≥ 2. Let us examine the
minimal non-trivial case N = n = 2. Equation (18) assumes the 2 × 2 matrix form:
\[ \nu \begin{pmatrix} a & b \\ \bar{b} & 1-a \end{pmatrix} + (2-\nu) \begin{pmatrix} a' & b' \\ \bar{b}' & 1-a' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad 0 \le \nu \le 2 \tag{21} \]
This linear relation between two positive matrices implies that they are simultaneously diagonalizable,
with respective eigenvalues $0 < \lambda,\ 1-\lambda < 1$, $0 < \lambda' = (1-\nu\lambda)/(2-\nu),\ 1-\lambda' < 1$, with
normalized eigenvectors $|e_1\rangle$, $|e_2\rangle$, forming an orthonormal basis of $\mathbb{C}^2$. Hence, Equation (21) is just a
trivial rewriting of the resolution of the identity in $\mathbb{C}^2$:
\[ (\nu\lambda + (2-\nu)\lambda')\, |e_1\rangle\langle e_1| + (\nu(1-\lambda) + (2-\nu)(1-\lambda'))\, |e_2\rangle\langle e_2| = |e_1\rangle\langle e_1| + |e_2\rangle\langle e_2| = I \tag{22} \]
A second observation is that if all $\rho(x_i)$ are rank one, i.e., $\rho(x_i) = |x_i\rangle\langle x_i|$, $\langle x_i|x_i\rangle = 1$,
then Equation (18) reads:
\[ \sum_{i=1}^{N} \nu_i\, |x_i\rangle\langle x_i| = I \tag{23} \]
which means that the set $\{\sqrt{\nu_i}\, |x_i\rangle\}$ is a Parseval frame [13–16]. Such an identity is possible if N ≥ n;
and if N = n, then $\nu_i = 1$ for all i, and $\{|x_i\rangle\}$ is an orthonormal basis.
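As an illustration of Equation (23) with N > n, the sketch below checks a Parseval frame of N = 3 unit vectors in the real plane (n = 2), separated by angles of 2π/3 and weighted uniformly by ν = n/N. The choice of this particular frame, and every name in the code, is an assumption made for the example only.

```python
import math

N, n = 3, 2
nu = [n / N] * N                                   # uniform weights, summing to n (Equation (19))
angles = [2.0 * math.pi * k / N for k in range(N)]
vectors = [(math.cos(t), math.sin(t)) for t in angles]   # unit vectors |x_i>

def frame_operator(weights, vecs):
    """Left-hand side of Equation (23): sum_i nu_i |x_i><x_i|."""
    op = [[0.0, 0.0], [0.0, 0.0]]
    for w, v in zip(weights, vecs):
        for a in range(2):
            for b in range(2):
                op[a][b] += w * v[a] * v[b]
    return op

S = frame_operator(nu, vectors)

# Probabilities p_ij = |<x_i|x_j>|^2 (Equation (10)) and their nu-weighted sums,
# which must equal 1 by the discrete form of Equation (7).
p = [[(vi[0] * vj[0] + vi[1] * vj[1]) ** 2 for vj in vectors] for vi in vectors]
row_sums = [sum(nu[j] * p[i][j] for j in range(N)) for i in range(N)]
```

Here `S` reproduces the 2 × 2 identity up to rounding, and each entry of `row_sums` equals one, although the three vectors are not mutually orthogonal.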
Suppose that a family $p_{ij} = p_{x_i}(x_j) = p_{x_j}(x_i)$ of N probability distributions is defined on the measure
space (X, ν), i.e., a set of N(N + 1)/2 non-negative numbers $p_{ij} = p_{ji}$ obeying:
\[ \sum_{j=1}^{N} \nu_j\, p_{ij} = 1, \quad i = 1, 2, \ldots, N \tag{24} \]
Therefore, we are left with N(N + 1)/2 − N = N(N − 1)/2 free parameters. Inspired by
Equation (8), we attempt to determine a set of N density matrices $\rho(x_i)$ from the following identities:
\[ \mathrm{tr}(\rho(x_i)\, \rho(x_j)) = p_{ij} = p_{x_i}(x_j) = p_{x_j}(x_i) \tag{25} \]
Now, Equation (25) leads to the set of N + N(N − 1)/2 = N(N + 1)/2 real quadratic equations:
\[ p_{ij} = \sum_{1 \le a \le n} \rho(x_i)_{aa}\, \rho(x_j)_{aa} + 2\, \mathrm{Re} \sum_{1 \le a < b \le n} \rho(x_i)_{ab}\, \overline{\rho(x_j)_{ab}} \tag{26} \]
Actually, these are not independent, since, for each i, applying $\sum_{j=1}^{N} \nu_j$ on each side gives one.
Therefore, N(N − 1)/2 of these equations are independent. The following necessary condition results:
\[ N(N-1)/2 \le N n^2 - N - n^2 + 1 \iff N^2 - N(2n^2 - 1) + 2n^2 - 2 \le 0 \tag{27} \]
for having nontrivial solutions, and uniqueness might hold with $N^2 - N(2n^2-1) + 2n^2 - 2
= (N-1)(N - 2n^2 + 2) = 0$. Hence, Condition Equation (27) defines the allowed range for N with
respect to n:
\[ 1 \le N \le 2n^2 - 2 \tag{28} \]
On the other hand, in the minimal case corresponding to rank-one density matrices $\rho(x_i) = |x_i\rangle\langle x_i|$,
i.e., coherent states, the probabilities are given by:
\[ p_{ij} = \mathrm{tr}(\rho(x_i)\rho(x_j)) = |\langle x_i|x_j\rangle|^2 := \cos^2(\theta_{ij}) \tag{29} \]
Hence, these probabilities must obey the N constraints $p_{ii} = 1$ to be added to the N ones
Equation (24). This means that we are left with N(N − 1)/2 − N = N(N − 3)/2 free parameters.
Let us now express the resolution of the identity Equation (23) in terms of the respective coordinates $\xi_l^i$
of vectors $|x_i\rangle$ with respect to an orthonormal basis $\{|e_l\rangle\}$ in $\mathbb{C}^n$:
\[ \sum_{i=1}^{N} \nu_i\, |x_i\rangle\langle x_i| = \sum_{l,l'=1}^{n} \left[ \sum_{i=1}^{N} \nu_i\, \xi_l^i\, \overline{\xi_{l'}^i} \right] |e_l\rangle\langle e_{l'}| = I \iff \sum_{i=1}^{N} \nu_i\, \xi_l^i\, \overline{\xi_{l'}^i} = \delta_{ll'} \tag{30} \]
Now, each projector $\rho(x_i) = |x_i\rangle\langle x_i|$ is defined a priori by 2n − 2 real coordinates (one constraint is
for normalization, $\mathrm{tr}(|x_i\rangle\langle x_i|) = \langle x_i|x_i\rangle = 1$, the other one being for an arbitrary phase). There are N
such projectors, so there are 2N(n − 1) real parameters. From Equation (30) the latter are submitted to:
• n − 1 independent real constraints issued from the diagonal $l = l'$,
• n(n − 1) real independent constraints issued from the off-diagonals $l \ne l'$.
Hence, like in Equation (27), we obtain the necessary condition:
\[ N(N-1)/2 - N \le 2N(n-1) - n^2 + 1 \iff N^2 - N(4n-1) + 2n^2 - 2 \le 0 \tag{31} \]
for having nontrivial solutions, and the uniqueness (up to n phases) might hold with
$N^2 - N(4n-1) + 2n^2 - 2 = 0$. This is possible for N in the range:
\[ \max\left( n,\ \tfrac{1}{2}\left[ 4n - 1 - \sqrt{8n^2 - 8n + 9} \right] \right) < N \le \tfrac{1}{2}\left[ 4n - 1 + \sqrt{8n^2 - 8n + 9} \right] \tag{32} \]
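The two counting conditions can be tabulated for small n. The sketch below is only a convenience for reading Equations (28) and (32); it assumes the quadratic $N^2 - N(4n-1) + 2n^2 - 2$ obtained by clearing denominators in the first inequality of Equation (31), whose discriminant is $8n^2 - 8n + 9$. Function names are hypothetical.

```python
import math

def full_rank_range(n):
    """Bounds (N_min, N_max) of Equation (28) for generic n x n density matrices."""
    return (1, 2 * n * n - 2)

def rank_one_range(n):
    """Integers N with max(n, N_minus) < N <= N_plus, where N_minus and N_plus
    are the roots of N^2 - N(4n - 1) + 2n^2 - 2 (rank-one, coherent state case)."""
    disc = math.sqrt(8 * n * n - 8 * n + 9)
    lower = max(n, 0.5 * (4 * n - 1 - disc))
    upper = 0.5 * (4 * n - 1 + disc)
    # The upper root is always smaller than 4n, so scanning up to 4n is enough.
    return [N for N in range(1, 4 * n + 1) if lower < N <= upper]
```

For instance, with n = 2 the discriminant equals 25, the roots are 1 and 6, and the rank-one condition selects N = 3, 4, 5, 6.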
6. POVM from Coherent States
In this section we describe a simple method [17] for obtaining coherent states $|x\rangle$, such that
$\rho(x) = |x\rangle\langle x|$. We start from another measure space (X, µ) and consider the Hilbert space $L^2(X, \mu)$
of complex square integrable functions on X with respect to the measure µ. One then chooses in it an
orthonormal set O of functions $\phi_n(x)$ (set aside the question of the evaluation map in their respective
equivalence classes), satisfying the finiteness and positiveness conditions:
\[ 0 < \mathcal{N}(x) := \sum_n |\phi_n(x)|^2 < \infty \quad \text{(a.e.)} \tag{33} \]
and in one-to-one correspondence with the elements of an orthonormal basis $\{|e_n\rangle\}$ of the Hilbert
space H:
\[ |e_n\rangle \leftrightarrow \phi_n \tag{34} \]
There results a family C of unit vectors $|x\rangle$, the coherent states, in H, which are labeled by elements
of X and which resolve the identity operator in H with respect to the measure:
\[ d\nu(x) = \mathcal{N}(x)\, d\mu(x) \tag{35} \]
\[ X \ni x \mapsto |x\rangle = \frac{1}{\sqrt{\mathcal{N}(x)}} \sum_n \overline{\phi_n(x)}\, |e_n\rangle \in \mathcal{H} \tag{36} \]
\[ \langle x|x\rangle = 1, \quad \int_X |x\rangle\langle x|\, \mathcal{N}(x)\, d\mu(x) = \int_X |x\rangle\langle x|\, d\nu(x) = I \tag{37} \]
X X
This certainly represents the most straightforward way to build total families of states resolving
the identity in H. Underlying the construction, there is a Bayesian content [18], based or not on
experimental evidence or on selective information choice, namely, an interplay between the set of
probability distributions:
\[ x \mapsto |\phi_n(x)|^2 \quad \text{from} \quad \int_X |\phi_n(x)|^2\, d\mu(x) = 1 \tag{38} \]
labeled by n, on the classical measure space (X, µ), and the discrete set of probability distributions:
\[ n \mapsto |\phi_n(x)|^2 / \mathcal{N}(x) \quad \text{from} \quad \mathcal{N}(x) = \sum_n |\phi_n(x)|^2 \tag{39} \]
In this CS case, the probability distribution:
\[ p_{x_0}(x) = |\langle x_0|x\rangle|^2 = |K(x_0, x)|^2 \tag{40} \]
is expressed in terms of the reproducing kernel K w.r.t. the measure dν(x):
\[ K(x, x') = \langle x|x'\rangle = \frac{1}{\sqrt{\mathcal{N}(x)\, \mathcal{N}(x')}} \sum_n \phi_n(x)\, \overline{\phi_n(x')} \tag{41} \]
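The construction of this section can be tried on a finite toy model. The sketch below assumes X = {0, …, M − 1} with uniform measure µ({x}) = 1/M and the d orthonormal functions $\phi_n(x) = e^{2\pi i n x / M}$, n = 0, …, d − 1; these choices, the convention that the components of $|x\rangle$ are the complex conjugates of $\phi_n(x)$ divided by $\sqrt{\mathcal{N}(x)}$, and all names are assumptions of the example.

```python
import cmath
import math

M, d = 5, 3                      # |X| = M points, Hilbert space dimension d <= M
mu = [1.0 / M] * M               # uniform measure on X

def phi(n, x):
    """Orthonormal functions in L^2(X, mu): discrete Fourier modes."""
    return cmath.exp(2j * math.pi * n * x / M)

def norm_factor(x):
    """N(x) = sum_n |phi_n(x)|^2, as in Equation (33)."""
    return sum(abs(phi(n, x)) ** 2 for n in range(d))

def coherent_state(x):
    """Components of |x> on the basis |e_n>, normalized to unit length."""
    s = math.sqrt(norm_factor(x))
    return [phi(n, x).conjugate() / s for n in range(d)]

nu = [norm_factor(x) * mu[x] for x in range(M)]    # d nu = N(x) d mu, Equation (35)

def resolution():
    """sum_x nu(x) |x><x|, expected to be the d x d identity (Equation (37))."""
    op = [[0j] * d for _ in range(d)]
    for x in range(M):
        v = coherent_state(x)
        for a in range(d):
            for b in range(d):
                op[a][b] += nu[x] * v[a] * v[b].conjugate()
    return op

def kernel(x, y):
    """Reproducing kernel K(x, y) = <x|y>."""
    vx, vy = coherent_state(x), coherent_state(y)
    return sum(a.conjugate() * b for a, b in zip(vx, vy))

identity_check = resolution()
total_probability = sum(nu[y] * abs(kernel(0, y)) ** 2 for y in range(M))
```

In this toy model $\mathcal{N}(x) = d$ for every x, the operator returned by `resolution()` is the identity up to rounding, and the distribution $y \mapsto |K(0, y)|^2$ has total ν-mass one, as Equation (7) requires.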
7. POVM Integral Quantization
With the above material at hand, the integral quantization of complex-valued functions f (x) ∈ C(X)
is formally defined as the linear map:
\[ f \mapsto A_f = \int_X f(x)\, \rho(x)\, d\nu(x) \tag{42} \]
This map is properly defined if the operator $A_f \in \mathcal{A}(\mathcal{H})$ is understood as the sesquilinear form:
\[ B_f(\psi_1, \psi_2) = \int_X f(x)\, \langle\psi_1|\rho(x)|\psi_2\rangle\, d\nu(x) \tag{43} \]
defined on a dense subspace of H. If f is real and at least semi-bounded and since ρ (x) is positive, the
Friedrichs extension [19] of Bf univocally defines a self-adjoint operator. If f is not semi-bounded, there
is no natural choice of a self-adjoint operator associated with Bf . In this last case, in order to construct
Af as an observable, we need to know more about the space of states H in order to examine the existence
of self-adjoint extensions (e.g., boundary conditions in the case of domains defined for wave functions).
Note that the above quantization may be extended to objects that are more general than functions.
We think of course of distributions if the relevant structure of X allows one to properly define them.
Suppose that the measure set (X, ν) is also a smooth manifold of dimension n, on which is defined the
space $\mathcal{D}'(X)$ of distributions as the topological dual of the (LF)-space $\Omega_c^n(X)$ of compactly supported
n-forms on X [20]. Here, “LF” is for “inductive limit of sequence of Fréchet spaces”. Some of these
distributions, e.g., $\delta(u(x))$ or $\chi_\Delta(x)\, u(x)$, where $\chi_\Delta(x)$ is the characteristic function of ∆ ⊂ X, express
geometrical constraints. Extending the map Equation (42) yields the quantum version $A_{\delta \circ u}$ or $A_{\chi_\Delta u}$ of
these constraints.
A different starting point for quantizing constraints, more in Dirac’s spirit [21], would consist of
quantizing the function $u \mapsto A_u$ and determining the kernel of the operator $A_u$. Both methods are
obviously not equivalent, except for a few cases. This question of equivalence/difference gives rise
to controversial opinions in fields like quantum gravity or quantum cosmology. Elementary examples
illustrating this difference are worked out in [9].
8. Semi-Classical Aspects and Quantum Measurements through Lower Symbols
We arrive at the point where the probability distribution Equation (8) makes sense in regard to the
objects f (functions or more singular entities) to be quantized. Indeed, some of the properties (if not all)
of the operator $A_f$ can be grasped by examining the function $\check{f}(x)$ defined as:
\[ A_f \mapsto \check{f}(x) := \mathrm{tr}(\rho(x)\, A_f) \tag{44} \]
and named, within the context of Berezin quantization [22,23], lower (Lieb) or covariant (Berezin)
symbols. Now, this quantity represents the local averaging of the original f with respect to the probability
distribution Equation (8):
\[ f(x) \mapsto \check{f}(x) = \int_X f(x')\, \mathrm{tr}(\rho(x)\rho(x'))\, d\nu(x') = \int_X f(x')\, p_x(x')\, d\nu(x') \tag{45} \]
This construction is a generalization of the so-called Bargmann–Segal transform (see, for
instance, [24,25]). It can also be viewed as a kind of Wigner function [5] endowed with a real
probabilistic content. In addition to the functional properties of the lower symbol $\check{f}$, one may
investigate certain quantum features, such as, e.g., spectral properties of Af . Furthermore, the map
Equation (45) represents in general a regularization of the original, possibly extremely singular, f .
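The chain from Equation (42) to Equation (45) can be followed on a finite toy POVM. The sketch below assumes K = 5 equally spaced angles with uniform weights 2/K and real 2 × 2 density matrices $\frac{1}{2}(I + r M_\theta)$, where $M_\theta$ has rows (cos 2θ, sin 2θ) and (sin 2θ, −cos 2θ); this family and all names are hypothetical choices used only to make the formulas concrete.

```python
import math

K, r = 5, 0.7
thetas = [2.0 * math.pi * k / K for k in range(K)]   # the finite set X
nu = [2.0 / K] * K                                    # weights, summing to dim H = 2

def rho(theta):
    """Mixed state (1/2)(I + r M_theta) attached to the point theta."""
    c, s = math.cos(2.0 * theta), math.sin(2.0 * theta)
    return [[0.5 * (1 + r * c), 0.5 * r * s],
            [0.5 * r * s, 0.5 * (1 - r * c)]]

def trace_product(a, b):
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def quantize(f):
    """Discrete version of Equation (42): A_f = sum_k nu_k f(theta_k) rho(theta_k)."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    for w, t in zip(nu, thetas):
        m = rho(t)
        for i in range(2):
            for j in range(2):
                A[i][j] += w * f(t) * m[i][j]
    return A

def lower_symbol_trace(f, theta):
    """Equation (44): tr(rho(theta) A_f)."""
    return trace_product(rho(theta), quantize(f))

def lower_symbol_average(f, theta):
    """Equation (45): average of f against p_theta(theta') = tr(rho(theta) rho(theta'))."""
    return sum(w * f(t) * trace_product(rho(theta), rho(t)) for w, t in zip(nu, thetas))

A_one = quantize(lambda t: 1.0)          # the constant function 1
A_cos = quantize(math.cos)               # a real function
```

In this example `A_one` is the identity, `A_cos` is symmetric, and the two expressions of the lower symbol coincide at every point of X, which is just linearity of the trace.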
Another point deserves to be mentioned here. It concerns the analogy of the present formalism with
quantum measurement. In a quantum physics context for which $A_f$ is a self-adjoint operator or
observable of a system and given a density operator $\rho_m = \sum_i q_i\, |\phi_i\rangle\langle\phi_i|$ describing the mixed state
of an ensemble, such that each of the pure states $|\phi_i\rangle$ occurs with probability $q_i$, the expectation value of
the measurement is given by the “unsharp” representation:
\[ \mathrm{tr}(\rho_m A_f) = \int_X f(x)\, \mathrm{tr}(\rho_m\, \rho(x))\, d\nu(x) \tag{46} \]
Hence, it can be also viewed as the average of the original f with respect to the probability density:
pm (x) := tr(ρm ρ(x)) (47)
Of course, this ρm can be one element ρm = ρ(x0 ) of the family of density operators from which is
issued the considered quantization. Inspired by ideas developed during the two last decades by various
authors, particularly Busch, Grabowski and Lahti in “Operational Quantum Physics” [26] and Holevo
in “Probabilistic and Statistical Aspects of Quantum Theory” [27], we turn our attention to the classical
“smeared” form, such as described in these books. If one validates the assumption that any quantum
observable is issued from our POVM quantization procedure, then its measurement can be expressed
as in Equation (46). This should shed new classical light on the quantum perspective, since the usual
integral representation of tr (ρm Af ), namely:
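\[ \mathrm{tr}(\rho_m A_f) = \int_{\mathbb{R}} \lambda\, \mathrm{tr}(\rho_m\, dE_f(\lambda)) \tag{48} \]
is issued from the spectral decomposition:
\[ A_f = \int_{\mathbb{R}} \lambda\, dE_f(\lambda) \tag{49} \]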
of the self-adjoint Af with spectral measure dEf (λ) and is interpreted as a “sharp” measurement. In
this regard, Equation (46) might be viewed as an unsharp measurement, possibly through some marginal
integration [26,27].
We point out the “circular” nature of our procedure. On the one hand, we use POVM to quantize
classical functions. On the other hand, we obtain a POVM quantum measurement, interpreted as an
inverse transform yielding a “semi-classical object”, which, in the statistical inference context, yields an
inferred probability distribution. In that sense, we treat quantization and measurement as two aspects of
the same construct.
Of course, the projector-valued (PV) spectral measure Ef corresponding to the integral representation
Equation (49) might have a remote connection with the classical spectrum {f (x) , x ∈ X} appearing
in the integral representation Equation (42). While the POVM used for quantization, and built from
a family ρ(x) resolving the identity with respect to the fixed measure ν, should be considered as a
frame to analyze functions on X, the PV measure in Equation (49) is proper to the quantum observable
and to functions of it. However, there are simple examples (consider the quantum versions of position
and momentum obtained from coherent state quantization) where classical and quantum spectra can be
considered as identical regardless of the difference between their respective PV and POVM. Moreover,
the frame $x \mapsto \rho(x)$ itself may be associated with a specific system to be quantized. A nice pedagogical
example (the sea star) is presented in chapter 11 of [6].
9. Covariant POVM Quantizations
In explicit constructions of density operator families and related POVM quantization, the theory of
Lie group representations offers a wide range of possibilities. Let G be a Lie group with left Haar
measure dµ(g), and let g 7→ U (g) be a unitary irreducible representation (UIR) of G in a Hilbert space
H. Pick a density operator ρ on H and let us transport it under the action of the representation operators
U (g). Its orbit is the family of density operators:
\[ \rho(g) := U(g)\, \rho\, U^\dagger(g), \quad \rho(e) = \rho \tag{50} \]
Suppose that the operator:
\[ R := \int_G \rho(g)\, d\mu(g) \tag{51} \]
is defined in a weak sense. From the left invariance of dµ(g) we have:
\[ U(g_0)\, R\, U^\dagger(g_0) = \int_G \rho(g_0 g)\, d\mu(g) = R \tag{52} \]
and so R commutes with all operators U(g), g ∈ G. Thus, from Schur’s lemma, $R = c_\rho I$, with:
\[ c_\rho = \int_G \mathrm{tr}(\rho_0\, \rho(g))\, d\mu(g) \tag{53} \]
where the density operator $\rho_0$ is chosen in order to make the integral converge. This family of operators
provides the following resolution of the identity:
\[ \int_G \rho(g)\, d\nu(g) = I, \quad d\nu(g) := \frac{d\mu(g)}{c_\rho} \tag{54} \]
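The mechanism of Equations (50)–(54) can be observed on a finite group, where the Haar measure is the counting measure. The sketch below uses, as a stand-in for G, the eight-element quaternion group realized by the matrices ±I, ±iσ_x, ±iσ_y, ±iσ_z acting irreducibly on $\mathbb{C}^2$; replacing a Lie group by this finite group, the sample density matrix and all names are assumptions of the example.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

IDENTITY = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

# Quaternion group {±1, ±i sigma_x, ±i sigma_y, ±i sigma_z}: a unitary irreducible
# representation on C^2, used here as a finite stand-in for G.
GROUP = [scale(s, IDENTITY) for s in (1, -1)] + [
    scale(s * 1j, p) for p in (SIGMA_X, SIGMA_Y, SIGMA_Z) for s in (1, -1)]

def transport(rho, u):
    """Equation (50): rho(g) = U(g) rho U(g)^dagger."""
    return matmul(matmul(u, rho), dagger(u))

def group_average(rho):
    """Equation (51) with the counting measure: R = sum_g rho(g)."""
    total = [[0j, 0j], [0j, 0j]]
    for u in GROUP:
        m = transport(rho, u)
        for i in range(2):
            for j in range(2):
                total[i][j] += m[i][j]
    return total

rho = [[0.8, 0.1 + 0.2j], [0.1 - 0.2j, 0.2]]   # a sample density matrix
R = group_average(rho)
c_rho = trace(matmul(rho, R)).real             # Equation (53) with rho_0 = rho
resolved = scale(1.0 / c_rho, R)               # Equation (54)
```

For this group R equals 4 I whatever the density matrix, so that $c_\rho = 4$ (the order of the group divided by the dimension) and `resolved` is the identity.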
Let us examine in more detail the above procedure in the case of square integrable UIRs (e.g., affine
group, see below). For a square-integrable UIR U for which $|\eta\rangle$ is an admissible unit vector, i.e.,
\[ c(\eta) := \int_G d\mu(g)\, |\langle\eta|\, U(g)\, |\eta\rangle|^2 < \infty \tag{55} \]
the resolution of the identity is obeyed by the family of coherent states for the group G:
\[ |\eta_g\rangle\langle\eta_g| = \rho(g), \quad \rho := |\eta\rangle\langle\eta|, \quad |\eta_g\rangle = U(g)\, |\eta\rangle \tag{56} \]
This property is easily extended to square-integrable UIR U for which ρ is an “admissible” density
operator, $c(\eta) = \int_G d\mu(g)\, |\mathrm{tr}(\rho\, U(g))|^2 < \infty$. The resolution of the identity then is obeyed by the
family: $\rho(g) = U(g)\, \rho\, U^\dagger(g)$.
This allows an integral quantization of complex-valued functions on the group:
\[ f \mapsto A_f = \int_G \rho(g)\, f(g)\, d\nu(g) \tag{57} \]
which is covariant in the sense that:
\[ U(g)\, A_f\, U^\dagger(g) = A_{U_r(g) f} \tag{58} \]
In the case when $f \in L^2(G, d\mu(g))$, the quantity $(U_r(g) f)(g') := f(g^{-1} g')$ is the regular
representation. From the lower symbol, we obtain a generalization of the Berezin or heat kernel
transform on G:
\[ \check{f}(g) := \int_G \mathrm{tr}(\rho(g)\, \rho(g'))\, f(g')\, d\nu(g') \tag{59} \]
In the absence of square-integrability over G, there exists a definition of square-integrable covariant
coherent states with respect to a left coset manifold X = G/H, with H a closed subgroup of G, equipped
with a quasi-invariant measure ν [6].
10. The Example of the Unit Circle
We start our series of examples with one of the most elementary ones. Actually, it is rich both in
fundamental aspects and pedagogical resources. The measure set is the unit circle equipped with its
uniform (Lebesgue) measure:
\[ X = S^1, \quad d\nu(x) = \frac{d\theta}{\pi}, \quad \theta \in [0, 2\pi) \tag{60} \]
The Hilbert space is the Euclidean plane $\mathcal{H} = \mathbb{R}^2$. The group G is the group SO(2) of rotations in
the plane. As described at length in Appendix A.1, the most general form of a real density matrix can be
given, as a π-periodic matrix, in terms of the polar coordinates (r, φ) of a point in the unit disk:
\[ \rho_{r,\phi} = \begin{pmatrix} \frac{1}{2} + \frac{r}{2}\cos 2\phi & \frac{r}{2}\sin 2\phi \\ \frac{r}{2}\sin 2\phi & \frac{1}{2} - \frac{r}{2}\cos 2\phi \end{pmatrix} = \rho_{r,\phi+\pi}, \quad 0 \le r \le 1, \quad 0 \le \phi < \pi \tag{61} \]
We notice that for r = 1, the density matrix is just the orthogonal projector on the unit vector $|\phi\rangle$ with
polar angle φ:
\[ \rho_{1,\phi} = \begin{pmatrix} \cos^2\phi & \cos\phi \sin\phi \\ \cos\phi \sin\phi & \sin^2\phi \end{pmatrix} = |\phi\rangle\langle\phi| = |\phi+\pi\rangle\langle\phi+\pi| \tag{62} \]
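Due to the covariance property Equation (A16), we define the family of density operators:
\[ \rho_{r,\phi}(\theta) = R(\theta)\, \rho_{r,\phi}\, R(-\theta) = \rho_{r,\phi+\theta}, \quad 0 \le \theta < 2\pi \tag{63} \]
where the rotation matrix R(θ) is defined by Equation (A10). This family resolves the identity:
\[ \int_0^{2\pi} \rho_{r,\phi}(\theta)\, \frac{d\theta}{\pi} = I \tag{64} \]
There follows the $S^1$-labeled family of probability distributions on $(S^1, d\theta/\pi)$:
\[ p_{\theta_0}(\theta) = \mathrm{tr}(\rho_{r,\phi}(\theta_0)\, \rho_{r,\phi}(\theta)) = \frac{1}{2}\left( 1 + r^2 \cos 2(\theta - \theta_0) \right) \tag{65} \]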
Such an expression reminds us of the cardioid distribution (see [28], page 51). At r = 0, we get the
uniform probability on the circle, whereas at r = 1, we get the “pure state” probability distribution:
\[ p_{\theta_0}(\theta) = \cos^2(\theta - \theta_0) \tag{66} \]
Hence, the parameter r can be thought of as the inverse of a “noise” temperature r ∝ 1/T. The
pseudo-distance on $S^1$ associated with Equation (65) is given by:
\[ \delta_r^2(\theta, \theta') = -\ln \frac{1 + r^2 \cos 2(\theta - \theta')}{1 + r^2} \tag{67} \]
which reduces at small θ − θ′ to:
\[ \delta_r(\theta, \theta') \approx \frac{\sqrt{2}\, r}{\sqrt{1 + r^2}}\, |\theta - \theta'| \tag{68} \]
On the other hand, the distance $d_{HS}$ defined by Equation (13) reads in the present case:
\[ d_{r;HS}(\theta, \theta') = \sqrt{\mathrm{tr}\,(\rho_{r,\phi}(\theta) - \rho_{r,\phi}(\theta'))^2} = \sqrt{2}\, r\, |\sin(\theta - \theta')| \tag{69} \]
which reduces at small θ − θ′ to Equation (68) up to a constant factor.
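These circle formulas lend themselves to a direct numerical check. The sketch below assumes the explicit matrix form of Equation (61) with φ replaced by φ + θ, and approximates the integral of Equation (64) by an equally spaced rectangle rule (exact up to rounding for trigonometric polynomials of this degree); the sample values of r and φ and all names are illustrative.

```python
import math

def rho(r, phi, theta=0.0):
    """Real density matrix rho_{r,phi}(theta) = rho_{r,phi+theta}."""
    a = 2.0 * (phi + theta)
    return [[0.5 + 0.5 * r * math.cos(a), 0.5 * r * math.sin(a)],
            [0.5 * r * math.sin(a), 0.5 - 0.5 * r * math.cos(a)]]

def integrate_family(r, phi, samples=64):
    """Rectangle-rule value of int_0^{2 pi} rho_{r,phi}(theta) d theta / pi."""
    h = 2.0 * math.pi / samples
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(samples):
        m = rho(r, phi, k * h)
        for i in range(2):
            for j in range(2):
                total[i][j] += m[i][j] * h / math.pi
    return total

def probability(r, phi, theta0, theta):
    """p_{theta0}(theta) = tr(rho(theta0) rho(theta))."""
    a, b = rho(r, phi, theta0), rho(r, phi, theta)
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def hs_distance(r, phi, theta1, theta2):
    """Hilbert-Schmidt distance between rho(theta1) and rho(theta2)."""
    a, b = rho(r, phi, theta1), rho(r, phi, theta2)
    return math.sqrt(sum((a[i][j] - b[i][j]) ** 2 for i in range(2) for j in range(2)))

r, phi = 0.6, 0.3
identity_check = integrate_family(r, phi)
p_value = probability(r, phi, 0.2, 1.1)
d_value = hs_distance(r, phi, 0.2, 1.1)
```

With r = 0.6, `p_value` agrees with $\frac{1}{2}(1 + r^2 \cos 2(\theta - \theta_0))$ and `d_value` with $\sqrt{2}\, r\, |\sin(\theta - \theta')|$, independently of φ.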
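The quantization of a function (or distribution) f(θ) on the circle based on Equation (64) leads to the
2 × 2 matrix operator:
\[ f \mapsto A_f = \int_0^{2\pi} f(\theta)\, \rho_{r,\phi}(\theta)\, \frac{d\theta}{\pi} = \begin{pmatrix} \langle f\rangle + \frac{r}{2}\, C_c(R_{\phi} f) & \frac{r}{2}\, C_s(R_{\phi} f) \\ \frac{r}{2}\, C_s(R_{\phi} f) & \langle f\rangle - \frac{r}{2}\, C_c(R_{\phi} f) \end{pmatrix} \tag{70} \]
where $\langle f\rangle := \frac{1}{2\pi} \int_0^{2\pi} f(\theta)\, d\theta$ is the average of f on the unit circle and $R_\phi(f)(\theta) := f(\theta - \phi)$. The
symbols $C_c$ and $C_s$ are for the cosine and sine doubled angle Fourier coefficients of f:
\[ C_c(f) = \int_0^{2\pi} f(\theta) \cos 2\theta\, \frac{d\theta}{\pi}, \quad C_s(f) = \int_0^{2\pi} f(\theta) \sin 2\theta\, \frac{d\theta}{\pi} \tag{71} \]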
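The simplest function to be quantized is the angle function $\gimel(\theta)$, i.e., the 2π-periodic extension of
$\gimel(\theta) = \theta$ for θ ∈ [0, 2π),
\[ A_\gimel = \begin{pmatrix} \pi + \frac{r}{2}\sin 2\phi & -\frac{r}{2}\cos 2\phi \\ -\frac{r}{2}\cos 2\phi & \pi - \frac{r}{2}\sin 2\phi \end{pmatrix} \tag{72} \]
Its eigenvalues are $\pi \pm \frac{r}{2}$ with corresponding eigenvectors $\left|\phi \mp \frac{\pi}{4}\right\rangle$. Its lower symbol is given by the
smooth function:
\[ \check{\gimel}(\theta) = \pi - \frac{r^2}{2} \sin 2\theta \tag{73} \]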
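The quantized angle can be recomputed by brute-force quadrature. The sketch below assumes the matrix form of Equation (61) with φ replaced by φ + θ, integrates θ ρ(θ) dθ/π over [0, 2π) with a midpoint rule, and compares the result with the closed form of Equation (72); the sample values r = 0.8, φ = 0.3 and all names are illustrative.

```python
import math

def rho(r, phi, theta):
    """Real density matrix rho_{r,phi}(theta) = rho_{r,phi+theta}."""
    a = 2.0 * (phi + theta)
    return [[0.5 + 0.5 * r * math.cos(a), 0.5 * r * math.sin(a)],
            [0.5 * r * math.sin(a), 0.5 - 0.5 * r * math.cos(a)]]

def quantize_angle(r, phi, samples=20000):
    """Midpoint rule for A = int_0^{2 pi} theta rho_{r,phi}(theta) d theta / pi."""
    h = 2.0 * math.pi / samples
    A = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(samples):
        theta = (k + 0.5) * h
        m = rho(r, phi, theta)
        for i in range(2):
            for j in range(2):
                A[i][j] += theta * m[i][j] * h / math.pi
    return A

def closed_form(r, phi):
    """Matrix of Equation (72)."""
    s, c = math.sin(2.0 * phi), math.cos(2.0 * phi)
    return [[math.pi + 0.5 * r * s, -0.5 * r * c],
            [-0.5 * r * c, math.pi - 0.5 * r * s]]

def eigenvalues_symmetric(A):
    """Eigenvalues (ascending) of a real symmetric 2x2 matrix."""
    mean = 0.5 * (A[0][0] + A[1][1])
    radius = math.hypot(0.5 * (A[0][0] - A[1][1]), A[0][1])
    return (mean - radius, mean + radius)

def lower_symbol(r, phi, theta):
    """tr(rho_{r,phi}(theta) A) with A taken from the closed form."""
    A, m = closed_form(r, phi), rho(r, phi, theta)
    return sum(m[i][j] * A[j][i] for i in range(2) for j in range(2))

r, phi = 0.8, 0.3
A_num = quantize_angle(r, phi)
A_exact = closed_form(r, phi)
low, high = eigenvalues_symmetric(A_num)
```

The numerical matrix matches the closed form to quadrature accuracy, its eigenvalues are π ∓ r/2, and for these matrices the trace in `lower_symbol` evaluates to $\pi - \frac{r^2}{2} \sin 2\theta$, which does not depend on φ.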
11. The Example of the Unit Two-Sphere
The measure set is the unit sphere equipped with its rotationally invariant measure:
\[ X = S^2, \quad d\nu(x) = \frac{\sin\theta\, d\theta\, d\phi}{2\pi}, \quad \theta \in [0, \pi], \quad \phi \in [0, 2\pi) \tag{74} \]
The Hilbert space is now H = C2 . The group G is the group SU(2) of 2 × 2-unitary matrices with
determinant one. We give in Appendix A.2 the essential notations and relations with quaternions.
Axioms 2015, 4 14
The unit ball B in R3 parametrizes the set of 2 × 2 complex density matrices ρ. Indeed, given a
three-vector d~ ∈ R3 , such that kdk
~ ≤ 1, a general density matrix ρ can be written as:
1
ρ ≡ ρd~ = (1 − i d) (75)
2 ∽
We have used for convenience the quaternionic representation d~ ≡ (0, d) ∈ H of the vector d~ ∈ R3
∽
~ = 1, i.e., d~ ∈ S 2 (“Bloch sphere” in this context), with spherical
(see Appendix A.2 for details). If kdk
coordinates (θ, φ), then ρ is the pure state:
ρ = |θ, φi hθ, φ| (76)
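Note that the above column vector has to be viewed as the spin j = 1/2 coherent state in the Hermitian space $\mathbb{C}^2$ with orthonormal basis $|j = 1/2, m = \pm 1/2\rangle$:
\[
|\theta, \phi\rangle = \cos\frac{\theta}{2} \left| \frac{1}{2}, \frac{1}{2} \right\rangle + \sin\frac{\theta}{2}\, e^{i\phi} \left| \frac{1}{2}, -\frac{1}{2} \right\rangle
\tag{77}
\]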
Let us now transport the density matrix ρ by using the two-dimensional complex representation of rotations in space, namely the matrix SU(2) representation. For ξ ∈ SU(2), one defines the family of density matrices labeled by ξ:
\[
\rho_{\vec{d}}(\xi) := \xi \rho \bar{\xi} = \frac{1}{2} \left( 1 - i\, \xi\, \underset{\sim}{d}\, \bar{\xi} \right)
\tag{78}
\]
In order to get a one-to-one correspondence with the points of the two-sphere, we restrict the elements of SU(2) to those corresponding to the rotation $R_{\theta,\phi}$, bringing the unit vector $\hat{k}$ pointing to the North Pole to the vector with spherical coordinates (θ, φ), as described in Equation (A23):
\[
\rho_{\vec{d}}(\theta, \phi) := \xi(R_{\theta,\phi})\, \rho_{\vec{d}}\, \bar{\xi}(R_{\theta,\phi})
\tag{79}
\]
with:
\[
\xi(R_{\theta,\phi}) = \left( \cos\frac{\theta}{2},\ \sin\frac{\theta}{2}\, \hat{u}_\phi \right), \qquad \hat{u}_\phi = (-\sin\phi, \cos\phi, 0)
\tag{80}
\]
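The value of the integral for $\vec{r} = (x, y, z)$:
\[
\int_{S^2} \rho_{\vec{d}}(\theta, \phi)\, \frac{\sin\theta\, d\theta\, d\phi}{2\pi} = \begin{pmatrix} 1 & x + iy \\ x - iy & 1 \end{pmatrix}
\tag{81}
\]
shows that the resolution of the unity is achieved with $\vec{d} = d\, \hat{k}$, 0 ≤ d ≤ 1 only. Then, it is clear that:
\[
\rho_{d\hat{k}}(\theta, \phi) = \rho_{\vec{r}} = \frac{1}{2} \begin{pmatrix} 1 + r \cos\theta & r \sin\theta\, e^{i\phi} \\ r \sin\theta\, e^{-i\phi} & 1 - r \cos\theta \end{pmatrix}, \qquad d = \|\vec{r}\| \equiv r
\tag{82}
\]
It is with this strong restriction and the simplified notation:
\[
\rho_{d\hat{k}}(\theta, \phi) \equiv \rho_r(\theta, \phi)
\tag{83}
\]
that we go forward to the next calculations with the resolution of the unity:
\[
\int_{S^2} \rho_r(\theta, \phi)\, \frac{\sin\theta\, d\theta\, d\phi}{2\pi} = I
\tag{84}
\]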
Note that the resolution of the identity with the SU(2) transport of a generic density operator
Equation (75) is possible only if we integrate on the whole group, as was done in [9].
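The $S^2$-labeled family of probability distributions on $(S^2, \sin\theta\, d\theta\, d\phi / 2\pi)$ reads:
\[
p_{\theta_0,\phi_0}(\theta, \phi) = \mathrm{tr}\left( \rho_r(\theta_0, \phi_0)\, \rho_r(\theta, \phi) \right)
= \frac{1}{2} \left( 1 + r^2\, \hat{r}_0 \cdot \hat{r} \right)
= \frac{1}{2} \left( 1 + r^2 \left( \cos\theta_0 \cos\theta + \sin\theta_0 \sin\theta \cos(\phi_0 - \phi) \right) \right)
\tag{85}
\]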
At r = 0, we get the uniform probability on the sphere, whereas at r = 1, we get the probability
distribution corresponding to the spin 1/2 CS Equation (77):
pθ0 ,φ0 (θ, φ) = |hθ0 , φ0 |θ, φi|2 (86)
Like for the unit circle, the parameter r can be viewed as the inverse of a “noise” temperature r ∝ 1/T .
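The pseudo-distance on $S^2$ associated with Equation (85) is given by:
\[
\delta_r^2\left( (\theta, \phi), (\theta', \phi') \right) = -\ln \frac{1 + r^2 \left( \cos\theta \cos\theta' + \sin\theta \sin\theta' \cos(\phi - \phi') \right)}{1 + r^2}
\tag{87}
\]
which reduces at small θ − θ′ and φ′ − φ to:
\[
\delta_r\left( (\theta, \phi), (\theta', \phi') \right) \approx \frac{r}{\sqrt{1 + r^2}} \sqrt{ (\theta - \theta')^2 + (\phi - \phi')^2 \sin^2 \frac{\theta + \theta'}{2} }
\tag{88}
\]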
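The distance $d_{HS}$ reads:
\[
d_{r;HS}(\theta, \theta') = \sqrt{\mathrm{tr}\left( \rho_r(\theta, \phi) - \rho_r(\theta', \phi') \right)^2} = \frac{r}{\sqrt{2}} \|\hat{r} - \hat{r}'\| = \frac{1}{\sqrt{2}} \|\vec{r} - \vec{r}\,'\|
\tag{89}
\]
which is the usual distance on the sphere with radius r issued from the Euclidean one. The quantization of a function (or distribution) f(θ, φ) on the sphere based on Equation (84) leads to the 2 × 2 matrix operator:
\[
f \mapsto A_f = \int_{S^2} f(\theta, \phi)\, \rho_r(\theta, \phi)\, \frac{\sin\theta\, d\theta\, d\phi}{2\pi}
= \begin{pmatrix}
\langle f \rangle + r\, C_c^{S^2}(f) & r\, C_s^{S^2}(f) \\
r \left( C_s^{S^2}(f) \right)^{*} & \langle f \rangle - r\, C_c^{S^2}(f)
\end{pmatrix}
\tag{90}
\]
where $\langle f \rangle := \frac{1}{4\pi} \int_{S^2} f(\theta, \phi) \sin\theta\, d\theta\, d\phi$ is the average of f on the unit sphere and $C_c^{S^2}$ and $C_s^{S^2}$ are Fourier coefficients of f on the sphere defined as:
\[
C_c^{S^2}(f) = \frac{1}{4\pi} \int_{S^2} f(\theta, \phi) \cos\theta \sin\theta\, d\theta\, d\phi, \qquad
C_s^{S^2}(f) = \frac{1}{4\pi} \int_{S^2} f(\theta, \phi)\, e^{i\phi} \sin^2\theta\, d\theta\, d\phi
\tag{91}
\]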
Since the sphere is a phase space with canonical coordinates q ≡ φ, p ≡ cos θ and dq dp = sin θ dθ dφ, the latter may be thought of as the simplest functions to be quantized. We find for the quantization of q:
\[
A_q = \pi \begin{pmatrix} 1 & -i \frac{r}{4} \\ i \frac{r}{4} & 1 \end{pmatrix} = \pi + \frac{\pi r}{4} \sigma_2
\tag{92}
\]
Its eigenvalues are $\pi \pm \frac{\pi r}{4}$ with corresponding eigenvectors $\begin{pmatrix} 1 \\ \pm i \end{pmatrix}$. Its lower symbol is given by the smooth function:
\[
\check{q}(\theta, \phi) = \pi - \frac{\pi r^2}{4} \sin\theta \sin\phi
\tag{93}
\]
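The quantization of p yields the diagonal matrix:
\[
A_p = \frac{r}{3} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \frac{r}{3} \sigma_3
\tag{94}
\]
with immediate eigenvalues $\pm \frac{r}{3}$ and lower symbol $\check{p}(\theta, \phi) = \mathrm{tr}\left( \rho_r(\theta, \phi) A_p \right)$:
\[
\check{p}(\theta, \phi) = \frac{r^2}{3} \cos\theta
\tag{95}
\]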
Finally, we note the commutation rule:
\[
[A_q, A_p] = i\, \frac{\pi r^2}{6}\, \sigma_1
\tag{96}
\]
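The sphere formulas above lend themselves to a direct numerical check. The sketch below, in plain Python, approximates the integral of Equation (90) with a midpoint rule and compares the result with Equations (84), (92), (94) and (96). The function names `rho`, `quantize` and `matmul`, the grid size and the sample value r = 0.7 are illustrative assumptions, not part of the original text.

```python
import cmath
import math

def rho(r, theta, phi):
    """2x2 density matrix of Equation (82), as nested lists."""
    off = r * math.sin(theta) * cmath.exp(1j * phi)
    return [[0.5 * (1 + r * math.cos(theta)), 0.5 * off],
            [0.5 * off.conjugate(), 0.5 * (1 - r * math.cos(theta))]]

def quantize(f, r, n=200):
    """Midpoint-rule approximation of A_f in Equation (90)."""
    dth, dph = math.pi / n, 2 * math.pi / n
    A = [[0j, 0j], [0j, 0j]]
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(n):
            ph = (j + 0.5) * dph
            # weight f(x) dnu(x) with dnu = sin(theta) dtheta dphi / (2 pi)
            w = f(th, ph) * math.sin(th) * dth * dph / (2 * math.pi)
            m = rho(r, th, ph)
            for a in range(2):
                for b in range(2):
                    A[a][b] += w * m[a][b]
    return A

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[a][k] * Y[k][b] for k in range(2)) for b in range(2)]
            for a in range(2)]

r = 0.7  # hypothetical sample value, 0 <= r <= 1
identity = quantize(lambda th, ph: 1.0, r)        # Equation (84)
A_q = quantize(lambda th, ph: ph, r)              # q = phi, Equation (92)
A_p = quantize(lambda th, ph: math.cos(th), r)    # p = cos(theta), Equation (94)
QP, PQ = matmul(A_q, A_p), matmul(A_p, A_q)
commutator = [[QP[a][b] - PQ[a][b] for b in range(2)] for a in range(2)]
```

Within the quadrature error, `identity` is the unit matrix, `A_q` has diagonal entries π and off-diagonal entries ∓iπr/4, `A_p` is diagonal with entries ±r/3, and the upper-right entry of `commutator` is iπr²/6.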
12. The Example of the Plane
The measure set is the Euclidean plane (or complex plane) equipped with its uniform
(Lebesgue) measure:
\[
X = \mathbb{R}^2 \sim \mathbb{C}, \qquad d\nu(x) = \frac{d^2 z}{\pi} = \frac{dq\, dp}{2\pi}, \qquad z = \frac{q + ip}{\sqrt{2}} \in \mathbb{C}
\tag{97}
\]
The group G is the Weyl–Heisenberg group GWH = {(ς, z) , ς ∈ R , z ∈ C} with multiplication law:
(ς, z)(ς ′ , z ′ ) = (ς + ς ′ + Im(z z¯′ ), z + z ′ ) (98)
In this group context, the plane C is viewed as the coset X = GWH /C ∼ C, where C is the center in
the group C = {(ς, 0) , ς ∈ R}. Let H be a separable (complex) Hilbert space with orthonormal basis
$e_0, e_1, \ldots, e_n \equiv |e_n\rangle, \ldots$. Let us suppose that the basis element $|e_n\rangle$ is a state for n excitations of a harmonic system, e.g., a Fock number state $|n\rangle$ for the quantum electromagnetic field with single-mode photons and for which $X = \mathbb{R}^2$ is the plane of quadratures. Given an elementary quantum energy, say ħω, and a temperature T (e.g., a noise one, like in electronics), a Boltzmann–Planck T-dependent density operator, i.e., thermal state [29], is introduced as:
\[
\rho_T = \left( 1 - e^{-\frac{\hbar\omega}{k_B T}} \right) \sum_{n=0}^{\infty} e^{-\frac{n \hbar\omega}{k_B T}}\, |e_n\rangle \langle e_n|
\tag{99}
\]
We notice that at zero temperature this operator reduces to the projector on the first basis element
(“ground state” or “vacuum”):
ρ0 = |e0 ihe0 | (100)
On the other hand, at a high temperature or, equivalently, in the classical limit $k_B T \gg \hbar\omega$, and from a classical probability point of view, one obtains the Rice probability density function [29]. This Rice distribution is also obtained in an analogous fashion in a classical optics context (classical, but probabilistic), “a constant phasor plus a random phasor sum”, which one may take to be the classical version of the quantum “oscillator with a coherent signal superimposed on thermal noise” (see the classical probabilistic description in [30]).
Introducing lowering and raising operators a and a†:
\[
a |e_n\rangle = \sqrt{n}\, |e_{n-1}\rangle, \qquad a |e_0\rangle = 0, \qquad a^\dagger |e_n\rangle = \sqrt{n+1}\, |e_{n+1}\rangle
\tag{101}
\]
which obey the canonical commutation rule:
\[
[a, a^\dagger] = I
\tag{102}
\]
We obtain the number operator, N = a†a, whose spectrum is $\mathbb{N}$, with corresponding eigenvectors as the basis elements, $N |e_n\rangle = n |e_n\rangle$. Having in hand these two operators, we build a unitary irreducible representation of the Weyl–Heisenberg group through the map:
\[
G_{WH}/C \sim \mathbb{C} \ni z \mapsto D(z) = e^{z a^\dagger - \bar{z} a}, \qquad D(-z) = (D(z))^{-1} = D(z)^\dagger
\tag{103}
\]
and the composition law:
\[
D(z) D(z') = e^{\frac{1}{2}(z \bar{z}' - \bar{z} z')}\, D(z + z') = e^{(z \bar{z}' - \bar{z} z')}\, D(z') D(z)
\tag{104}
\]
which show that the map z ↦ D(z) is a projective unitary representation of the abelian group $\mathbb{C}$. Then, one easily derives from the Schur lemma, or directly, that the following family of displaced operators resolves the identity:
\[
\rho_T(z) := D(z)\, \rho_T\, D(z)^\dagger
= (1 - t) \sum_{m,m'} \left[ \sum_n t^n\, D_{nm}(z)\, D_{m'n}(-z) \right] |e_m\rangle \langle e_{m'}|, \qquad t = e^{-\frac{\hbar\omega}{k_B T}}
\tag{105}
\]
where the matrix elements $D_{mn}(z)$ of the operator D(z) are given in terms of associated Laguerre polynomials $L_n^{(\alpha)}(t)$ [29],
\[
\langle e_m | D(z) | e_n \rangle := D_{mn}(z) = \left( D_{nm}(-z) \right)^{*} = \sqrt{\frac{n!}{m!}}\, e^{-|z|^2/2}\, z^{m-n}\, L_n^{(m-n)}(|z|^2), \qquad \text{for } m \ge n
\tag{106}
\]
with $L_n^{(m-n)}(t) = \frac{m!}{n!} (-t)^{n-m} L_m^{(n-m)}(t)$ for n ≥ m. With these properties, Equation (105) reads more explicitly as:
\[
\rho_T(z) = \rho_T + (1 - t) \sum_{m \ne m'} \left[ \sum_n t^n\, D_{nm}(z)\, D_{m'n}(-z) \right] |e_m\rangle \langle e_{m'}|
\]
The resolution of the identity follows from the results given in Section 9:
\[
\int_{\mathbb{C}} \rho_T(z)\, \frac{d^2 z}{\pi} = I
\tag{107}
\]
More general constructions and results are given in [5]. At zero temperature, we recover the standard (Schrödinger, Klauder, Glauber, Sudarshan) coherent states [31]:
\[
\rho_0(z) := |z\rangle \langle z|, \qquad |z\rangle = D(z) |e_0\rangle
\tag{108}
\]
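The matrix elements of Equation (106) and the composition law of Equation (104) can be cross-checked numerically. The following plain-Python sketch evaluates $D_{mn}(z)$ from the Laguerre formula and compares a truncated matrix product with the right-hand side of Equation (104). The names `genlaguerre` and `D`, the truncation K and the sample points z, w are illustrative assumptions.

```python
import cmath
import math

def genlaguerre(n, alpha, x):
    """Associated Laguerre polynomial L_n^{(alpha)}(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    prev, cur = 1.0, 1.0 + alpha - x
    for k in range(1, n):
        prev, cur = cur, ((2 * k + 1 + alpha - x) * cur - (k + alpha) * prev) / (k + 1)
    return cur

def D(m, n, z):
    """Matrix element <e_m|D(z)|e_n> following Equation (106)."""
    if m < n:
        # D_mn(z) = conj(D_nm(-z)), the first equality in Equation (106)
        return D(n, m, -z).conjugate()
    x = abs(z) ** 2
    pref = math.sqrt(math.factorial(n) / math.factorial(m))
    return pref * math.exp(-x / 2) * z ** (m - n) * genlaguerre(n, m - n, x)

# Hypothetical sample points and truncation of the intermediate sum.
z, w = 0.3 + 0.2j, -0.1 + 0.4j
K = 40
lhs = sum(D(2, k, z) * D(k, 1, w) for k in range(K))        # <e_2|D(z)D(w)|e_1>
phase = cmath.exp(0.5 * (z * w.conjugate() - z.conjugate() * w))
rhs = phase * D(2, 1, z + w)                                 # Equation (104)

# Coherent-state amplitude <e_3|z> from Equation (108)
amplitude = D(3, 0, z)
expected_amplitude = math.exp(-abs(z) ** 2 / 2) * z ** 3 / math.sqrt(6)
```

For small |z| and |w| the truncated sum converges quickly, so `lhs` and `rhs` agree to rounding error; the last two lines recover the familiar amplitudes $e^{-|z|^2/2} z^n / \sqrt{n!}$ of the standard coherent states.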
Let us evaluate the probability distribution $p_{z_0;T}(z)$ issued from $\rho_T(z)$. The expression of $p_{z_0;T}(z)$ is rather elaborate:
\[
p_{z_0;T}(z) = \mathrm{tr}\left( \rho_T(z_0)\, \rho_T(z) \right) = (1 - t)^2\, e^{-|z - z_0|^2} \times
\left[ \sum_n t^{2n} \left( L_n^{(0)}(|z - z_0|^2) \right)^2
+ 2 \sum_{n' > n} t^{n + n'}\, \frac{n!}{n'!}\, |z - z_0|^{2(n' - n)} \left( L_n^{(n' - n)}(|z - z_0|^2) \right)^2 \right]
\tag{109}
\]
The first term in the sum can be given a compact form [32] (warning: there are errors in the Poisson generating function for Laguerre polynomials given there; the correct formula is found in WikiLaguerre):
\[
\sum_n t^{2n} \left( L_n^{(0)}(|z - z_0|^2) \right)^2 = \frac{e^{-\frac{2 t^2}{1 - t^2} |z - z_0|^2}}{1 - t^2}\, I_0\!\left( \frac{2 t\, |z - z_0|^2}{1 - t^2} \right)
\tag{110}
\]
where $I_0$ is a modified Bessel function. At $z = z_0$, Equation (109) reduces to:
\[
p_{z_0;T}(z_0) = \mathrm{tr}\, \rho_T^2(z_0) = \frac{1 - t}{1 + t}
\tag{111}
\]
As expected, at zero temperature, this quantity is equal to one. It vanishes at infinite temperature. The
pseudo-distance Equation (11) takes the form:
δ(z0 , z) = |z − z0 | + nT (|z − z0 |) (112)
where the T -dependent nT goes to zero as T → 0. It is only in the limit CS case that this quantity
acquires its true Euclidean distance meaning. As for dHS , we get:
\[
d_{T;HS}(z, z') = \sqrt{\mathrm{tr}\left( \rho_T(z) - \rho_T(z') \right)^2} = \sqrt{2}\, \sqrt{ \frac{1 - t}{1 + t} - p_{z;T}(z') }
\tag{113}
\]
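The closed form of Equation (110) and the value of Equation (111) can be verified numerically with a few lines of plain Python. In this sketch the helper names `laguerre`, `bessel_i0`, `series_side` and `closed_side`, the truncation orders and the sample values of t and x = |z − z0|² are illustrative assumptions.

```python
import math

def laguerre(n, x):
    """Laguerre polynomial L_n^{(0)}(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    prev, cur = 1.0, 1.0 - x
    for k in range(1, n):
        prev, cur = cur, ((2 * k + 1 - x) * cur - k * prev) / (k + 1)
    return cur

def bessel_i0(x, terms=60):
    """Modified Bessel function I_0(x) from its power series."""
    return sum((x / 2) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))

def series_side(t, x, nmax=200):
    """Left-hand side of Equation (110), truncated at nmax terms."""
    return sum(t ** (2 * n) * laguerre(n, x) ** 2 for n in range(nmax))

def closed_side(t, x):
    """Right-hand side of Equation (110)."""
    u = 1 - t * t
    return math.exp(-2 * t * t * x / u) / u * bessel_i0(2 * t * x / u)

t, x = 0.4, 1.5                      # hypothetical sample values, 0 <= t < 1
left, right = series_side(t, x), closed_side(t, x)

# At z = z0 (x = 0) only the first sum of Equation (109) survives.
purity = (1 - t) ** 2 * series_side(t, 0.0)
```

The two sides of Equation (110) agree to rounding error, and `purity` reproduces (1 − t)/(1 + t) from Equation (111).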
The quantization map based on ρT (z) is given by:
\[
f \mapsto A_f = \int_{\mathbb{C}} \rho_T(z)\, f(z)\, \frac{d^2 z}{\pi}
\tag{114}
\]
There are translational and rotational covariances. Covariance w.r.t. complex translations reads as:
\[
A_{f(z - z_0)} = D(z_0)\, A_{f(z)}\, D(z_0)^\dagger
\tag{115}
\]
To show rotational covariance, we first define the unitary representation $\theta \mapsto U_T(\theta)$ of the torus $S^1$ on the Hilbert space $\mathcal{H}$ as the diagonal operator:
\[
U_T(\theta) |e_n\rangle = e^{i(n + \nu)\theta} |e_n\rangle
\tag{116}
\]
where ν is an arbitrary real number. Then, from the matrix elements of D(z), one easily proves the rotational covariance property:
\[
U_T(\theta)\, D(z)\, U_T(\theta)^\dagger = D\left( e^{i\theta} z \right)
\tag{117}
\]
From the diagonal nature of $\rho_T$, we derive the covariance of $A_f$ w.r.t. complex rotations in the plane,
\[
U_T(\theta)\, A_f\, U_T(-\theta) = A_{\varrho(\theta) f}
\tag{118}
\]
where $\varrho(\theta) f(z) := f\left( e^{-i\theta} z \right)$. In particular, for the parity operator defined by:
\[
\mathsf{P} = \sum_{n=0}^{\infty} (-1)^n\, |e_n\rangle \langle e_n|
\tag{119}
\]
we have:
\[
A_{f(-z)} = \mathsf{P}\, A_{f(z)}\, \mathsf{P}, \qquad \forall f
\tag{120}
\]
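A covariance also holds for the conjugation operator:
\[
A_{\overline{f(z)}} = A_{f(z)}^\dagger, \qquad \forall f
\tag{121}
\]
The canonical commutation rule is a T-independent outcome of the above quantization:
\[
A_z = a, \qquad A_{\bar{z}} = a^\dagger
\tag{122}
\]
Equivalently, with $z = (q + ip)/\sqrt{2}$:
\[
A_q = \frac{1}{\sqrt{2}} \left( a + a^\dagger \right) \equiv Q, \qquad A_p = \frac{1}{\sqrt{2}\, i} \left( a - a^\dagger \right) \equiv P
\tag{123}
\]
From this, their commutator is canonical:
\[
A_q A_p - A_p A_q = i \left[ a, a^\dagger \right] = i I
\tag{124}
\]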
We now turn our attention to the simple quadratic expressions:
\[
A_{q^2} = Q^2 - \frac{s}{2}, \qquad A_{p^2} = P^2 - \frac{s}{2}
\tag{125}
\]
where $s := -\coth \frac{\hbar\omega}{2 k_B T}$. It follows that:
\[
A_{|z|^2} = a^\dagger a + \frac{1 - s}{2}
\tag{126}
\]
where $|z|^2$ is the energy (in appropriate units) for the harmonic oscillator. The difference between the ground state energy $E_0 = (1 - s)/2$ and the minimum of the quantum potential energy $E_m = [\min(A_{q^2}) + \min(A_{p^2})]/2 = -s/2$ is independent of the temperature, namely $E_0 - E_m = 1/2$ (experimentally verified in 1925). It has been proven in [11] (at least in the CS case) that these constant shifts in energy are inaccessible to measurement.
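As a short consistency check, and assuming the reading $s = -\coth(\hbar\omega / 2 k_B T)$ of the definition above together with $t = e^{-\hbar\omega / k_B T}$ from Equation (105), the shifts can be rewritten in terms of t:
\[
\coth \frac{\hbar\omega}{2 k_B T} = \frac{1 + t}{1 - t}
\quad \Longrightarrow \quad
E_0 = \frac{1 - s}{2} = \frac{1}{1 - t} = 1 + \frac{t}{1 - t},
\qquad
E_m = -\frac{s}{2} = \frac{1}{2} + \frac{t}{1 - t}
\]
Here $t/(1 - t) = \mathrm{tr}(\rho_T N)$ is the mean excitation number in the thermal state of Equation (99). At zero temperature (t = 0), the two shifts reduce to 1 and 1/2, and the difference $E_0 - E_m = 1/2$ indeed holds for every t.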
We now turn our attention to the quantization of the angle or phase. We write $z = \sqrt{J}\, e^{i\gamma}$ in action-angle (J, γ) notations for the harmonic oscillator. The quantization of a function f(J, γ) of the action $J \in \mathbb{R}^+$ and of the angle γ = arg(z) ∈ [0, 2π), which is 2π-periodic in γ, yields formally the operator:
\[
A_f = \int_0^{+\infty} dJ \int_0^{2\pi} \frac{d\gamma}{2\pi}\, f(J, \gamma)\, \rho_T\!\left( \sqrt{J}\, e^{i\gamma} \right)
\tag{127}
\]
The angular covariance property takes the form:
\[
U_T(\theta)\, A_f\, U_T(-\theta) = A_{T(\theta) f}, \qquad T(\theta) f(J, \gamma) := f(J, \gamma - \theta)
\tag{128}
\]
In particular, let us quantize the discontinuous 2π-periodic angle function $\gimel(\gamma) = \gamma$ for γ ∈ [0, 2π). Since this angle function is real and bounded, its quantum counterpart $A_{\gimel}$ is a bounded self-adjoint operator, and it is covariant according to Equation (128). In the basis $|e_n\rangle$, it is given by the infinite matrix:
\[
A_{\gimel} = \pi\, 1_{\mathcal{H}} + i \sum_{m \ne m'} F_{mm'}(t)\, \frac{1}{m' - m}\, |e_m\rangle \langle e_{m'}|
\tag{129}
\]
where:
\[
F_{mm'}(t) = (1 - t)\, \frac{\Gamma\!\left( \frac{m + m'}{2} + 1 \right)}{\sqrt{m!\, m'!}}\, (1 - t)^{\frac{m' - m}{2}}\; {}_2F_1\!\left( -m, \frac{m' - m}{2}; -\frac{m + m'}{2}; t \right)
\tag{130}
\]
is symmetric w.r.t. the permutation of m and m′ (from the well-known ${}_2F_1(a, b; c; x) = (1 - x)^{c - a - b}\, {}_2F_1(c - a, c - b; c; x)$).
This operator has a spectral measure with support [0, 2π]. For a detailed study of such an operator in
the CS case (T = 0 = t), see [5].
13. The Example of the Half-Plane
The measure set is the half plane equipped with its uniform (Lebesgue) measure:
\[
X = \mathbb{R}_*^+ \times \mathbb{R} \equiv \Pi_+, \qquad d\nu(x) = dq\, dp, \qquad q \in (0, +\infty), \quad p \in \mathbb{R}
\tag{131}
\]
Together with the multiplication $(q, p)(q_0, p_0) = (q q_0, p_0/q + p)$, $q \in \mathbb{R}_*^+$, $p \in \mathbb{R}$, $\Pi_+$ is viewed as the affine group $\mathrm{Aff}_+(\mathbb{R})$ of the real line. $\mathrm{Aff}_+(\mathbb{R})$ has two non-equivalent UIRs [33,34]. Both are square integrable, and this is the rationale behind continuous wavelet analysis (see the references in [6]). The UIR $U_+ \equiv U$ is realized in the Hilbert space $\mathcal{H} = L^2(\mathbb{R}_*^+, dx)$:
\[
U(q, p)\, \psi(x) = \frac{e^{ipx}}{\sqrt{q}}\, \psi(x/q)
\tag{132}
\]
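Along the same Boltzmann–Planck line as for the plane, we build the temperature-dependent density operator:
\[
\rho_T = (1 - t) \sum_{n=0}^{\infty} t^n\, |e_n\rangle \langle e_n|, \qquad t = e^{-\frac{\hbar\omega}{k_B T}}
\tag{133}
\]
where $\{ |e_n\rangle \mid n \in \mathbb{N} \}$ is an orthonormal basis of $\mathcal{H}$. Let us choose the one built from Laguerre polynomials:
\[
e_n \leftrightarrow e_n(x) = \sqrt{\frac{n!}{\Gamma(n + \alpha + 1)}}\; e^{-\frac{x}{2}}\, x^{\frac{\alpha}{2}}\, L_n^{(\alpha)}(x), \qquad \int_0^{\infty} e_n(x)\, e_{n'}(x)\, dx = \delta_{nn'}
\tag{134}
\]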
where α > −1 is a free parameter. Then, from [32], the operator $\rho_T$ acts on $\mathcal{H} = L^2(\mathbb{R}_*^+, dx)$ as the integral transform:
\[
\rho_T : \psi(x) \mapsto \rho_T(\psi)(x) = \int_0^{\infty} K_T(x, y)\, \psi(y)\, dy
\tag{135}
\]
where the integral kernel is given by:
\[
K_T(x, y) = t^{-\alpha/2}\, e^{-\frac{1}{2} \frac{1 + t}{1 - t} (x + y)}\, I_\alpha\!\left( \frac{2 \sqrt{t x y}}{1 - t} \right)
\tag{136}
\]
Again, one derives from the Schur lemma that the transported operators:
\[
\rho_T(q, p) := U(q, p)\, \rho_T\, U(q, p)^\dagger
\tag{137}
\]
resolve the identity:
\[
\int_{\Pi_+} \rho_T(q, p)\, \frac{dq\, dp}{c_\rho} = I
\tag{138}
\]
where the constant $c_\rho$ is obtained from the integral through standard calculations in wavelet theory:
\[
c_\rho = \int_{\Pi_+} \langle e_0 | \rho_T(q, p) | e_0 \rangle\, dq\, dp = \int_{\Pi_+} |\langle e_0 | U(q, p) | e_0 \rangle|^2\, dq\, dp
= 2\pi \int_0^{\infty} \frac{dx}{x}\, (e_0(x))^2 = \frac{2\pi}{\alpha}
\tag{139}
\]
The resolution of the identity imposes the painless restriction α > 0 and reads finally:
\[
\alpha \int_{\Pi_+} \rho_T(q, p)\, \frac{dq\, dp}{2\pi} = I
\tag{140}
\]
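For the reader's convenience, the last step of Equation (139) can be made explicit. Using only the n = 0 element of the basis in Equation (134), for which $L_0^{(\alpha)} = 1$, one has $e_0(x) = x^{\alpha/2} e^{-x/2} / \sqrt{\Gamma(\alpha + 1)}$, so that:
\[
\int_0^{\infty} \frac{dx}{x}\, (e_0(x))^2 = \frac{1}{\Gamma(\alpha + 1)} \int_0^{\infty} x^{\alpha - 1} e^{-x}\, dx = \frac{\Gamma(\alpha)}{\Gamma(\alpha + 1)} = \frac{1}{\alpha}
\]
The integral converges at x = 0 only for α > 0, which is where the restriction on α comes from.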
We leave the main results of the corresponding quantization to a future publication.
14. Conclusions
(1) About POVM Formalism(s)
In this first part of the Conclusions, we comment briefly on the relation between formalisms based on POVMs, regardless of whether they are used in a quantization context, as here, in quantum measurement or in statistical inference. Full developments will be the subject of a separate paper.
The inference process may briefly be described as follows. We start from a context that includes a
source of data, which would be modeled by the use of a family of probability distributions indexed by
some parameter(s) of theoretical interest lying in the space (X, ν) for the system being studied. Thus, we
postulate a context that includes an experiment (actual or virtual), which requires probabilistic modeling,
a so-called random experiment. The probability model refers to possible results before performing the
experiment. One might conceive of the elements of X as those of primary interest upon which inference
will be performed by virtue of a related secondary device, which serves as a source of data. After
observing the results, one has data in hand with which to obtain an inferred probability distribution over
a σ-field of sets in X, which devolve from a POV measurement.
A property to note is that the probability distribution modeling the random experiment has a so-called
frequency interpretation that one can conceive of by (hypothetically) repeating the experiment in order to
generate a “population” or “ensemble” consisting of realized results, but not for the POV-related inferred
probability distribution.
In the inference situation, if one can obtain a probability model for the random experiment using a
PV measurement and coherent states, or their generalization described in this paper, then their resolution
of the identity property along with previously observed data provides us with an inferred probability
distribution on the parameter(s) of interest. Examples are given in [35,36]. Furthermore, note the role
played by covariant measurement as explicated in Holevo [27] and Busch, Grabowski and Lahti [26].
The POV measure is generally conceived of as an attribute of quantum physics in contrast to classical
physics. In other words, the POV measure is considered to be a generalization of the PV measure, which
it is, of course, mathematically, and which is necessary for quantum theory.
However, here, we see that POV measurement occurs also as an attribute of statistical inference when
one has a probabilistic model, whether classical or quantum.
Note that, alternatively, if one has a deterministic model for the physics, none of the above applies.
One has a direct route, provided by the theory, from data to parameter(s) of interest. No need to make
much of a distinction. Let us take a simple example from medicine and biology. Small amounts of
dopamine obtained from brain tissue may be measured by preparing a fluorescent derivative. In order to
connect the fluorescent measurement with the amount of dopamine, one can run “standards”. This is not
a problem, as long as there is a deterministic connection between the two.
The problem of inference comes up when we have probabilistic modeling rather than deterministic
modeling. In that sense, one may say it is quantum rather than classical; except, as we all know, there
are classical contexts in which we also need to use a probabilistic model, in which case, the inferred
distribution would also involve POV measurements.
(2) About Measurement(s)
Confusion might arise because the word “measurement” is used in more than one way.
Consider a system under study which is to be described in terms of a “mathematical model”.
This would include “observables”, which are associated with certain properties of the system that we
“measure”. However, rarely can we measure them directly. Ordinarily, we actually measure (in the
conventional sense) some related secondary system for which we can obtain experimental data. Then,
the problem arises of how to relate the experimental data to the observables of the system of interest.
Further, we conceive of experiments (actual or virtual) as being repeatable, if even only hypothetically
(thus enters relativity and covariant measurements).
As mentioned above, if the relationship is conceived of as being deterministic (which is usually
associated with classical physics), there is usually no problem. Now, what if the model is probabilistic?
Then, we have two probability models. One refers to the actual experiment in relation to some secondary
system related to the one in which we are interested. That would be a family of probability distributions
indexed by parameter(s) that describe the “unknown” property of interest. (Think of tossing a coin to estimate by experiment the property p of the coin, where p is the chance of getting heads on one toss. It is this property in which we are interested, but which we cannot “measure” directly: is it a “fair” coin or not? With a probabilistic model, we cannot get a definite answer. What we can get is the odds: an inferred probability distribution on the space X = [0, 1], in this case, from data that we get by tossing the coin. The probability model for the experiment is the binomial distribution with “unknown” parameter p. The inferred probability distribution for p is the beta distribution, where the observed proportion of heads occurs in the formula.) The point is that the probabilities related to the experiment (secondary system) have a so-called “frequency interpretation”, meaning that we can generate an “ensemble” via repeats of the experiment. However, the inferred probabilities do not. There is no experiment directly related to them. We have an axiomatic definition of probability, but no frequency interpretation for it. That is the nature of probabilistic inference. That is why we have POVM in regard to inferred probability distributions and PV measures for possible experimental results of so-called “random experiments”.
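The coin example can be illustrated with a small computation. The sketch below is only one possible reading: it assumes a uniform prior on p ∈ [0, 1], so that the inferred distribution after observing `heads` successes in `tosses` trials is Beta(heads + 1, tosses − heads + 1). The choice of prior, the function names and the sample data are assumptions made for illustration and are not specified in the text (the group-invariant priors of [35,36] may differ).

```python
import math

def beta_posterior_pdf(p, heads, tosses):
    """Density of Beta(heads + 1, tosses - heads + 1) at 0 < p < 1 (uniform prior assumed)."""
    a, b = heads + 1, tosses - heads + 1
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

def midpoint_integral(g, n=2000):
    """Midpoint-rule integral of g over (0, 1)."""
    h = 1.0 / n
    return sum(g((k + 0.5) * h) for k in range(n)) * h

heads, tosses = 7, 10   # hypothetical data: 7 heads in 10 tosses
total = midpoint_integral(lambda p: beta_posterior_pdf(p, heads, tosses))
mean = midpoint_integral(lambda p: p * beta_posterior_pdf(p, heads, tosses))
```

The inferred density integrates to one over X = [0, 1], and under the assumed prior its mean is (heads + 1)/(tosses + 2), in which the observed proportion of heads enters explicitly.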
Now consider theoretical models of physical systems for which there is no direct experimental
background. Then, if it is classical and also probabilistic or if it is quantum and, so, necessarily
probabilistic, then the consequent probabilities have no frequency interpretation and are derived from
POVMs. In modeling, sometimes, it is the state that describes the system, and then, an observable gives
us the probability model via expectation. In our paper, it is the POVM quantization that describes the
system and, at the same time, gives a probability model via expectation.
(3) Various Uses for POVMs
In summary, we have designated three views of POVM quantization formulas. One relates to the
process of quantization itself, another to theoretical modeling and another to inference.
These various uses for POVMs are inter-related, so that it is not always appropriate to separate them.
In the paper, we see that these inter-relationships are revealed and discussed. Roughly speaking, we
have quantization discussed in Sections 2, 5, 7 and 8, theoretical modeling in Sections 3, 4, 5, 8 and 9,
inference in Sections 4, 6, 8 and 9, and these Conclusions, along with examples. Understandably, there
is much overlap between theoretical modeling and inference.
Do POVMs have a fundamental role in quantum theory? Yes. How do POVMs arise? They are used
to describe a quantum system probabilistically and also in performing statistical inference. As usual,
quantization is important in cases where there is a classical analogy. Examples are given.
Acknowledgments
Jean Pierre Gazeau thanks the CNPq for financial support; the World Academy of Sciences and
the International Centre for Theoretical Physics (TWAS-ICTP), Trieste, and the Centro Brasileiro de
Pesquisas Físicas (CBPF), Rio de Janeiro, for hospitality and support.
Author Contributions
The respective contributions of Jean Pierre Gazeau and Barbara Heller to the content of this article
are equal in importance.
Appendix
A.1. Parametrizations of 2 × 2 Real Density Matrices
There are various expressions for a density matrix acting on the Euclidean plane, i.e., a 2 × 2 real positive matrix with trace equal to one. The most immediate one is the following with parameters a and b:
\[
\rho := M(a, b) = \begin{pmatrix} a & b \\ b & 1 - a \end{pmatrix}, \qquad 0 \le a \le 1, \qquad \Delta := \det \rho = a(1 - a) - b^2 \ge 0
\tag{A1}
\]
The above inequalities imply the following ones:
\[
0 \le a(1 - a) \le \frac{1}{4}, \qquad 0 \le \Delta \le \frac{1}{4}, \qquad -\frac{1}{2} \le b \le \frac{1}{2}
\tag{A2}
\]
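Let:
\[
\frac{1}{2} \le \lambda = \frac{1}{2} \left( 1 + \sqrt{1 - 4\Delta} \right) \le 1
\tag{A3}
\]
be the highest eigenvalue of ρ (the lowest one is 0 ≤ 1 − λ ≤ 1/2). The spectral decomposition of ρ reads as:
\[
\rho = \lambda\, |\phi\rangle \langle \phi| + (1 - \lambda) \left| \phi + \frac{\pi}{2} \right\rangle \left\langle \phi + \frac{\pi}{2} \right|
\tag{A4}
\]
where:
\[
|\phi\rangle \equiv \begin{pmatrix} \cos\phi \\ \sin\phi \end{pmatrix}, \qquad -\frac{\pi}{2} \le \phi \le \frac{\pi}{2}
\tag{A5}
\]
is the corresponding unit eigenvector, chosen as pointing in the right half-plane. We could have also chosen the opposite $|\phi + \pi\rangle = -|\phi\rangle$ pointing in the left half-plane, since $|\phi + \pi\rangle \langle \phi + \pi| = |\phi\rangle \langle \phi|$. Our choice corresponds to the most immediate one in terms of the orthonormal basis of the plane issued from the canonical one $\{ |0\rangle, |\pi/2\rangle \}$ through the rotation by φ.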
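Let us make explicit the decomposition Equation (A4):
\[
\rho = \begin{pmatrix}
\left( \lambda - \frac{1}{2} \right) \cos(2\phi) + \frac{1}{2} & \left( \lambda - \frac{1}{2} \right) \sin(2\phi) \\
\left( \lambda - \frac{1}{2} \right) \sin(2\phi) & \left( \frac{1}{2} - \lambda \right) \cos(2\phi) + \frac{1}{2}
\end{pmatrix}
\tag{A6}
\]
We derive from this expression the polar parametrization of the (a, b) parameters of ρ:
\[
a - \frac{1}{2} = \left( \lambda - \frac{1}{2} \right) \cos(2\phi), \qquad b = \left( \lambda - \frac{1}{2} \right) \sin(2\phi)
\tag{A7}
\]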
In return, we have the angle φ ∈ [−π/2, π/2] as a function of a and b:
\[
\phi = \begin{cases}
\frac{1}{2} \arctan \frac{b}{a - 1/2}, & -\frac{\pi}{4} \le \phi \le \frac{\pi}{4} \\[4pt]
\frac{1}{2} \arctan \frac{b}{a - 1/2} + \frac{\pi}{4}, & |\phi| \ge \frac{\pi}{4}
\end{cases}
\tag{A8}
\]
In this way, each ρ is univocally (but not biunivocally) determined by a point in the unit disk, with polar coordinates (r := 2λ − 1, Φ := 2φ), 0 ≤ r ≤ 1, −π ≤ Φ < π.
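Furthermore, note the alternative expression issued from Equation (A6):
\[
\rho \equiv R(r, \Phi) = \frac{1}{2} \left( I + r\, R(\Phi)\, \sigma_3 \right) = \frac{1}{2} \left( I + (2\lambda - 1)\, R(\phi)\, \sigma_3\, R(-\phi) \right)
\tag{A9}
\]
where R(Φ) is the rotation matrix in the plane:
\[
R(\Phi) = \begin{pmatrix} \cos\Phi & -\sin\Phi \\ \sin\Phi & \cos\Phi \end{pmatrix}
\tag{A10}
\]
and $\sigma_3$ is the diagonal Pauli matrix:
\[
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\tag{A11}
\]
Note the important property used to get the second equality in Equation (A9):
\[
R(\Phi)\, \sigma_3 = \begin{pmatrix} \cos\Phi & \sin\Phi \\ \sin\Phi & -\cos\Phi \end{pmatrix} = \sigma_3\, R(-\Phi)
\tag{A12}
\]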
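Therefore, the expression of a density matrix to which we refer most often throughout the paper reads:
\[
\rho \equiv R(r, \Phi) = \begin{pmatrix}
\frac{1}{2} + \frac{r}{2} \cos\Phi & \frac{r}{2} \sin\Phi \\
\frac{r}{2} \sin\Phi & \frac{1}{2} - \frac{r}{2} \cos\Phi
\end{pmatrix}
\tag{A13}
\]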
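From Equations (A9) and (A12), we derive the interesting multiplication formula:
\[
\rho \rho' = R(r, \Phi)\, R(r', \Phi') = \frac{1}{2} \left[ R(r, \Phi) + R(r', \Phi') + \frac{r r'}{2}\, R(\Phi - \Phi') - \frac{I}{2} \right]
\tag{A14}
\]
and the resulting (non-closed) “algebra” of real density matrices,
\[
[\rho, \rho'] = -i\, \frac{r r'}{2} \sin(\Phi - \Phi')\, \sigma_2, \qquad
\{ \rho, \rho' \} = \rho + \rho' + \frac{1}{2} \left( r r' \cos(\Phi - \Phi') - 1 \right) I
\tag{A15}
\]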
A.1.1. Covariance
The expression Equation (A13) is convenient to examine the way a density matrix transforms under a rotation R(ω) in the plane. We have:
\[
\rho \equiv R(r, \Phi) \mapsto R(\omega)\, R(r, \Phi)\, R(-\omega) = \frac{1}{2} \left( I + (2\lambda - 1)\, R(\phi + \omega)\, \sigma_3\, R(-\phi - \omega) \right)
= R(r, \Phi + 2\omega) \equiv \rho(\omega)
\tag{A16}
\]
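A.1.2. Integrals of the Density Matrix
The following integrals (two of which are partial or marginal) are straightforward to compute:
\[
\frac{1}{\pi} \int_0^{2\pi} R(r, \theta)\, d\theta = I
\tag{A17}
\]
\[
\frac{1}{\pi} \int_0^{2\pi} \rho(\omega)\, d\omega = \frac{1}{\pi} \int_0^{2\pi} R(r, \theta + 2\omega)\, d\omega = I
\tag{A18}
\]
\[
\int_0^1 R(r, \theta)\, r\, dr = \frac{1}{3} R(1, \theta) + \frac{1}{12} I
\tag{A19}
\]
\[
\frac{2}{\pi} \int_D R(r, \theta)\, dS = I
\tag{A20}
\]
where D is the unit disk and dS = r dr dθ.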
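The appendix formulas for real density matrices are easy to test numerically. The plain-Python sketch below builds R(r, Φ) from Equation (A13), compares a direct matrix product with a closed form of the type of Equation (A14), namely ½[R(r, Φ) + R(r′, Φ′) + (rr′/2) R(Φ − Φ′) − I/2], and evaluates the circle integral of Equation (A17) with a rectangle rule. The function names and sample parameters are illustrative assumptions.

```python
import math

def R(r, Phi):
    """Real density matrix R(r, Phi) of Equation (A13)."""
    return [[0.5 + 0.5 * r * math.cos(Phi), 0.5 * r * math.sin(Phi)],
            [0.5 * r * math.sin(Phi), 0.5 - 0.5 * r * math.cos(Phi)]]

def rot(Phi):
    """Rotation matrix R(Phi) of Equation (A10)."""
    return [[math.cos(Phi), -math.sin(Phi)],
            [math.sin(Phi), math.cos(Phi)]]

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def product_formula(r, Phi, r2, Phi2):
    """Closed form (1/2)[R + R' + (r r'/2) R(Phi - Phi') - I/2] for the product."""
    A, B, Rd = R(r, Phi), R(r2, Phi2), rot(Phi - Phi2)
    eye = [[1.0, 0.0], [0.0, 1.0]]
    return [[0.5 * (A[i][j] + B[i][j] + 0.5 * r * r2 * Rd[i][j] - 0.5 * eye[i][j])
             for j in range(2)] for i in range(2)]

def circle_integral(r, n=64):
    """(1/pi) times the integral of R(r, theta) over [0, 2 pi), as in Equation (A17)."""
    h = 2 * math.pi / n
    return [[sum(R(r, k * h)[i][j] for k in range(n)) * h / math.pi
             for j in range(2)] for i in range(2)]

r, Phi, r2, Phi2 = 0.8, 0.3, 0.5, 1.9    # hypothetical sample parameters
direct = matmul(R(r, Phi), R(r2, Phi2))
closed = product_formula(r, Phi, r2, Phi2)
trace = direct[0][0] + direct[1][1]
unit = circle_integral(r)
```

The direct product and the closed form coincide, the trace equals ½(1 + rr′ cos(Φ − Φ′)), and the circle integral returns the identity, since the rectangle rule is exact for these trigonometric integrands.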
A.2. SU(2) as Unit Quaternions Acting in R3
A.2.1. Rotations and Quaternions
A convenient representation is possible thanks to quaternion calculus. We recall that the quaternion field as a multiplicative group is $\mathbb{H} \simeq \mathbb{R}_+ \times SU(2)$. The correspondence between the canonical basis of $\mathbb{H} \simeq \mathbb{R}^4$, $(1 \equiv e_0, e_1, e_2, e_3)$, and the Pauli matrices is $e_a \leftrightarrow (-1)^{a+1} i \sigma_a$, with a = 1, 2, 3. Hence, the 2 × 2 matrix representation of these basis elements is the following:
\[
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \leftrightarrow e_0, \qquad
\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} \leftrightarrow e_1 \equiv \hat{\imath}, \qquad
\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \leftrightarrow e_2 \equiv \hat{\jmath}, \qquad
\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \leftrightarrow e_3 \equiv \hat{k}
\]
Any quaternion decomposes as $q = (q_0, \vec{q})$ (resp. $q^a e_a$, a = 0, 1, 2, 3) in scalar-vector notation (resp. in Euclidean metric notation). We also recall that the multiplication law explicitly reads in scalar-vector notation: $q q' = (q_0 q_0' - \vec{q} \cdot \vec{q}\,',\ q_0' \vec{q} + q_0 \vec{q}\,' + \vec{q} \times \vec{q}\,')$. The (quaternionic) conjugate of $q = (q_0, \vec{q})$ is $\bar{q} = (q_0, -\vec{q})$, the squared norm is $\|q\|^2 = q \bar{q}$ and the inverse of a nonzero quaternion is $q^{-1} = \bar{q} / \|q\|^2$. Unit quaternions, i.e., quaternions with norm one, the multiplicative subgroup isomorphic to SU(2), constitute the three-sphere $S^3$.
On the other hand, any proper rotation in space is determined by a unit vector $\hat{n}$ defining the rotation axis and a rotation angle 0 ≤ ω < 2π about the axis, as is shown in Figure 1.
Figure 1. Rotation in space as determined by the unit vector $\hat{n}$ of the rotation axis and the rotation angle 0 ≤ ω < 2π about the axis.
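The action of such a rotation, $R(\omega, \hat{n})$, on a vector $\vec{r}$ is given by:
\[
\vec{r}\,' \overset{\mathrm{def}}{=} R(\omega, \hat{n}) \cdot \vec{r} = (\vec{r} \cdot \hat{n})\, \hat{n} + \cos\omega\; \hat{n} \times (\vec{r} \times \hat{n}) + \sin\omega\, (\hat{n} \times \vec{r})
\tag{A21}
\]
The latter is expressed in scalar-vector quaternionic form as:
\[
(0, \vec{r}\,') = \xi\, (0, \vec{r})\, \bar{\xi}
\]
where:
\[
\xi := \left( \cos\frac{\omega}{2},\ \sin\frac{\omega}{2}\, \hat{n} \right) \in SU(2)
\]
or, in matrix form,
\[
\xi = \begin{pmatrix} \xi_0 + i \xi_3 & -\xi_2 + i \xi_1 \\ \xi_2 + i \xi_1 & \xi_0 - i \xi_3 \end{pmatrix}
= \begin{pmatrix}
\cos\frac{\omega}{2} + i n_3 \sin\frac{\omega}{2} & (-n_2 + i n_1) \sin\frac{\omega}{2} \\
(n_2 + i n_1) \sin\frac{\omega}{2} & \cos\frac{\omega}{2} - i n_3 \sin\frac{\omega}{2}
\end{pmatrix}
\tag{A22}
\]
in which case, quaternionic conjugation corresponds to the transposed conjugate of the corresponding matrix.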
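In particular, for a given unit vector:
\[
\hat{n} \overset{\mathrm{def}}{=} (\sin\theta \cos\phi, \sin\theta \sin\phi, \cos\theta) = (\theta, \phi), \qquad 0 \le \theta \le \pi, \quad 0 \le \phi < 2\pi
\]
one considers the specific rotation $R_{\hat{n}}$ that maps the unit vector pointing to the North Pole, $\hat{k} = (0, 0, 1)$, to $\hat{n}$, as shown in Figure 2.
\[
(0, \hat{n}) \overset{\mathrm{def}}{=} \left( 0,\ R(\theta_{\hat{n}}, \hat{u}_{\phi_{\hat{n}}})\, \hat{k} \right) \equiv \xi_{\hat{n}}\, (0, \hat{k})\, \bar{\xi}_{\hat{n}}, \qquad \hat{u}_{\phi_{\hat{n}}} = (-\sin\phi_{\hat{n}}, \cos\phi_{\hat{n}}, 0)
\tag{A23}
\]
with:
\[
\xi_{\hat{n}} = \left( \cos\frac{\theta_{\hat{n}}}{2},\ \sin\frac{\theta_{\hat{n}}}{2}\, \hat{u}_{\phi_{\hat{n}}} \right)
\tag{A24}
\]
Figure 2. Rotation $R_{\hat{r}}$ mapping the unit vector pointing to the North Pole, $\hat{k} = (0, 0, 1)$, to $\hat{r}$.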
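The quaternionic description of rotations in this appendix can be exercised with a few lines of plain Python. The sketch below implements the scalar-vector multiplication law, the conjugation action ξ(0, r)ξ̄, the vector formula of Equation (A21) and the North-Pole rotation of Equations (A23) and (A24). All function names and sample values are illustrative assumptions.

```python
import math

def qmul(p, q):
    """Quaternion product in scalar-vector form, p = (p0, (p1, p2, p3))."""
    p0, pv = p
    q0, qv = q
    dot = sum(a * b for a, b in zip(pv, qv))
    cross = (pv[1] * qv[2] - pv[2] * qv[1],
             pv[2] * qv[0] - pv[0] * qv[2],
             pv[0] * qv[1] - pv[1] * qv[0])
    vec = tuple(q0 * a + p0 * b + c for a, b, c in zip(pv, qv, cross))
    return (p0 * q0 - dot, vec)

def qconj(q):
    """Quaternionic conjugate (q0, -q_vec)."""
    return (q[0], tuple(-a for a in q[1]))

def rotate(omega, n, r):
    """Rotate r by the angle omega about the unit axis n via xi (0, r) xi-bar."""
    xi = (math.cos(omega / 2), tuple(math.sin(omega / 2) * c for c in n))
    return qmul(qmul(xi, (0.0, r)), qconj(xi))[1]

def rotate_vector_formula(omega, n, r):
    """Same rotation from Equation (A21), using n x (r x n) = r - (r.n) n."""
    rn = sum(a * b for a, b in zip(r, n))
    n_cross_r = (n[1] * r[2] - n[2] * r[1],
                 n[2] * r[0] - n[0] * r[2],
                 n[0] * r[1] - n[1] * r[0])
    return tuple(rn * n[i] + math.cos(omega) * (r[i] - rn * n[i])
                 + math.sin(omega) * n_cross_r[i] for i in range(3))

def north_pole_to(theta, phi):
    """Image of k = (0, 0, 1) under the rotation of Equations (A23) and (A24)."""
    u = (-math.sin(phi), math.cos(phi), 0.0)
    return rotate(theta, u, (0.0, 0.0, 1.0))

theta, phi = 1.1, 2.3                       # hypothetical sample angles
image = north_pole_to(theta, phi)
target = (math.sin(theta) * math.cos(phi),
          math.sin(theta) * math.sin(phi),
          math.cos(theta))

axis = (1 / math.sqrt(3),) * 3              # hypothetical unit axis
v = (0.2, -0.7, 0.5)
by_quaternion = rotate(0.9, axis, v)
by_formula = rotate_vector_formula(0.9, axis, v)
```

Here `image` coincides with the unit vector of spherical coordinates (θ, φ), and the quaternionic and vector forms of the rotation give the same result.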
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Somaraju, R.A.; Sarlette, A.; Thienpont, H. Quantum filtering using POVM measurements. In
Proceedings of 2013 IEEE 52nd Annual Conference on Decision and Control (CDC), Florence,
Italy, 10–13 December 2013.
2. Bouten, L.; van Handel, R.; James, M.R. An introduction to quantum filtering. SIAM J. Control
Optim. 2008, 46, 2199–2241.
3. Barndorff-Nielsen, O.E.; Gill, R.D.; Jupp, P.E. On Quantum Statistical Inference. J. R. Stat. Soc.
Ser. B Stat. Methodol. 2003, 65, 775–804.
4. Kuperberg, G. A Concise Introduction to Quantum Probability, Quantum Mechanics, and
Quantum Computation. Available online: http://www.math.ucdavis.edu//intro-2005.pdf (accessed
on 18 December 2014).
5. Bergeron, H.; Gazeau, J.P. Integral quantizations with two basic examples. Ann. Phys. 2014, 344,
43–68.
6. Ali, S.T.; Antoine, J.-P.; Gazeau, J.P. Coherent States, Wavelets and their Generalizations, 2nd ed.;
Springer: New York, NY, USA, 2013; Chapter 11.
7. Bergeron, H.; Curado, E.M.F.; Gazeau, J.P.; Rodrigues, Ligia M.C.S. Quantizations from
(P)OVM’s. In Proceedings of the 8th Symposium on Quantum Theory and Symmetries, El
Colegio Nacional, Mexico City, Mexico, 5–9 August 2013.
8. Bergeron, H.; Dapor, A.; Gazeau, J.P.; Małkiewicz, P. Smooth big bounce from affine
quantization. Phys. Rev. D 2014, 89, doi:10.1103/PhysRevD.89.083522.
9. Baldiotti, M.; Fresneda, R.; Gazeau, J.P. Three Examples of Covariant Integral Quantization.
In Proceedings of 3rd International Satellite Conference on Mathematical Methods in
Physics—ICMP 2013, Londrina, Brazil, 21–26 October 2013.
10. Ali, S.T.; Engliš, M. Quantization methods: A guide for physicists and analysts. Rev. Math. Phys.
2005, 17, doi:10.1142/S0129055X05002376.
11. Bergeron, H.; Gazeau, J.P.; Youssef, A. Are the Weyl and coherent state descriptions physically
equivalent? Phys. Lett. A 2013, 377, 598–605.
12. Baldiotti, M.; Fresneda, R.; Gazeau, J.P. About Dirac & Dirac constraint quantizations.
Phys. Scr. 2014, submitted.
13. Benedetto, J.J.; Fickus, M. Finite normalized tight frames. Adv. Comput. Math. 2003, 18,
357–385.
14. Han, D.; Kornelson, K.; Weber, E. Frames for Undergraduates. Student Mathematical Library;
American Mathematical Society: Providence, RI, USA, 2007; Volume 40.
15. Cotfas, N.; Gazeau, J.P. Finite tight frames and some applications (topical review). J. Phys. A
Math. Theor. 2010, 43, doi:10.1088/1751-8113/43/19/193001.
16. Cotfas, N.; Gazeau, J.P.; Vourdas, A. Finite-dimensional Hilbert space and frame quantization.
J. Phys. A Math. Gen. 2011, 44, doi:10.1088/1751-8113/44/17/175303 .
17. Gazeau, J.P. Coherent States in Quantum Physics; Wiley-VCH: Berlin, Germany, 2009.
18. Ali, S.T.; Gazeau, J.P.; Heller, B. Coherent states and Bayesian duality. J. Phys. A Math. Theor.
2008, 41, doi:10.1088/1751-8113/41/36/365302.
19. Reed, M.; Simon, B. Methods of Modern Mathematical Physics, II. Fourier Analysis,
Self-Adjointness; Academic Press: New York, NY, USA, 1975; Volume 2.
20. Grosser, M. A note on distribution spaces on manifolds. Novi Sad J. Math. 2008, 38, 121–128.
21. Dirac, P.A.M. Lectures on Quantum Mechanics; Dover: New York, NY, USA, 2001.
22. Berezin, F.A. Quantization. Math. USSR Izvestija 1974, 8, 1109–1165.
23. Berezin, F.A. General concept of quantization. Commun. Math. Phys. 1975, 40, 153–174.
24. Stenzel, M.B. The Segal-Bargmann transform on a symmetric space of compact type. J. Funct.
Anal. 1994, 165, 44–58.
25. Hall, B.C. The Segal-Bargmann “Coherent State” transform for compact Lie groups. J. Funct.
Anal. 1994, 122, 103–151.
26. Busch, P.; Grabowski, M.; Lahti, P.J. Operational Quantum Physics; Springer-Verlag: Berlin,
Germany, 1995.
27. Holevo, A.S. Probabilistic and Statistical Aspects of Quantum Theory; Springer: Berlin,
Germany, 2011.
28. Mardia, K.V. Statistics of Directional Data; Academic Press: New York, NY, USA, 1972.
29. Helstrom, C.W. Quantum Detection and Estimation Theory; Academic Press: New York, NY,
USA, 1976.
30. Goodman, J.W. Statistical Optics; Wiley Classics Library: New York, NY, USA, 2000.
31. Klauder, J.R.; Sudarshan, E.C.G. Fundamentals of Quantum Optics; Benjamin: New York, NY,
USA, 1968.
32. Magnus, W.; Oberhettinger, F.; Soni, R.P. Formulas and Theorems for the Special Functions of
Mathematical Physics; Springer-Verlag: Berlin, Germany, 1966.
33. Gel’fand, I.M.; N’aimark, M.A. Unitary representations of the group of linear transformations of
the straight line. Dokl. Akad. Nauk SSSR 1947, 55, 567–570.
34. Aslaksen, E.W.; Klauder, J.R. Unitary Representations of the Affine Group, J. Math. Phys. 1968,
15, 206–211.
35. Heller, B.; Wang, M. Group invariant inferred distributions via non-commutative probability. Inst.
Math. Stat. Lect. Notes Monogr. Ser. 2006, 50, 1–19.
36. Heller, B.; Wang, M. Posterior distribution for negative binomial parameter p using a group
invariant prior. Stat. Probab. Lett. 2007, 77, 1542–1548.
© 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article
distributed under the terms and conditions of the Creative Commons Attribution license
(http://creativecommons.org/licenses/by/4.0/).