Quantum Tomography under Prior Information

Teiko Heinosaari, Luca Mazzarella, Michael M. Wolf

Introduction

Quantum tomography aims at identifying states of a quantum system. In order to achieve this aim, measurement data is often supplemented by prior information. In this work we consider cases where prior information effectively reduces the dimensionality, i.e., the number of parameters which are necessary to characterize the state of a system. Physically, one may think of scenarios of interferometry, process tomography or parameter estimation, where one prepares the initial state which then evolves depending on a certain number of unknown parameters before one measures the final system. Effective reductions of the number of parameters can also be due to a constraining symmetry or fixed energy or particle number.

We are interested in determining the minimal number of measurement outcomes or measurement settings which are required for identifying a state taken from such a reduced set. Clearly, if states in the considered set are parameterized by a certain number of independent real parameters, then we need at least this number of measurement outcomes or binary measurements in order to pinpoint the state. As an example take the manifold of pure states in a dd-dimensional Hilbert space. Their description requires 2d22d-2 real parameters, as opposed to d21d^{2}-1 real parameters needed to describe an arbitrary density matrix. So if we want to determine a pure state by a single measurement with mm outcomes, m2dm\sim 2d seems necessary and md2m\sim d^{2} appears to be achievable. The question about the smallest possible mm in this particular example has been addressed in a number of publications , but the answer has remained somewhat elusive, so far.

A related problem has been addressed based on compressed sensing ideas, where it has been shown that for d×dd\times d density matrices of rank rr, m=O(drlog(d)2)m=O(dr\log(d)^{2}) binary measurements are sufficient in order to identify the state with high probability. In this light we emphasize that our focus lies on schemes which identify the unknown state unambiguously and deterministically. In this work we disregard questions regarding the robustness of the schemes, the complexity of the necessary post-processing of the measurement data or the ability to verify the assumed prior information.

Outline. Typically, when we want to know the minimal number of measurement outcomes related to some subset of states, we need to independently consider upper and lower bounds. In Section 3 we concentrate on upper bounds, while in Section 4 we provide some methods to obtain lower bounds. The upper bounds are of geometric or algebraic nature and often come with concrete constructions. The lower bounds are mainly topological in nature. They are based on the observation that any measurement which is informationally complete when supplemented by prior information is a mapping into the space of measurement outcomes which preserves topological invariants. In the case of pure states the upper and lower bounds essentially match which enables us for instance to show that for a minimal informational complete measurement m4dm\sim 4d up to an additive logarithmic correction. This exemplifies our general finding that, loosely speaking, on the one hand topological obstruction can force mm to be twice the dimension of a considered manifold, whereas on the other hand geometric reasoning allows us to show that such an overhead is essentially always sufficient.

Notation. In this paper H\mathcal{H} is a fixed finite dimensional complex Hilbert space. We denote by L(H)\mathcal{L(H)} the set of linear operators on H\mathcal{H} and we will write L\left\|L\right\| for the operator norm which coincides with the largest singular value of LL(H)L\in\mathcal{L(H)}. A positive operator ϱL(H)\varrho\in\mathcal{L(H)} having trace one is a density operator, also referred to as state, and we denote by S(H)\mathcal{S(H)} the set of all states on H\mathcal{H}.

Quantum tomography prerequisites

In this section we will introduce and analyze the linear algebra framework for quantum tomography. Later, in Section 4, we will then add a topological perspective.

The goal of quantum tomography is to identify an unknown quantum state from the statistics of measurement outcomes. We will consider tomographic schemes where a fixed (as opposed to an adaptive) measurement setting is chosen, and we will refrain from considering errors for instance caused by finite statistics. In particular, in the following “ measurement ”always refers to a full statistical experiment rather than to a single-shot experiment.

Our analysis deals with several different albeit closely related approaches: 1) single measurements with many outcomes; 2) several measurements with possibly fewer outcomes; 3) several measurements where only expectation values are considered.

For a POVM A\mathsf{A} and a state (density matrix) ϱ\varrho, we denote by ϱA\varrho^{\mathsf{A}} the corresponding probability distribution of measurement outcomes. It is given by the formula

Let PS(H)\mathcal{P}\subseteq\mathcal{S(H)} be a subset of density matrices. A POVM A\mathsf{A} is called informationally complete w.r.t. P\mathcal{P} if ϱϱA\varrho\mapsto\varrho^{\mathsf{A}} is injective on P\mathcal{P}, i.e.,

In a more general scheme one may perform several measurements and use all of their measurement outcome statistics. Let {A,B,}\{\mathsf{A},\mathsf{B},\ldots\} be a finite collection of POVMs. We denote by S(A,B,)\mathfrak{S}(\mathsf{A},\mathsf{B},\ldots) the linear span of the set

where nn is the number of measurement outcomes of A\mathsf{A}, mm is the number of measurement outcomes of B\mathsf{B}, and so forth. In particular,

Every operator XSX\in\mathfrak{S} can be written as a linear combination of two selfadjoint operators,

which belong to S\mathfrak{S}. If XSX\in\mathfrak{S} is selfadjoint, then XX can be written as a linear combination of two positive operators,

For any POVM B\mathsf{B}, the generated operator system S(B)\mathfrak{S}(\mathsf{B}) cannot have more linearly independent elements than B\mathsf{B} has outcomes. This implies the last claim. ∎

We denote by S\mathfrak{S}^{\perp} the orthogonal complement of an operator system S\mathfrak{S} with respect to the Hilbert-Schmidt inner product. In particular, for a POVM A={Aj}j=1n\mathsf{A}=\{\mathsf{A}_{j}\}_{j=1}^{n} we have

The complement S(A)\mathfrak{S}(\mathsf{A})^{\perp} of S(A)\mathfrak{S}(\mathsf{A}) is related to the informational completeness of A\mathsf{A} in the following simple way.

Let SL(H)\mathfrak{S}\subseteq\mathcal{L(H)} be an operator system and let PS(H)\mathcal{P}\subseteq\mathcal{S(H)} be a set of states. Then a POVM A\mathsf{A} satisfying S(A)=S\mathfrak{S}(\mathsf{A})=\mathfrak{S} is informationally complete w.r.t. P\mathcal{P} iff

In other words, for any pair of states ϱ1,ϱ2P\varrho_{1},\varrho_{2}\in\mathcal{P} the following are equivalent:

ϱ1A=ϱ2A\varrho_{1}^{\mathsf{A}}=\varrho_{2}^{\mathsf{A}},

ϱ1ϱ2S(A)\varrho_{1}-\varrho_{2}\in\mathfrak{S}(\mathsf{A})^{\perp}.

For all states ϱ1,ϱ2P\varrho_{1},\varrho_{2}\in\mathcal{P}, we have:

Let PS(H)\mathcal{P}\subseteq\mathcal{S(H)} be a subset of states. A collection {S1,S2,,Sn}\{S_{1},S_{2},\ldots,S_{n}\} of selfadjoint operators is called informationally complete w.r.t. P\mathcal{P} if

for all ϱ1,ϱ2P\varrho_{1},\varrho_{2}\in\mathcal{P}.

We are interested in questions of the type: what is the minimal number of

needed to have an informationally complete measurement scheme w.r.t. to a given premise PS(H)\mathcal{P}\subseteq\mathcal{S(H)}?

Let PS(H)\mathcal{P}\subseteq\mathcal{S(H)}. The following are equivalent.

There exists a POVM A\mathsf{A} with nn outcomes and A\mathsf{A} is informationally complete w.r.t. P\mathcal{P}.

There exists a set {S1,S2,,Sn1}\{S_{1},S_{2},\ldots,S_{n-1}\} of n1n-1 selfadjoint operators and {S1,S2,,Sn1}\{S_{1},S_{2},\ldots,S_{n-1}\} is informationally complete w.r.t. P\mathcal{P}.

As a consequence of Proposition 3 we only need to calculate the minimal number of either selfadjoint operators or POVM elements.

(Minimal informationally complete measurements) Let PS(H)\mathcal{P}\subseteq\mathcal{S(H)} be any subset of states. We denote by m[P]\mathfrak{m}\left[\mathcal{P}\right]

- the minimal number of selfadjoint operators which are informationally complete w.r.t. P\mathcal{P}.

- the minimal number of positive operators determining a POVM which is informationally complete w.r.t. P\mathcal{P} (i.e., the number of POVM elements subtracted by one).

The remaining part of this work is devoted to deriving upper and lower bounds on m[P]\mathfrak{m}\left[\mathcal{P}\right] either for various concrete subsets P\mathcal{P} or depending on specific properties of P\mathcal{P} – in particular its dimensionality and topology.

Upper bounds

As the first class of examples, we consider the subset Pr:={ϱS(H):rank(ϱ)r}\mathcal{P}_{r}:=\{\varrho\in\mathcal{S(H)}:{\rm rank}(\varrho)\leq r\} of all states whose rank is bounded by a number r<d/2r<d/2 where d=dimHd=\dim{\mathcal{H}}. How many measurement outcomes suffice in order to identify a state taken from Pr\mathcal{P}_{r}?

If 1r<d/21\leq r<d/2, then there exists a POVM A\mathsf{A} which is informationally complete w.r.t. Pr\mathcal{P}_{r} and has 4r(dr)4r(d-r) outcomes. Therefore,

We will construct a subspace B\mathcal{B} of d×dd\times d matrices with the properties that

The main part of our construction follows . The following fact will be needed. Let MM be a totally nonsingular m×mm\times m-matrix with positive entries. (Recall that a matrix is called totally nonsingular if all of its minors are nonzero.) For instance, a Vandermonde matrix of the form

We will now define a set of d×dd\times d matrices that span the sought subspace B\mathcal{B}. By the kkth diagonal of a matrix [Mij][M_{ij}] we mean the elements MijM_{ij} with ij=dki-j=d-k. (In other words, we label the diagonals from the lower left corner upwards. The main diagonal is then the ddth diagonal.)

For each kk satisfying 2r+1kd12r+1\leq k\leq d-1, we build up k2rk-2r matrices as follows. We choose k2rk-2r columns from a totally nonsingular k×kk\times k -matrix and we put them to the kkth diagonal and ’s elsewhere, hence obtaining k2rk-2r linearly independent matrices. Any matrix PP which is a linear combination of these k2rk-2r matrices has at least 2r+12r+1 nonzero elements on the kkth diagonal. Since all the matrix elements not in the kkth diagonal are , we see that the PP has rank at least 2r+12r+1.

In total, we have built up d2r+2k=2r+1d1(k2r)=(d2r)2d-2r+2\sum_{k=2r+1}^{d-1}(k-2r)=(d-2r)^{2} linearly independent matrices. The previously mentioned subspace B\mathcal{B} is the linear span of these matrices. The properties (a)-(c) are immediate consequences of the construction. To verify (d), suppose that BBB\in\mathcal{B}. Let kBk_{B} be the largest kk such that the kkth diagonal of BB contains nonzero elements. Then the kBk_{B}th diagonal contains actually 2r+12r+1 nonzero elements. The (2r+1)×(2r+1)(2r+1)\times(2r+1) submatrix having those 2r+12r+1 nonzero elements in its main diagonal is lower triangular, therefore has nonzero determinant. It follows that the rank of BB is at least 2r+12r+1. ∎

Let us remark that a recently introduced method based on compressed sensing uses O(drlog(d)2)O(dr\log(d)^{2}) outcomes to identify a rank-rr state with high probability . The number of outcomes given in Theorem 1 therefore beats the compressed sensing approach. The latter, however, might be advantageous regarding the simplicity of the classical post-processing, robustness and verifiability of the assumption.

2. Pure states

We will now have a closer look at the set P1\mathcal{P}_{1} of pure states. For this case, Theorem 1 implies that m[P1]4d5\mathfrak{m}\left[\mathcal{P}_{1}\right]\leq 4d-5. We will first provide some simple arguments showing that indeed m[P1]=4d5\mathfrak{m}\left[\mathcal{P}_{1}\right]=4d-5 for d=2,3d=2,3. In general, we will see later, based on topological reasoning, that the leading order 4d4d is the best possible.

We remark that Flammia et al. proved that a POVM which is informationally complete w.r.t. the set of pure states has at least 2d2d outcomes . They also constructed a POVM with 2d2d elements capable of distinguishing almost all (but not all) pairs of pure states. A similar construction was given by Finkelstein .

For a POVM A\mathsf{A}, the following conditions are equivalent:

A\mathsf{A} is not informationally complete w.r.t. the set of pure states.

From Proposition 4 we conclude the following simple characterization.

It is an immediate consequence of Corollary 1 that in the qubit case (i.e. dimH=2\dim\mathcal{H}=2) a POVM A\mathsf{A} is informationally complete w.r.t. pure states if and only if S(A)={0}\mathfrak{S}(\mathsf{A})^{\perp}=\{0\}. Therefore, in the qubit case informational completeness for pure states implies informational completeness for all states. One can also easily see this by a direct inspection of the Bloch sphere.

contains rank-2 matrices but no rank-2 selfadjoint matrix.

As an application of Corollary 1, we can easily characterize all POVMs in dimension 33 which are informationally complete w.r.t. the set of pure states.

S(A)={0}\mathfrak{S}(\mathsf{A})^{\perp}=\{0\} (i.e. A\mathsf{A} is informationally complete w.r.t. all states).

In particular, a minimal POVM which is informationally complete w.r.t. pure states has 88 outcomes.

We conclude that if A\mathsf{A} is informationally complete for the set of all pure states, then either dimS(A)=8\dim\mathfrak{S}(\mathsf{A})=8 or dimS(A)=9\dim\mathfrak{S}(\mathsf{A})=9. In the latter case A\mathsf{A} is informationally complete for all states. If dimS(A)=8\dim\mathfrak{S}(\mathsf{A})=8, then dimS(A)=1\dim\mathfrak{S}(\mathsf{A})^{\perp}=1 and S(A)\mathfrak{S}(\mathsf{A})^{\perp} is thus generated by a single selfadjoint operator TT. From Cor. 1 follows that A\mathsf{A} is informationally complete w.r.t. all pure states exactly when TT is invertible. ∎

The characterization given in Proposition 5 helps to illustrate one possible drawback of POVMs which are merely informationally complete w.r.t. to a subset of states: even if a POVM A\mathsf{A} can distinguish any pair of distinct pure states, it may not be capable of distinguishing pure states from mixed states. In other words, we may not be able to verify the premise from the measurement outcome statistics.

We already noted that Theorem 1 leads to the general upper bound m[P1]4d5\mathfrak{m}\left[\mathcal{P}_{1}\right]\leq 4d-5 albeit without yielding an explicit set of operators. Now, we give another proof of this result via a simple direct construction. The construction is inspired by a topological embedding given by James in . Consider two types of matrices XαX_{\alpha} and YβY_{\beta} which we label by integers α=1,,2d2\alpha=1,\ldots,2d-2 and β=1,,2d3\beta=1,\ldots,2d-3 respectively. The XαX_{\alpha}’s are taken to be such that \big{(}X_{\alpha}\big{)}_{kl}=\delta_{k+l,\alpha+1}, i.e., there are 11’s along the α\alpha’th anti-diagonal and zeroes elsewhere. The YβY_{\beta}’s are similarly defined with nonzero entries solely along the anti-diagonals, in this case \big{(}Y_{\beta}\big{)}_{kl}=0 unless k+l=β+2k+l=\beta+2. The entries are chosen such that the matrices are anti-symmetric with entries ii below the diagonal. The constructed matrices thus have the following structure:

where we replaced xleiφylx_{l}\rightarrow e^{-i\varphi}y_{l} for all lml\leq m and exploited that CγC_{\gamma} is upper triangular and that xl=0x_{l}=0 for all l<nl<n. Together with the hypothesis in Eq.(3) and the assumption that \big{(}C_{\gamma}\big{)}_{n,m+1}\neq 0 this implies indeed that ym+1=eiφxm+1y_{m+1}=e^{i\varphi}x_{m+1}.

Note that Cγ=(Xγ1+iYγ2)/2C_{\gamma}=(X_{\gamma-1}+iY_{\gamma-2})/2 for γ=3,,2d1\gamma=3,\ldots,2d-1 and C2=X1/2C_{2}=X_{1}/2. Hence, if xSjx=ySjy{\langle}x|S_{j}|x{\rangle}={\langle}y|S_{j}|y{\rangle} for every jj, then xCγx=yCγy{\langle}x|C_{\gamma}|x{\rangle}={\langle}y|C_{\gamma}|y{\rangle} for every γ=2,,2d1\gamma=2,\ldots,2d-1. The remaining condition xC2dx=yC2dy{\langle}x|C_{2d}|x{\rangle}={\langle}y|C_{2d}|y{\rangle} holds due to x=y\left\|x\right\|=\left\|y\right\|. Therefore it follows from Eq.(3) that the set {Sj}\{S_{j}\} is informationally complete w.r.t. the set of all pure states. ∎

It is possible to slightly improve the upper bound m[P1]4d5\mathfrak{m}\left[\mathcal{P}_{1}\right]\leq 4d-5 at the cost of having to deal with a more complicated set of operators introduced in the work of Milgram :

For odd dd this upper bound can be worse than the previously derived 4d54d-5, but for most dimensions it is below 4d54d-5. Notice that α\alpha satisfies 1αlog2(d)1\leq\alpha\leq\log_{2}(d). We will see later that this improvement is, in fact, nearly optimal. With this refined upper bound we can calculate the exact value of m[P1]\mathfrak{m}\left[\mathcal{P}_{1}\right] for small dimensions dd in Sec. 4.2.

The construction of the operators is based on the work of Milgram . We will only argue why this corresponds to a proper measurement scheme rather than reproducing the rather cumbersome construction.

The set of m{m} selfadjoint matrices TjT_{j} therefore leads to a measurement scheme which is informationally complete w.r.t. the set of pure states. ∎

3. Generic bounds depending on fractal dimension

For instance, if P\mathcal{P} is a smooth manifold of real dimension d(P)d(\mathcal{P}), then D(P)=d(P)D(\mathcal{P})=d(\mathcal{P}).

Let PS(H)\mathcal{P}\subseteq\mathcal{S(H)} be a closed subset of the set of density matrices with Minkowski dimension D(P)D(\mathcal{P}). Then almost any (in the Lebesgue measure sense) collection of m>2D(P)m>2D(\mathcal{P}) selfadjoint operators is informationally complete w.r.t. P\mathcal{P}. In particular, m[P]2D(P)+1\mathfrak{m}\left[\mathcal{P}\right]\leq 2D(\mathcal{P})+1.

In principle, this bound can be refined to m>δPPm>\delta_{\mathcal{P}-\mathcal{P}}, where δPP\delta_{\mathcal{P}-\mathcal{P}} is the Hausdorff dimension of the set PP:={ϱ1ϱ2ϱiP}\mathcal{P}-\mathcal{P}:=\{\varrho_{1}-\varrho_{2}|\varrho_{i}\in\mathcal{P}\} . This bound is generally better since δPPD(PP)2D(P)\delta_{\mathcal{P}-\mathcal{P}}\leq D(\mathcal{P}-\mathcal{P})\leq 2D(\mathcal{P}), but it may be more difficult to handle. We also mention that for the inverse mappings Hölder continuity can be proven and the respective constants can be bounded .

Let us apply Theorem 4 to the case of pure states. This is a smooth manifold of real dimension 2d22d-2, hence D(P1)=2d2D(\mathcal{P}_{1})=2d-2. Thus, almost any collection of 4d34d-3 selfadjoint operators is informationally complete w.r.t. pure states.

Lower bounds

Let A\mathsf{A} be a POVM with m+1m+1 outcomes. This induces a map

If PS(H)\mathcal{P}\subset\mathcal{S(H)} is a closed surface without boundary, then m[P]3\mathfrak{m}\left[\mathcal{P}\right]\geq 3. Moreover, if P\mathcal{P} is non-orientable in addition, then m[P]4\mathfrak{m}\left[\mathcal{P}\right]\geq 4.

The first three matrices in (8) give rise to measurement results which form the Roman surface displayed in Fig.2. The failure of informational completeness (due to disregarding the necessary fourth measurement) is reflected by self-intersections of the surface.

2. Measurements as diffeomorphisms

Manifolds of interest in quantum tomography often have a differentiable structure – they are smooth manifolds. In such a case we may resort to differential topology, which imposes more restrictive conditions on the existence of smooth embeddings. Before we apply these conditions to the concrete cases of pure states and states with general rank constraints we provide some general background.

will play an important role in the following.

To show that hAh_{\mathsf{A}} is a smooth embedding we need to prove (i) that it is smooth, which follows from linearity, (ii) that it is a topological embedding, which follows from the assumed injectivity and Prop. 6 and (iii) that it has an injective derivative everywhere.

The inclusion Tp(P)Δ(P)T_{p}(\mathcal{P})\subseteq\Delta(\mathcal{P}) holds for all pPp\in\mathcal{P} if P\mathcal{P} is the complex Grassmannian manifold G(r,dr)G(r,d-r) understood as the submanifold in the space of d×dd\times d selfadjoint matrices which consists of all orthogonal projections of rank rr.

is an element of Tp(P)T_{p}(\mathcal{P}) and in fact, such derivatives span the entire tangent space

In order to see this we have to show that they span a vector space which has the same dimension as the manifold (for which D(P)=2r(dr)D(\mathcal{P})=2r(d-r)). To this end, note that there is a one-to-one relation between commutators and block off-diagonal matrices in the sense that we can always write

In a suitable basis any element XTP(P)X\in T_{P}(\mathcal{P}) is such that

since Eq.(11) allows us to work with the singular values {ci}\{c_{i}\} of CC by transforming X(UV)X(UV)X\mapsto(U\oplus V)X(U\oplus V)^{\dagger} with appropriate unitaries UU and VV. Setting λ:=maxici\lambda:=\max_{i}{c_{i}} equal to the operator norm of XX we complete the proof if we show that every 2×22\times 2 matrix of the form cσxc\sigma_{x} with cc\in is a difference of two projections. This can seen to be true by taking the difference of two pure qubit states whose Bloch vectors are parameterized by (c,±1c2,0)(c,\pm\sqrt{1-c^{2}},0). ∎

where α\alpha denotes the number of 11’s in the binary expansion of d1d-1 and D=2d2D=2d-2 is the real dimension of the manifold.

If we now combine this non-embedding result with the upper bounds discussed in Subsection 3.2, we obtain a fairly comprehensive picture on the minimal number of measurements required to identify an element of the set of pure states. The bounds for the minimal number for the dimensions 2102-10 are summarized in the table below.

For all dimensions dd we can say that the difference between the upper and lower bounds is at most log2(d)\log_{2}(d), and that the minimal number differs from the best affine upper bound 4d54d-5 by at most 2log2(d)2\log_{2}(d). The lower and upper bounds up to d=30d=30 are presented in Fig.4.

Summary

How many measurement outcomes (i.e., POVM elements) are minimally needed in order to identify all quantum states from a given set? We have shown on the one hand that if the set is a manifold, then topological obstructions can increase the number of required measurements by a factor of two over the dimensionality of the manifold. On the other hand we have seen that this factor of two is sufficient even in a more general context where the considered set is not necessarily a manifold and its dimensionality is understood as its Minkowski dimension.

We have considered two different types of examples: 2-manifolds, where (non-)orientability plays an important role, and Grassmannian manifolds, which contain the set of all pure states as a particular instance.

For the latter case we have shown that upper and lower bounds—both obtained via topological embeddings—essentially match. In fact, they never differ by more than log2(d)\log_{2}(d). To be more precise, their difference is strictly less than the number of ones appearing in the binary expansion of d1d-1.

Points which are beyond the present work, albeit obviously of practical relevance, are the inversion algorithm and issues of robustness and certifiability. For instance, we do not know whether in the pure state case m4dm\sim 4d can be achieved in a way such that (i) the inversion is algorithmically efficient and (ii) the validity of the assumption behind the prior information is certifiable.

Acknowledgements

We thank Claudio Carmeli for pointing out an error in the earlier version of Theorem 3.

MMW was supported by the EU-STREP projects COQUIT and QUEVADIS and the Alfried Krupp von Bohlen und Halbach-Stiftung. TH was supported by the Emil Aaltonen Foundation, Alfred Kordelin Foundation and Academy of Finland (grant no. 1381359). TH and MMW are both grateful for the hospitality and the inspiring working environment at the Institut Mittag-Leffler in Stockholm, where this work was partly done.

References