Quantum Tomography under Prior Information

Teiko Heinosaari, Luca Mazzarella, Michael M. Wolf

Introduction

Quantum tomography aims at identifying states of a quantum system. In order to achieve this aim, measurement data is often supplemented by prior information. In this work we consider cases where prior information effectively reduces the dimensionality, i.e., the number of parameters which are necessary to characterize the state of a system. Physically, one may think of scenarios of interferometry, process tomography or parameter estimation, where one prepares the initial state which then evolves depending on a certain number of unknown parameters before one measures the final system. Effective reductions of the number of parameters can also be due to a constraining symmetry or fixed energy or particle number.

We are interested in determining the minimal number of measurement outcomes or measurement settings which are required for identifying a state taken from such a reduced set. Clearly, if states in the considered set are parameterized by a certain number of independent real parameters, then we need at least this number of measurement outcomes or binary measurements in order to pinpoint the state. As an example take the manifold of pure states in a $d$ -dimensional Hilbert space. Their description requires $2d-2$ real parameters, as opposed to $d^{2}-1$ real parameters needed to describe an arbitrary density matrix. So if we want to determine a pure state by a single measurement with $m$ outcomes, $m\sim 2d$ seems necessary and $m\sim d^{2}$ appears to be achievable. The question about the smallest possible $m$ in this particular example has been addressed in a number of publications , but the answer has remained somewhat elusive, so far.

A related problem has been addressed based on compressed sensing ideas, where it has been shown that for $d\times d$ density matrices of rank $r$ , $m=O(dr\log(d)^{2})$ binary measurements are sufficient in order to identify the state with high probability. In this light we emphasize that our focus lies on schemes which identify the unknown state unambiguously and deterministically. In this work we disregard questions regarding the robustness of the schemes, the complexity of the necessary post-processing of the measurement data or the ability to verify the assumed prior information.

Outline. Typically, when we want to know the minimal number of measurement outcomes related to some subset of states, we need to independently consider upper and lower bounds. In Section 3 we concentrate on upper bounds, while in Section 4 we provide some methods to obtain lower bounds. The upper bounds are of geometric or algebraic nature and often come with concrete constructions. The lower bounds are mainly topological in nature. They are based on the observation that any measurement which is informationally complete when supplemented by prior information is a mapping into the space of measurement outcomes which preserves topological invariants. In the case of pure states the upper and lower bounds essentially match which enables us for instance to show that for a minimal informational complete measurement $m\sim 4d$ up to an additive logarithmic correction. This exemplifies our general finding that, loosely speaking, on the one hand topological obstruction can force $m$ to be twice the dimension of a considered manifold, whereas on the other hand geometric reasoning allows us to show that such an overhead is essentially always sufficient.

Notation. In this paper $\mathcal{H}$ is a fixed finite dimensional complex Hilbert space. We denote by $\mathcal{L(H)}$ the set of linear operators on $\mathcal{H}$ and we will write $\left\|L\right\|$ for the operator norm which coincides with the largest singular value of $L\in\mathcal{L(H)}$ . A positive operator $\varrho\in\mathcal{L(H)}$ having trace one is a density operator, also referred to as state, and we denote by $\mathcal{S(H)}$ the set of all states on $\mathcal{H}$ .

Quantum tomography prerequisites

In this section we will introduce and analyze the linear algebra framework for quantum tomography. Later, in Section 4, we will then add a topological perspective.

The goal of quantum tomography is to identify an unknown quantum state from the statistics of measurement outcomes. We will consider tomographic schemes where a fixed (as opposed to an adaptive) measurement setting is chosen, and we will refrain from considering errors for instance caused by finite statistics. In particular, in the following “ measurement ”always refers to a full statistical experiment rather than to a single-shot experiment.

Our analysis deals with several different albeit closely related approaches: 1) single measurements with many outcomes; 2) several measurements with possibly fewer outcomes; 3) several measurements where only expectation values are considered.

For a POVM $\mathsf{A}$ and a state (density matrix) $\varrho$ , we denote by $\varrho^{\mathsf{A}}$ the corresponding probability distribution of measurement outcomes. It is given by the formula

Let $\mathcal{P}\subseteq\mathcal{S(H)}$ be a subset of density matrices. A POVM $\mathsf{A}$ is called informationally complete w.r.t. $\mathcal{P}$ if $\varrho\mapsto\varrho^{\mathsf{A}}$ is injective on $\mathcal{P}$ , i.e.,

In a more general scheme one may perform several measurements and use all of their measurement outcome statistics. Let $\{\mathsf{A},\mathsf{B},\ldots\}$ be a finite collection of POVMs. We denote by $\mathfrak{S}(\mathsf{A},\mathsf{B},\ldots)$ the linear span of the set

where $n$ is the number of measurement outcomes of $\mathsf{A}$ , $m$ is the number of measurement outcomes of $\mathsf{B}$ , and so forth. In particular,

Every operator $X\in\mathfrak{S}$ can be written as a linear combination of two selfadjoint operators,

which belong to $\mathfrak{S}$ . If $X\in\mathfrak{S}$ is selfadjoint, then $X$ can be written as a linear combination of two positive operators,

For any POVM $\mathsf{B}$ , the generated operator system $\mathfrak{S}(\mathsf{B})$ cannot have more linearly independent elements than $\mathsf{B}$ has outcomes. This implies the last claim. ∎

We denote by $\mathfrak{S}^{\perp}$ the orthogonal complement of an operator system $\mathfrak{S}$ with respect to the Hilbert-Schmidt inner product. In particular, for a POVM $\mathsf{A}=\{\mathsf{A}_{j}\}_{j=1}^{n}$ we have

The complement $\mathfrak{S}(\mathsf{A})^{\perp}$ of $\mathfrak{S}(\mathsf{A})$ is related to the informational completeness of $\mathsf{A}$ in the following simple way.

Let $\mathfrak{S}\subseteq\mathcal{L(H)}$ be an operator system and let $\mathcal{P}\subseteq\mathcal{S(H)}$ be a set of states. Then a POVM $\mathsf{A}$ satisfying $\mathfrak{S}(\mathsf{A})=\mathfrak{S}$ is informationally complete w.r.t. $\mathcal{P}$ iff

In other words, for any pair of states $\varrho_{1},\varrho_{2}\in\mathcal{P}$ the following are equivalent:

$\varrho_{1}^{\mathsf{A}}=\varrho_{2}^{\mathsf{A}}$ ,

$\varrho_{1}-\varrho_{2}\in\mathfrak{S}(\mathsf{A})^{\perp}$ .

For all states $\varrho_{1},\varrho_{2}\in\mathcal{P}$ , we have:

Let $\mathcal{P}\subseteq\mathcal{S(H)}$ be a subset of states. A collection $\{S_{1},S_{2},\ldots,S_{n}\}$ of selfadjoint operators is called informationally complete w.r.t. $\mathcal{P}$ if

for all $\varrho_{1},\varrho_{2}\in\mathcal{P}$ .

We are interested in questions of the type: what is the minimal number of

needed to have an informationally complete measurement scheme w.r.t. to a given premise $\mathcal{P}\subseteq\mathcal{S(H)}$ ?

Let $\mathcal{P}\subseteq\mathcal{S(H)}$ . The following are equivalent.

There exists a POVM $\mathsf{A}$ with $n$ outcomes and $\mathsf{A}$ is informationally complete w.r.t. $\mathcal{P}$ .

There exists a set $\{S_{1},S_{2},\ldots,S_{n-1}\}$ of $n-1$ selfadjoint operators and $\{S_{1},S_{2},\ldots,S_{n-1}\}$ is informationally complete w.r.t. $\mathcal{P}$ .

As a consequence of Proposition 3 we only need to calculate the minimal number of either selfadjoint operators or POVM elements.

(Minimal informationally complete measurements) Let $\mathcal{P}\subseteq\mathcal{S(H)}$ be any subset of states. We denote by $\mathfrak{m}\left[\mathcal{P}\right]$

- the minimal number of selfadjoint operators which are informationally complete w.r.t. $\mathcal{P}$ .

- the minimal number of positive operators determining a POVM which is informationally complete w.r.t. $\mathcal{P}$ (i.e., the number of POVM elements subtracted by one).

The remaining part of this work is devoted to deriving upper and lower bounds on $\mathfrak{m}\left[\mathcal{P}\right]$ either for various concrete subsets $\mathcal{P}$ or depending on specific properties of $\mathcal{P}$ – in particular its dimensionality and topology.

Upper bounds

As the first class of examples, we consider the subset $\mathcal{P}_{r}:=\{\varrho\in\mathcal{S(H)}:{\rm rank}(\varrho)\leq r\}$ of all states whose rank is bounded by a number $r<d/2$ where $d=\dim{\mathcal{H}}$ . How many measurement outcomes suffice in order to identify a state taken from $\mathcal{P}_{r}$ ?

If $1\leq r<d/2$ , then there exists a POVM $\mathsf{A}$ which is informationally complete w.r.t. $\mathcal{P}_{r}$ and has $4r(d-r)$ outcomes. Therefore,

We will construct a subspace $\mathcal{B}$ of $d\times d$ matrices with the properties that

The main part of our construction follows . The following fact will be needed. Let $M$ be a totally nonsingular $m\times m$ -matrix with positive entries. (Recall that a matrix is called totally nonsingular if all of its minors are nonzero.) For instance, a Vandermonde matrix of the form

We will now define a set of $d\times d$ matrices that span the sought subspace $\mathcal{B}$ . By the $k$ th diagonal of a matrix $[M_{ij}]$ we mean the elements $M_{ij}$ with $i-j=d-k$ . (In other words, we label the diagonals from the lower left corner upwards. The main diagonal is then the $d$ th diagonal.)

For each $k$ satisfying $2r+1\leq k\leq d-1$ , we build up $k-2r$ matrices as follows. We choose $k-2r$ columns from a totally nonsingular $k\times k$ -matrix and we put them to the $k$ th diagonal and ’s elsewhere, hence obtaining $k-2r$ linearly independent matrices. Any matrix $P$ which is a linear combination of these $k-2r$ matrices has at least $2r+1$ nonzero elements on the $k$ th diagonal. Since all the matrix elements not in the $k$ th diagonal are , we see that the $P$ has rank at least $2r+1$ .

In total, we have built up $d-2r+2\sum_{k=2r+1}^{d-1}(k-2r)=(d-2r)^{2}$ linearly independent matrices. The previously mentioned subspace $\mathcal{B}$ is the linear span of these matrices. The properties (a)-(c) are immediate consequences of the construction. To verify (d), suppose that $B\in\mathcal{B}$ . Let $k_{B}$ be the largest $k$ such that the $k$ th diagonal of $B$ contains nonzero elements. Then the $k_{B}$ th diagonal contains actually $2r+1$ nonzero elements. The $(2r+1)\times(2r+1)$ submatrix having those $2r+1$ nonzero elements in its main diagonal is lower triangular, therefore has nonzero determinant. It follows that the rank of $B$ is at least $2r+1$ . ∎

Let us remark that a recently introduced method based on compressed sensing uses $O(dr\log(d)^{2})$ outcomes to identify a rank- $r$ state with high probability . The number of outcomes given in Theorem 1 therefore beats the compressed sensing approach. The latter, however, might be advantageous regarding the simplicity of the classical post-processing, robustness and verifiability of the assumption.

2. Pure states

We will now have a closer look at the set $\mathcal{P}_{1}$ of pure states. For this case, Theorem 1 implies that $\mathfrak{m}\left[\mathcal{P}_{1}\right]\leq 4d-5$ . We will first provide some simple arguments showing that indeed $\mathfrak{m}\left[\mathcal{P}_{1}\right]=4d-5$ for $d=2,3$ . In general, we will see later, based on topological reasoning, that the leading order $4d$ is the best possible.

We remark that Flammia et al. proved that a POVM which is informationally complete w.r.t. the set of pure states has at least $2d$ outcomes . They also constructed a POVM with $2d$ elements capable of distinguishing almost all (but not all) pairs of pure states. A similar construction was given by Finkelstein .

For a POVM $\mathsf{A}$ , the following conditions are equivalent:

$\mathsf{A}$ is not informationally complete w.r.t. the set of pure states.

From Proposition 4 we conclude the following simple characterization.

It is an immediate consequence of Corollary 1 that in the qubit case (i.e. $\dim\mathcal{H}=2$ ) a POVM $\mathsf{A}$ is informationally complete w.r.t. pure states if and only if $\mathfrak{S}(\mathsf{A})^{\perp}=\{0\}$ . Therefore, in the qubit case informational completeness for pure states implies informational completeness for all states. One can also easily see this by a direct inspection of the Bloch sphere.

contains rank-2 matrices but no rank-2 selfadjoint matrix.

As an application of Corollary 1, we can easily characterize all POVMs in dimension $3$ which are informationally complete w.r.t. the set of pure states.

$\mathfrak{S}(\mathsf{A})^{\perp}=\{0\}$ (i.e. $\mathsf{A}$ is informationally complete w.r.t. all states).

In particular, a minimal POVM which is informationally complete w.r.t. pure states has $8$ outcomes.

We conclude that if $\mathsf{A}$ is informationally complete for the set of all pure states, then either $\dim\mathfrak{S}(\mathsf{A})=8$ or $\dim\mathfrak{S}(\mathsf{A})=9$ . In the latter case $\mathsf{A}$ is informationally complete for all states. If $\dim\mathfrak{S}(\mathsf{A})=8$ , then $\dim\mathfrak{S}(\mathsf{A})^{\perp}=1$ and $\mathfrak{S}(\mathsf{A})^{\perp}$ is thus generated by a single selfadjoint operator $T$ . From Cor. 1 follows that $\mathsf{A}$ is informationally complete w.r.t. all pure states exactly when $T$ is invertible. ∎

The characterization given in Proposition 5 helps to illustrate one possible drawback of POVMs which are merely informationally complete w.r.t. to a subset of states: even if a POVM $\mathsf{A}$ can distinguish any pair of distinct pure states, it may not be capable of distinguishing pure states from mixed states. In other words, we may not be able to verify the premise from the measurement outcome statistics.

We already noted that Theorem 1 leads to the general upper bound $\mathfrak{m}\left[\mathcal{P}_{1}\right]\leq 4d-5$ albeit without yielding an explicit set of operators. Now, we give another proof of this result via a simple direct construction. The construction is inspired by a topological embedding given by James in . Consider two types of matrices $X_{\alpha}$ and $Y_{\beta}$ which we label by integers $\alpha=1,\ldots,2d-2$ and $\beta=1,\ldots,2d-3$ respectively. The $X_{\alpha}$ ’s are taken to be such that $\big{(}X_{\alpha}\big{)}_{kl}=\delta_{k+l,\alpha+1}$ , i.e., there are $1$ ’s along the $\alpha$ ’th anti-diagonal and zeroes elsewhere. The $Y_{\beta}$ ’s are similarly defined with nonzero entries solely along the anti-diagonals, in this case $\big{(}Y_{\beta}\big{)}_{kl}=0$ unless $k+l=\beta+2$ . The entries are chosen such that the matrices are anti-symmetric with entries $i$ below the diagonal. The constructed matrices thus have the following structure:

where we replaced $x_{l}\rightarrow e^{-i\varphi}y_{l}$ for all $l\leq m$ and exploited that $C_{\gamma}$ is upper triangular and that $x_{l}=0$ for all $l<n$ . Together with the hypothesis in Eq.(3) and the assumption that $\big{(}C_{\gamma}\big{)}_{n,m+1}\neq 0$ this implies indeed that $y_{m+1}=e^{i\varphi}x_{m+1}$ .

Note that $C_{\gamma}=(X_{\gamma-1}+iY_{\gamma-2})/2$ for $\gamma=3,\ldots,2d-1$ and $C_{2}=X_{1}/2$ . Hence, if ${\langle}x|S_{j}|x{\rangle}={\langle}y|S_{j}|y{\rangle}$ for every $j$ , then ${\langle}x|C_{\gamma}|x{\rangle}={\langle}y|C_{\gamma}|y{\rangle}$ for every $\gamma=2,\ldots,2d-1$ . The remaining condition ${\langle}x|C_{2d}|x{\rangle}={\langle}y|C_{2d}|y{\rangle}$ holds due to $\left\|x\right\|=\left\|y\right\|$ . Therefore it follows from Eq.(3) that the set $\{S_{j}\}$ is informationally complete w.r.t. the set of all pure states. ∎

It is possible to slightly improve the upper bound $\mathfrak{m}\left[\mathcal{P}_{1}\right]\leq 4d-5$ at the cost of having to deal with a more complicated set of operators introduced in the work of Milgram :

For odd $d$ this upper bound can be worse than the previously derived $4d-5$ , but for most dimensions it is below $4d-5$ . Notice that $\alpha$ satisfies $1\leq\alpha\leq\log_{2}(d)$ . We will see later that this improvement is, in fact, nearly optimal. With this refined upper bound we can calculate the exact value of $\mathfrak{m}\left[\mathcal{P}_{1}\right]$ for small dimensions $d$ in Sec. 4.2.

The construction of the operators is based on the work of Milgram . We will only argue why this corresponds to a proper measurement scheme rather than reproducing the rather cumbersome construction.

The set of ${m}$ selfadjoint matrices $T_{j}$ therefore leads to a measurement scheme which is informationally complete w.r.t. the set of pure states. ∎

3. Generic bounds depending on fractal dimension

For instance, if $\mathcal{P}$ is a smooth manifold of real dimension $d(\mathcal{P})$ , then $D(\mathcal{P})=d(\mathcal{P})$ .

Let $\mathcal{P}\subseteq\mathcal{S(H)}$ be a closed subset of the set of density matrices with Minkowski dimension $D(\mathcal{P})$ . Then almost any (in the Lebesgue measure sense) collection of $m>2D(\mathcal{P})$ selfadjoint operators is informationally complete w.r.t. $\mathcal{P}$ . In particular, $\mathfrak{m}\left[\mathcal{P}\right]\leq 2D(\mathcal{P})+1$ .

In principle, this bound can be refined to $m>\delta_{\mathcal{P}-\mathcal{P}}$ , where $\delta_{\mathcal{P}-\mathcal{P}}$ is the Hausdorff dimension of the set $\mathcal{P}-\mathcal{P}:=\{\varrho_{1}-\varrho_{2}|\varrho_{i}\in\mathcal{P}\}$ . This bound is generally better since $\delta_{\mathcal{P}-\mathcal{P}}\leq D(\mathcal{P}-\mathcal{P})\leq 2D(\mathcal{P})$ , but it may be more difficult to handle. We also mention that for the inverse mappings Hölder continuity can be proven and the respective constants can be bounded .

Let us apply Theorem 4 to the case of pure states. This is a smooth manifold of real dimension $2d-2$ , hence $D(\mathcal{P}_{1})=2d-2$ . Thus, almost any collection of $4d-3$ selfadjoint operators is informationally complete w.r.t. pure states.

Lower bounds

Let $\mathsf{A}$ be a POVM with $m+1$ outcomes. This induces a map

If $\mathcal{P}\subset\mathcal{S(H)}$ is a closed surface without boundary, then $\mathfrak{m}\left[\mathcal{P}\right]\geq 3$ . Moreover, if $\mathcal{P}$ is non-orientable in addition, then $\mathfrak{m}\left[\mathcal{P}\right]\geq 4$ .

The first three matrices in (8) give rise to measurement results which form the Roman surface displayed in Fig.2. The failure of informational completeness (due to disregarding the necessary fourth measurement) is reflected by self-intersections of the surface.

2. Measurements as diffeomorphisms

Manifolds of interest in quantum tomography often have a differentiable structure – they are smooth manifolds. In such a case we may resort to differential topology, which imposes more restrictive conditions on the existence of smooth embeddings. Before we apply these conditions to the concrete cases of pure states and states with general rank constraints we provide some general background.

will play an important role in the following.

To show that $h_{\mathsf{A}}$ is a smooth embedding we need to prove (i) that it is smooth, which follows from linearity, (ii) that it is a topological embedding, which follows from the assumed injectivity and Prop. 6 and (iii) that it has an injective derivative everywhere.

The inclusion $T_{p}(\mathcal{P})\subseteq\Delta(\mathcal{P})$ holds for all $p\in\mathcal{P}$ if $\mathcal{P}$ is the complex Grassmannian manifold $G(r,d-r)$ understood as the submanifold in the space of $d\times d$ selfadjoint matrices which consists of all orthogonal projections of rank $r$ .

is an element of $T_{p}(\mathcal{P})$ and in fact, such derivatives span the entire tangent space

In order to see this we have to show that they span a vector space which has the same dimension as the manifold (for which $D(\mathcal{P})=2r(d-r)$ ). To this end, note that there is a one-to-one relation between commutators and block off-diagonal matrices in the sense that we can always write

In a suitable basis any element $X\in T_{P}(\mathcal{P})$ is such that

since Eq.(11) allows us to work with the singular values $\{c_{i}\}$ of $C$ by transforming $X\mapsto(U\oplus V)X(U\oplus V)^{\dagger}$ with appropriate unitaries $U$ and $V$ . Setting $\lambda:=\max_{i}{c_{i}}$ equal to the operator norm of $X$ we complete the proof if we show that every $2\times 2$ matrix of the form $c\sigma_{x}$ with $c\in$ is a difference of two projections. This can seen to be true by taking the difference of two pure qubit states whose Bloch vectors are parameterized by $(c,\pm\sqrt{1-c^{2}},0)$ . ∎

where $\alpha$ denotes the number of $1$ ’s in the binary expansion of $d-1$ and $D=2d-2$ is the real dimension of the manifold.

If we now combine this non-embedding result with the upper bounds discussed in Subsection 3.2, we obtain a fairly comprehensive picture on the minimal number of measurements required to identify an element of the set of pure states. The bounds for the minimal number for the dimensions $2-10$ are summarized in the table below.

For all dimensions $d$ we can say that the difference between the upper and lower bounds is at most $\log_{2}(d)$ , and that the minimal number differs from the best affine upper bound $4d-5$ by at most $2\log_{2}(d)$ . The lower and upper bounds up to $d=30$ are presented in Fig.4.

Summary

How many measurement outcomes (i.e., POVM elements) are minimally needed in order to identify all quantum states from a given set? We have shown on the one hand that if the set is a manifold, then topological obstructions can increase the number of required measurements by a factor of two over the dimensionality of the manifold. On the other hand we have seen that this factor of two is sufficient even in a more general context where the considered set is not necessarily a manifold and its dimensionality is understood as its Minkowski dimension.

We have considered two different types of examples: 2-manifolds, where (non-)orientability plays an important role, and Grassmannian manifolds, which contain the set of all pure states as a particular instance.

For the latter case we have shown that upper and lower bounds—both obtained via topological embeddings—essentially match. In fact, they never differ by more than $\log_{2}(d)$ . To be more precise, their difference is strictly less than the number of ones appearing in the binary expansion of $d-1$ .

Points which are beyond the present work, albeit obviously of practical relevance, are the inversion algorithm and issues of robustness and certifiability. For instance, we do not know whether in the pure state case $m\sim 4d$ can be achieved in a way such that (i) the inversion is algorithmically efficient and (ii) the validity of the assumption behind the prior information is certifiable.

Acknowledgements

We thank Claudio Carmeli for pointing out an error in the earlier version of Theorem 3.

MMW was supported by the EU-STREP projects COQUIT and QUEVADIS and the Alfried Krupp von Bohlen und Halbach-Stiftung. TH was supported by the Emil Aaltonen Foundation, Alfred Kordelin Foundation and Academy of Finland (grant no. 1381359). TH and MMW are both grateful for the hospitality and the inspiring working environment at the Institut Mittag-Leffler in Stockholm, where this work was partly done.