Message-Passing Estimation from Quantized Samples

Ulugbek Kamilov, Vivek K. Goyal, Sundeep Rangan

I Introduction

Estimation of a signal from quantized samples is a fundamental problem in signal processing. It arises both from the discretization in digital acquisition devices and the quantization performed for lossy compression.

This paper focuses on using a simple message-passing algorithm based on belief propagation (BP). Implementation of BP for estimation of a continuous-valued quantity requires discretization of densities; this is inherently inexact and leads to high computational complexity. To handle quantization effects without any heuristic additive noise model and with low complexity, we use a recently-developed Gaussian-approximated BP algorithm, called generalized approximate message passing (GAMP) or relaxed belief propagation , which extends earlier methods to nonlinear output channels.

Gaussian approximations of loopy BP have previously been shown to be effective in several other applications ; for our application to estimation from quantized samples, the extension to general output channels is essential. Using this extension to nonlinear output channels, we show that GAMP-based estimation offer several key benefits:

General priors: GAMP-based estimation can incorporate a large class of priors on the components of x\mathbf{x}, provided that the components are independent. For example, in Section VIII, we will demonstrate the algorithm on recovery of vectors with sparse priors arising in quantized compressed sensing .

Exact characterization with random transforms: In the case of certain large random transforms A\mathbf{A}, the componentwise performance of GAMP-based estimation can be precisely predicted by a so-called state evolution (SE) analysis reviewed in Section VI. From the SE analysis, one can precisely evaluate any componentwise performance metric, including for example, mean-squared error (MSE). In contrast, works such as mentioned above have only obtained bounds or scaling laws.

Performance and optimality: Our simulations indicate significantly-improved performance over traditional methods for estimating from quantized samples in a range of scenarios. Moreover, for certain large random sparse transforms, the SE analysis provides testable conditions under which the GAMP reconstruction is provably optimal .

Computational simplicity: The GAMP algorithm is computationally extremely fast. Our simulation and SE analysis indicate good performance with a small number of iterations (10 to 20 in our experience), with the dominant computational cost per iteration simply being multiplication by A\mathbf{A} and AT\mathbf{A}^{T}.

Applications to optimal quantizer design: When quantizer outputs are used as inputs to a nonlinear estimation algorithm, minimizing the MSE between quantizer inputs and outputs is generally not equivalent to minimizing the MSE of the final reconstruction . To optimize the quantizer for the GAMP algorithm, we use the fact that the MSE under large random mixing matrices A\mathbf{A} can be predicted accurately from a set of simple SE equations . Then, by modeling the quantizer as a part of the measurement channel, we use the SE formalism to optimize the quantizer to minimize the asymptotic distortion after the reconstruction by GAMP. Note that our use of random A\mathbf{A} is for rigor of the SE formalism; the effectiveness of GAMP does not depend on this.

I-B Outline

The remainder of the paper is organized as follows. Section II provides basic background material on quantization, compressed sensing, and belief propagation. Section III introduces the problem of estimating a random vector from quantized linear transform coefficients. It concentrates on geometric insights for both the oversampled and undersampled settings. The main results in this paper apply under a Bayesian formulation introduced in Section IV. Note that this Bayesian formulation does not require sparsity of the signal nor specify undersampling or oversampling. The use of generalized approximate message passing to find optimal estimates under this Bayesian formulation is derived in Section V. Section VI describes the use of SE to predict the performance of GAMP for our problem. Optimization of quantizers using SE is developed in Section VII, and experimental results are presented in Section VIII. Section IX concludes the paper.

I-C Notation

Vectors and matrices will be written in boldface type (A\mathbf{A}, x\mathbf{x}, y\mathbf{y}, …) to distinguish from scalars written in normal weight (mm, nn, …). Random and non-random quantities (or random variables and their realizations) are not distinguished typographically since the use of capital letters for random variables would conflict with the convention of using capital letters for matrices (or in the case of quantization, an operator on a vector rather than a scalar). The probability density function (p.d.f.) of random vector x\mathbf{x} is denoted pxp_{\mathbf{x}}, and the conditional p.d.f. of y\mathbf{y} given x\mathbf{x} is denoted pyxp_{\mathbf{y}|\mathbf{x}}. When these densities are separable and identical across components, we repeat the previous notations: pxp_{\mathbf{x}} for the scalar p.d.f. and pyxp_{\mathbf{y}|\mathbf{x}} for the scalar conditional p.d.f. Writing xN(a,b)x\sim\mathcal{N}(a,b) indicates that xx is a Gaussian random variable with mean aa and variance bb. The resulting p.d.f. is written as px(t)=ϕ(t;a,b)p_{x}(t)=\phi(t\,;\,a,\,b).

II Background

This section establishes concepts and notations central to the paper. For a comprehensive tutorial history of quantization, we recommend ; for an introduction to compressed sensing, ; and for the basics of belief propagation, .

A quantizer is called regular when each cell is a convex set, i.e., a single interval. Each cell of a regular scalar quantizer thus has a boundary of one point (if the cell is unbounded) or two points (if the cell is bounded). If the input to a quantizer is a continuous random variable, then the probability of the input being a boundary point is zero. Thus it suffices to specify the cells of a KK-point regular scalar quantizer by its decision thresholds {bi}i=0K\{b_{i}\}_{i=0}^{K}, with b0=b_{0}=-\infty and bK=b_{K}=\infty. The encoder satisfies

and the output for boundary points can be safely ignored.

is called a binning function, labeling function, or index assignment. The binning function is not invertible.

The distortion of a quantizer qq applied to scalar random variable xx is typically measured by the MSE

A quantizer is called optimal at fixed rate R=log2KR=\log_{2}K when it minimizes distortion DD among all KK-level quantizers. To optimize scalar quantizers under MSE distortion, it suffices to consider only regular quantizers; a non-regular quantizer will never perform strictly better.

While regular quantizers are optimal for the standard lossy compression problem, non-regular quantizers are sometimes useful when some information aside from q(x)q(x) is available when estimating xx. Two key examples are Wyner–Ziv coding and multiple description coding . One method for Wyner–Ziv coding is to apply Slepian–Wolf coding across a block of samples after regular scalar quantization ; the Slepian–Wolf coding is binning, but across a block rather than for a single scalar. In multiple description scalar quantization , two binning functions are used that together are invertible but individually are not. In these uses of non-regular quantizers, side information aids in recovering xx with resolution commensurate with KK^{\prime} while the rate is only commensurate with KK, with K>KK^{\prime}>K.

Optimization of a quantizer can rarely be done exactly or analytically. One standard way of optimizing qq is via the Lloyd algorithm, which iteratively updates the decision boundaries and output levels by applying necessary conditions for quantizer optimality.

II-B Compressed Sensing

Conventionally, one does not attempt to estimate an nn-dimensional signal x\mathbf{x} from fewer than nn scalar quantities; it would not seem to work from a simple counting of degrees of freedom. Compressed sensing (CS) encapsulates a variety of techniques for estimating x\mathbf{x} from m<nm<n scalar linear measurements, possibly including some noise, by exploiting knowledge that x\mathbf{x} is sparse or approximately sparse in some given transform domain. Measurements are of the form

In this paper, we simplify notation and expressions by assuming that x\mathbf{x} itself is sparse or approximately sparse without requiring the use of a transform domain. Also, since our focus is on estimation in the presence of degradation of measurements caused by quantization, we do not consider further the noiseless measurement model (1).

The most commonly-studied estimator for the measurement model (2) is the lasso estimator

where algorithm parameter γ>0\gamma>0 trades off data fidelity against sparsity of the solution. This may be interpreted as a Lagrangian form of the estimator

which could be justified heuristically by d22ϵ\|\mathbf{d}\|_{2}^{2}\leq\epsilon.

Most of the CS literature has considered signal recovery with no noise or with d22ϵ\|\mathbf{d}\|_{2}^{2}\leq\epsilon. However, in many practical applications, measurements have to be discretized to a finite number of bits. The effect of such quantization on the performance of CS reconstruction has been studied in . In , high-resolution functional scalar quantization theory was used to design quantizers for lasso estimation. Better yet is to change the reconstruction algorithm: In , the authors demonstrate that when d\mathbf{d} represents quantization error,

While convex optimization formulations are prominent in CS, estimation with generic convex program solvers often has excessively high computational cost. Thus, there is significant interest in greedy and iterative methods. The use of belief propagation for CS estimation was first proposed in ; however, as explained in Section II-C, belief propagation has high complexity for the estimation of continuous-valued quantities. Lower-complexity approximations to belief propagation were first proposed for CS estimation in . To handle the effects of quantization precisely, in this paper we use the generalization of the technique of developed by Rangan .

II-C Belief Propagation

where ZZ is the normalization constant and zi=(Ax)iz_{i}=(\mathbf{A}\mathbf{x})_{i}. In principle, it is possible to estimate each xjx_{j} by marginalizing this distribution.

Belief propagation replaces the computationally intractable direct marginalization of pxyp_{\mathbf{x}\mid\mathbf{y}} with an iterative algorithm. To apply BP, construct a bipartite factor graph G=(V,F,E)G=(V,F,E) from (3) and pass the following messages along the edges EE of the graph:

where \propto means that the distribution is to be normalized so that it has unit integral and integration is over all the elements of x\mathbf{x} except xjx_{j}. We refer to messages {μij}(i,j)E\{\mu_{i\leftarrow j}\}_{(i,j)\in E} as variable updates and to messages {μij}(i,j)E\{\mu_{i\rightarrow j}\}_{(i,j)\in E} as measurement updates. BP is initialized by setting μij0(xj)=px(xj)\mu_{i\leftarrow j}^{0}(x_{j})=p_{\mathbf{x}}(x_{j}).

Earlier works on BP reconstruction have shown that it is asymptotically MSE optimal under certain verifiable conditions. These conditions involve simple single-dimensional recursive equations called state evolution (SE), which predicts that BP is optimal when the corresponding SE admits a unique fixed point . However, direct implementation of BP is impractical due to the dense structure of A\mathbf{A}, which implies that the algorithm must compute the marginal of a high-dimensional distribution at each measurement node; i.e., the integration in (4b) is over many variables. Furthermore, integration must be approximated through some discrete quadrature rule.

BP can be simplified through various Gaussian approximations, including the relaxed BP method and approximate message passing (AMP) . Recent theoretical work and extensive numerical experiments have demonstrated that, in the case of certain large random measurement matrices, the error performance of both relaxed BP and AMP can also be accurately predicted by SE.

III Quantized Linear Expansions

This paper focuses on the general quantized measurement abstraction of

Commonly-used linear reconstruction forms estimate

where A=(ATA)1AT\mathbf{A}^{\dagger}=(\mathbf{A}^{T}\mathbf{A})^{-1}\mathbf{A}^{T} is the pseudoinverse of A\mathbf{A}. Under several reasonable models, linear reconstruction has MSE inversely proportional to mm. For example, suppose the frame is uniform and tight and x\mathbf{x} is an unknown deterministic quantity. By modeling scalar quantization yi=qi(zi)y_{i}=q_{i}(z_{i}) with an additive noise as

one can compute the MSE to be nσd2/mn\sigma_{d}^{2}/m .

Even when the model (7) is accurate , the linear reconstruction (6) may be far from optimal. More sophisticated algorithms have focused on enforcing consistency of an estimate with the quantized samples. A nonlinear estimate may exploit the boundedness of the sets

which we call single-sample consistent sets. Assuming for now that scalar quantizer qiq_{i} is regular and its cells are bounded, the boundary of Si(yi)\mathcal{S}_{i}(y_{i}) is two parallel hyperplanes. The full set of hyperplanes obtained for one index ii by varying yiy_{i} over the output levels of qiq_{i} is called a hyperplane wave partition , as illustrated for a uniform quantizer in Figure 1(a). The set enclosed by two neighboring hyperplanes in a hyperplane wave partition is called a slab; one slab is shaded in Figure 1(a). Intersecting Si(yi)\mathcal{S}_{i}(y_{i}) for nn distinct indexes specifies an nn-dimensional parallelotope as illustrated in Figure 1(b). Using more than nn of these single-sample consistent sets restricts x\mathbf{x} to a finer partition, as illustrated in Figure 1(c) for m=3m=3.

is called the consistent set. Since each Si(yi)\mathcal{S}_{i}(y_{i}) is convex, one may reach S(y)\mathcal{S}(\mathbf{y}) asymptotically through a sequence of projections onto Si(yi)\mathcal{S}_{i}(y_{i}) using each infinitely often .

Full consistency is not necessary for optimal MSE dependence on mm. It was shown in that O(m2)O(m^{-2}) MSE is guaranteed for a simple algorithm that uses each Si(yi)\mathcal{S}_{i}(y_{i}) only once, recursively, under mild conditions on randomized selection of {ai}i=1m\{\mathbf{a}_{i}\}_{i=1}^{m}. These results were strengthened and extended to deterministic frames in .

Quantized overcomplete expansions arise naturally in acquisition subsystems such as ADCs, where m/nm/n represents oversampling factor relative to Nyquist rate. In such systems, high oversampling factor may be motivated by a trade-off between MSE and power consumption or manufacturing cost: within certain bounds, faster sampling is cheaper than a higher number of quantization bits per sample . However, high oversampling does not give a good trade-off between MSE and raw number of bits produced by the acquisition system: combining the proportionality of bit rate RR to number of samples mm with the best-case Θ(m2)\Theta(m^{-2}) MSE, we obtain Θ(R2)\Theta(R^{-2}) MSE; this is poor compared to the exponential decrease of MSE with RR obtained with scalar quantization of Nyquist-rate samples.

Ordinarily, the bit-rate inefficiency of the raw output is made irrelevant by recoding, at or near Nyquist rate, soon after acquisition or within the ADC. An alternative explored in this paper is to combat this bit-rate inefficiency through the use of non-regular quantization.

III-B Non-Regular Quantization

The bit-rate inefficiency of the raw output with regular quantization is easily understood with reference to Figure 1(c). After y1y_{1} and y2y_{2} are fixed, x\mathbf{x} is known to lie in the intersection of the shaded strips. Only four values of y3y_{3} are possible (i.e., the solid hyperplane wave breaks S1(1)S2(0)\mathcal{S}_{1}(1)\cap\mathcal{S}_{2}(0) into four cells), and bits are wasted if this is not exploited in the representation of y3y_{3}.

Recall the discussion of generating a non-regular quantizer by using a binning function λ\lambda in Section II-A. Binning does not change the boundaries of the single-sample consistent sets, but it makes these sets unions of slabs that may not even be connected. Thus, while binning reduces the quantization rate, in the absence of side information that specifies which slab contains x\mathbf{x} (at least with moderately high probability), it increases distortion significantly. The increase in distortion is due to ambiguity among slabs. Taking m>nm>n quantized samples together may provide adequate information to disambiguate among slabs, thus removing the distortion penalty.

The key concepts in the use of non-regular quantization are illustrated in Figure 2. Suppose one quantized sample y1y_{1} specifies a single-sample consistent set S1(y1)\mathcal{S}_{1}(y_{1}) composed of two slabs, such as the shaded region in Figure 2(a). A second quantized sample y2y_{2} will not disambiguate between the two slabs. In the example shown in Figure 2(b), S2(y2)\mathcal{S}_{2}(y_{2}) is composed of two slabs, and S1(y1)S2(y2)\mathcal{S}_{1}(y_{1})\cap\mathcal{S}_{2}(y_{2}) is the union of four connected sets. A third quantized sample y3y_{3} may now completely disambiguate; the particular example of S3(y3)\mathcal{S}_{3}(y_{3}) shown in Figure 2(c) makes S=S1(y1)S2(y2)S3(y3)\mathcal{S}=\mathcal{S}_{1}(y_{1})\cap\mathcal{S}_{2}(y_{2})\cap\mathcal{S}_{3}(y_{3}) a single convex set.

When the quantized samples together completely disambiguate the slabs as in the example, the rate reduction from binning comes with no increase in distortion. The price to pay comes in complexity of estimation.

The use of binned quantization of linear expansions was introduced in , where the only reconstruction method proposed is intractable in high dimensions because it is combinatorial over the binning functions. Specifically, using the notation from Section II-A, let the quantizer forming yiy_{i} be defined by (αi,βi,λi)(\alpha_{i},\beta_{i},\lambda_{i}). Then λi1(βi1(yi))\lambda_{i}^{-1}(\beta_{i}^{-1}(y_{i})) will be a set of possible values of αi(zi)\alpha_{i}(z_{i}) specified by yiy_{i}. One can try every combination, i.e., element of

to seek a consistent estimate. If the binning is effective, most combinations yield an empty consistent set; if the slabs are disambiguated, exactly one combination yields a non-empty set, which is then the consistent set S\mathcal{S}. This technique has complexity exponential in mm (assuming non-trivial binning). The recent manuscript provides bounds on reconstruction error for consistent estimation with binned quantization; it does not address algorithms for reconstruction.

This paper provides a tractable and effective method for reconstruction from a quantized linear expansion with non-regular quantizers. To the best of our knowledge, this is the first such method.

III-C Undercomplete Expansions

Only one quantized measurement y1y_{1} is not adequate to specify J\mathcal{J}, as shown in Figure 3(a) by the fact that a single shaded cell intersects all the subspaces.Intersections with two subspaces are shown within the range of the diagram. Two quantized measurements together will usually specify J\mathcal{J}, as shown in Figure 3(b) by the fact that only one subspace intersects the specified square cell; for fixed scalar quantizers, ambiguity becomes less likely as kk decreases, nn increases, mm increases, or x\|x\| increases. Figure 3(c) shows a case where non-regular (binned) quantization still allows unambiguous determination of J\mathcal{J}.

The naïve reconstruction method implied by Figure 3(c) is to search combinatorially over both J\mathcal{J} and the combinations in (8); this is extremely complex. While the use of binning for quantized undercomplete expansions of sparse signals has appeared in the literature, first in and later in , to the best of our knowledge this paper is the first to provide a tractable and effective reconstruction method.

IV Estimation from Quantized Samples: Bayesian Formulation

We now specify more explicitly the class of problems for which we derive new estimation algorithms. Generalizing (5), let

Our primary interest is in the case of σ2=0\sigma^{2}=0, but allowing a nontrivial distribution for w\mathbf{w} is not only more general but also makes the derivations more clear.

V Generalized Approximate Message Passing for a Quantizer Output Channel

The acquisition model (9) is suitable for GAMP estimation under the conditions in after one simple observation: the mapping from z\mathbf{z} to y\mathbf{y} is a separable probabilistic mapping with identical marginals. Specifically, quantized measurement yiy_{i} indicates siqi1(yi)s_{i}\in q_{i}^{-1}(y_{i}), so each component output channel can be characterized as

GAMP can be derived by approximating the updates in (4) by two scalar parameters each and introducing some first-order approximations, as discussed in . Then given the estimation functions FinF_{\textrm{in}}, Ein\mathcal{E}_{\textrm{in}}, D1D_{1}, and D2D_{2} described below, for each iteration t=0,1,2,t=0,\,1,\,2,\,\dots, the GAMP algorithm produces estimates x^t\widehat{\mathbf{x}}^{t} of the true signal x\mathbf{x} according to the following rules:

Note that in (10) the notation A2\mathbf{A}^{2} denotes the element-wise product of a matrix with itself, i.e. (A2)ij=(Aij)2(\mathbf{A}^{2})_{ij}=(\mathbf{A}_{ij})^{2}. The estimation functions FinF_{\textrm{in}}, Ein\mathcal{E}_{\textrm{in}}, D1D_{1}, and D2D_{2} described below are applied to their inputs component-by-component.

We refer to messages {x^j,τ^j}jV\{\hat{x}_{j},\hat{\tau}_{j}\}_{j\in V} as variable updates and to messages {ui,τi}iF\{u_{i},\tau_{i}\}_{i\in F} as measurement updates. The algorithm is initialized by setting x^j0=x^init\hat{x}^{0}_{j}=\hat{x}_{\textrm{init}}, τ^j0=τ^init\hat{\tau}^{0}_{j}=\hat{\tau}_{\textrm{init}}, and ui1=0u_{i}^{-1}=0, where x^init\hat{x}_{\textrm{init}} and τ^init\hat{\tau}_{\textrm{init}} are the mean and variance of the prior pxp_{\mathbf{x}}. The nonlinear functions FinF_{\textrm{in}} and Ein\mathcal{E}_{\textrm{in}} are the conditional mean and variance

where q=x+vq=x+v with xpxx\sim p_{\mathbf{x}} and vN(0,ν)v\sim\mathcal{N}(0,\nu). Note that these functions can easily be evaluated numerically for any given values of qq and σ2\sigma^{2}. Similarly, the functions D1D_{1} and D2D_{2} can be computed via

where the functions FoutF_{\textrm{out}} and Eout\mathcal{E}_{\textrm{out}} are the conditional mean and variance

VI State Evolution for GAMP

The equations (10) are easy to implement, however they provide us no insight into the performance of the algorithm. The goal of SE equations is to describe the asymptotic behavior of GAMP under large random measurement matrices A\mathbf{A}.

The SE for our setting in Figure 4 is given by the recursion

where t0t\geq 0 is the iteration number, β=n/m\beta={{n}/{m}} is a fixed number denoting the measurement ratio, and σ2\sigma^{2} is the variance of the additive white Gaussian noise (AWGN), which is also fixed. We initialize the recursion by setting τˉ0=τ^init\bar{\tau}_{0}=\hat{\tau}_{\textrm{init}}, where τinit{\tau}_{\textrm{init}} is the variance of xjx_{j} according to the prior pxp_{\mathbf{x}}. We define the function Eˉin\bar{\mathcal{E}}_{\textrm{in}} as

where the expectation is taken over the scalar random variable q=x+vq=x+v, with xpxx\sim p_{\mathbf{x}} and vN(0,ν)v\sim\mathcal{N}(0,\nu). Similarly, the function Dˉ2\bar{D}_{2} is defined as

where D2D_{2} is given by (11b) and the expectation is taken over pyzp_{\mathbf{y}\mid\mathbf{z}} and (z,z^)N(0,Pz(ν))(z,\hat{z})\sim\mathcal{N}(0,P_{z}(\nu)), with the covariance matrix

One of the main results of , which is an extension of the analysis in , was to demonstrate the convergence of the error performance of the GAMP algorithm to the SE equations. Specifically, these works consider the case where A\mathbf{A} is an i.i.d. Gaussian matrix, x\mathbf{x} is i.i.d. with a prior pXp_{X} and m,nm,n\rightarrow\infty with n/mβn/m\rightarrow\beta. Then, under some further technical conditions, it is shown that for any fixed iteration number tt, the empirical joint distribution of the components (xj,x^jt)(x_{j},\widehat{x}^{t}_{j}) of the unknown vector x\mathbf{x} and its estimate x^t\widehat{\mathbf{x}}^{t} converges to a simple scalar equivalent model parameterized by the outputs of the SE equations. From the scalar equivalent model, one can compute any asymptotic componentwise performance metric. It can be shown, in particular, that the asymptotic MSE is given simply by τˉt\bar{\tau}_{t}. That is,

Thus, τˉt\bar{\tau}_{t} can be used as a metric for the design and analysis of the quantizer, although other non-squared error distortions could also be considered. Details are provided in .

The analysis in and are for large i.i.d. Gaussian matrices. For certain large sparse random matrices, results in and show that the same SE equation holds and, in fact, additionally provide testable conditions under which GAMP is provably optimal. Specifically, it is shown that the SE recursion in (13) always admits at least one fixed point. As tt\rightarrow\infty the recursion decreases monotonically to its largest fixed point and, if the SE admits a unique fixed point, then GAMP is asymptotically mean-square optimal.

VII Quantizer Optimization

The SE description of GAMP performance facilitates the desired optimization. By modeling the quantizer as part of the channel and working out the resulting equations for GAMP and SE, we can make use of the convergence result (17) to recast our optimization problem to

where minimization is done over all KK-level regular scalar quantizers. Based on (17), the optimization is equivalent to finding the quantizer that minimizes the asymptotic MSE. In the optimization (18), we have considered the limit in the iterations, tt\rightarrow\infty. One can also consider the optimization with a finite tt, although our simulations exhibit close to the limiting performance with a relatively small number of iterations.

It is important to note that the SE recursion behaves well under quantizer optimization. This is due to the fact that SE is independent of actual output levels and small changes in the quantizer boundaries result in only minor change in the recursion (see (12b)). Although closed-form expressions for the derivatives of τˉt\bar{\tau}_{t} for large tt’s are difficult to obtain, we can approximate them by using finite difference methods. Finally, the recursion itself is fast to evaluate, which makes the scheme in (18) practically realizable under standard optimization methods.

VIII Experimental Results

Consider overcomplete expansion of x\mathbf{x} as discussed in Section III-A. We generate the signal x\mathbf{x} with i.i.d. elements from the standard Gaussian distribution xjN(0,1)x_{j}\sim\mathcal{N}(0,1). We form the measurement matrix A\mathbf{A} from i.i.d. zero-mean Gaussian random variables. To concentrate on the degradation due to quantization we assume noiseless measurement model (5); i.e., σ2=0\sigma^{2}=0 in (9).

Figure 5 presents squared-error performance of three estimation algorithms while varying the oversampling ratio m/nm/n and holding n=100n=100. To generate the plot we considered estimation from measurements discretized by a 1616-level regular uniform quantizer. We set the granular region of the quantizer to [3σz,3σz][-3\sigma_{z},3\sigma_{z}], where σz2=n/m\sigma_{z}^{2}=n/m is the variance of the measurements. For each value of m/nm/n, 200 random realizations of the problem were generated; the curves show the median-squared error performance over these 200 Monte Carlo trials. We compare error performance of GAMP against two other common reconstruction methods: linear MMSE and maximum a posteriori probability (MAP). The MAP estimator was implemented using quadratic programming (QP).

With Figure 6 we turn to a comparison among quantizers, all with GAMP reconstruction, n=100n=100, m=200m=200, and x\mathbf{x} and A\mathbf{A} distributed as above. To demonstrate the improvement in rate–distortion performance that is possible with non-regular quantizers, we consider simple uniform modulo quantizers

We compare three types of quantizers: those optimized for MSE of the measurements (not the overall reconstruction MSE) using Lloyd’s algorithm , regular uniform quantizers with loading factors optimized for reconstruction MSE using SE analysis, and (non-regular) uniform modulo quantizers with Δ\Delta optimized for reconstruction MSE using SE analysis. The last two quantizers were obtained by solving (18) via the standard SQP method found in MATLAB. The uniform modulo quantizer achieves the best rate–distortion performance, while the performance of the quantizer designed with Lloyd’s algorithm is comparatively poor. The stark non-optimality of the latter is due to the fact that it optimizes the MSE only between quantizer inputs and outputs, ignoring the nonlinear estimation algorithm following the quantizer.

It is important to point out that, without methods such as GAMP, estimation with a modulo quantizer such as (19) is not even computationally possible in works such as , since the consistent set is non-convex and consists of a disjoint union of convex sets. Beyond the performance improvements, we believe that GAMP provides the first computationally-tractable and systematic method for such non-convex quantization reconstruction problems.

VIII-B Compressive Sensing with Quantized Measurements

We next consider estimation of an nn-dimensional sparse signal x\mathbf{x} from m<nm<n random measurements—a problem considered in quantized compressed sensing . We assume that the signal x\mathbf{x} is generated with i.i.d. elements from the Gauss–Bernoulli distribution

where ρ\rho is the sparsity ratio that represents the average fraction of nonzero components of x\mathbf{x}. In the following experiments we assume ρ=1/32\rho=1/32. Similarly to the overcomplete case, we form the measurement matrix A\mathbf{A} from i.i.d. Gaussian random variables and we assume no additive noise (σ2=0\sigma^{2}=0 in (9)).

Figure 7 compares MSE performance of GAMP with three other standard reconstruction methods. In particular, we consider linear MMSE and the Basis Pursuit DeNoise (BPDN) program

We obtain the curves by varying the ratio m/nm/n and holding n=1024n=1024. We perform estimation from measurements obtained from a 1616-level regular uniform quantizer with granular region of length 2Ax2\|\mathbf{A}\mathbf{x}\|_{\infty} centered at the origin.

The figure plots the median of the squared error from 1000 Monte Carlo trials for each value of m/nm/n. For basis pursuit methods we optimize the parameter ϵ\epsilon for the best squared error performance; in practice this oracle-aided performance would not be achieved. The top curve (worst performance) is for linear MMSE estimation; and middle curves are for the basis pursuit estimators BPDN and BPDQ with moment p=4p=4. As expected, BPDQ achieves a notable 22 dB reduction in MSE compared to BPDN for high values of mm, however GAMP significantly outperforms both methods over the whole range of m/nm/n.

In Figure 8, we compare the performance of GAMP under three quantizers consider before: those optimized for MSE of the measurements using Lloyd’s algorithm, and regular and non-regular quantizers optimized for reconstruction MSE using SE analysis. We assume the same x\mathbf{x} and A\mathbf{A} distributions as above. We plot MSE of the reconstruction against the rate measured in bits per component of x\mathbf{x}. For each rate and for each quantizer, we vary the ratio m/nm/n for the best possible performance. We see that, in comparison to regular quantizers, binned quantizers with GAMP estimation achieve much lower distortions for the same rates. This indicates that binning can be an effective strategy to favorably shift rate–distortion performance of the estimation.

IX Conclusions

We have presented generalized approximate message passing as an effective and efficient algorithm for estimation from quantized linear measurements. The GAMP methodology is general, allowing essentially arbitrary priors and quantization functions. In particular, GAMP is the first tractable and effective method for high-dimensional estimation problems involving non-regular scalar quantization. In addition, the algorithm is computationally extremely simple and, in the case of large random transforms, admits a precise performance characterization using a state evolution analysis.

The problem formulation is Bayesian, with an i.i.d. prior over the components of the signal of interest x\mathbf{x}; the prior may or may not induce sparsity of x\mathbf{x}. Also, the number of measurements may be more or less than the dimension of x\mathbf{x}, and the quantizers applied to the linear measurements may be regular or not. Experiments show significant performance improvement over traditional reconstruction schemes, some of which have higher computational complexity. Moreover, using extensions of GAMP such as hybrid approximate message passing , one may also in the future be able to consider quantization of more general classes of signals described by general graphical models. MATLAB code for experiments with GAMP is available online .

Despite the improvements demonstrated here, we are not advocating quantized linear expansions as a compression technique—for the oversampled case or the undersampled sparse case; thus, comparisons to rate–distortion bounds would obscure the contribution. For regular quantizers and some fixed oversampling β=m/n>1\beta=m/n>1, the MSE decay with increasing rate is 22R/β\sim 2^{-2R/\beta}, worse than the 22R\sim 2^{-2R} distortion–rate bound. For a discussion of achieving exponential decay of MSE with increasing oversampling, while the quantization step size is held constant, see . For the undersampled sparse case, discusses the difficulty of recovering the support from quantized samples and the consequent difficulty of obtaining near-optimal rate–distortion performance. Performance loss rooted in the use of a random transformation A\mathbf{A} is discussed in .

References