Message-Passing Estimation from Quantized Samples

Ulugbek Kamilov, Vivek K. Goyal, Sundeep Rangan

I Introduction

Estimation of a signal from quantized samples is a fundamental problem in signal processing. It arises both from the discretization in digital acquisition devices and the quantization performed for lossy compression.

This paper focuses on using a simple message-passing algorithm based on belief propagation (BP). Implementation of BP for estimation of a continuous-valued quantity requires discretization of densities; this is inherently inexact and leads to high computational complexity. To handle quantization effects without any heuristic additive noise model and with low complexity, we use a recently-developed Gaussian-approximated BP algorithm, called generalized approximate message passing (GAMP) or relaxed belief propagation , which extends earlier methods to nonlinear output channels.

Gaussian approximations of loopy BP have previously been shown to be effective in several other applications ; for our application to estimation from quantized samples, the extension to general output channels is essential. Using this extension to nonlinear output channels, we show that GAMP-based estimation offer several key benefits:

General priors: GAMP-based estimation can incorporate a large class of priors on the components of $\mathbf{x}$ , provided that the components are independent. For example, in Section VIII, we will demonstrate the algorithm on recovery of vectors with sparse priors arising in quantized compressed sensing .

Exact characterization with random transforms: In the case of certain large random transforms $\mathbf{A}$ , the componentwise performance of GAMP-based estimation can be precisely predicted by a so-called state evolution (SE) analysis reviewed in Section VI. From the SE analysis, one can precisely evaluate any componentwise performance metric, including for example, mean-squared error (MSE). In contrast, works such as mentioned above have only obtained bounds or scaling laws.

Performance and optimality: Our simulations indicate significantly-improved performance over traditional methods for estimating from quantized samples in a range of scenarios. Moreover, for certain large random sparse transforms, the SE analysis provides testable conditions under which the GAMP reconstruction is provably optimal .

Computational simplicity: The GAMP algorithm is computationally extremely fast. Our simulation and SE analysis indicate good performance with a small number of iterations (10 to 20 in our experience), with the dominant computational cost per iteration simply being multiplication by $\mathbf{A}$ and $\mathbf{A}^{T}$ .

Applications to optimal quantizer design: When quantizer outputs are used as inputs to a nonlinear estimation algorithm, minimizing the MSE between quantizer inputs and outputs is generally not equivalent to minimizing the MSE of the final reconstruction . To optimize the quantizer for the GAMP algorithm, we use the fact that the MSE under large random mixing matrices $\mathbf{A}$ can be predicted accurately from a set of simple SE equations . Then, by modeling the quantizer as a part of the measurement channel, we use the SE formalism to optimize the quantizer to minimize the asymptotic distortion after the reconstruction by GAMP. Note that our use of random $\mathbf{A}$ is for rigor of the SE formalism; the effectiveness of GAMP does not depend on this.

I-B Outline

The remainder of the paper is organized as follows. Section II provides basic background material on quantization, compressed sensing, and belief propagation. Section III introduces the problem of estimating a random vector from quantized linear transform coefficients. It concentrates on geometric insights for both the oversampled and undersampled settings. The main results in this paper apply under a Bayesian formulation introduced in Section IV. Note that this Bayesian formulation does not require sparsity of the signal nor specify undersampling or oversampling. The use of generalized approximate message passing to find optimal estimates under this Bayesian formulation is derived in Section V. Section VI describes the use of SE to predict the performance of GAMP for our problem. Optimization of quantizers using SE is developed in Section VII, and experimental results are presented in Section VIII. Section IX concludes the paper.

I-C Notation

Vectors and matrices will be written in boldface type ( $\mathbf{A}$ , $\mathbf{x}$ , $\mathbf{y}$ , …) to distinguish from scalars written in normal weight ( $m$ , $n$ , …). Random and non-random quantities (or random variables and their realizations) are not distinguished typographically since the use of capital letters for random variables would conflict with the convention of using capital letters for matrices (or in the case of quantization, an operator on a vector rather than a scalar). The probability density function (p.d.f.) of random vector $\mathbf{x}$ is denoted $p_{\mathbf{x}}$ , and the conditional p.d.f. of $\mathbf{y}$ given $\mathbf{x}$ is denoted $p_{\mathbf{y}|\mathbf{x}}$ . When these densities are separable and identical across components, we repeat the previous notations: $p_{\mathbf{x}}$ for the scalar p.d.f. and $p_{\mathbf{y}|\mathbf{x}}$ for the scalar conditional p.d.f. Writing $x\sim\mathcal{N}(a,b)$ indicates that $x$ is a Gaussian random variable with mean $a$ and variance $b$ . The resulting p.d.f. is written as $p_{x}(t)=\phi(t\,;\,a,\,b)$ .

II Background

This section establishes concepts and notations central to the paper. For a comprehensive tutorial history of quantization, we recommend ; for an introduction to compressed sensing, ; and for the basics of belief propagation, .

A quantizer is called regular when each cell is a convex set, i.e., a single interval. Each cell of a regular scalar quantizer thus has a boundary of one point (if the cell is unbounded) or two points (if the cell is bounded). If the input to a quantizer is a continuous random variable, then the probability of the input being a boundary point is zero. Thus it suffices to specify the cells of a $K$ -point regular scalar quantizer by its decision thresholds $\{b_{i}\}_{i=0}^{K}$ , with $b_{0}=-\infty$ and $b_{K}=\infty$ . The encoder satisfies

and the output for boundary points can be safely ignored.

is called a binning function, labeling function, or index assignment. The binning function is not invertible.

The distortion of a quantizer $q$ applied to scalar random variable $x$ is typically measured by the MSE

A quantizer is called optimal at fixed rate $R=\log_{2}K$ when it minimizes distortion $D$ among all $K$ -level quantizers. To optimize scalar quantizers under MSE distortion, it suffices to consider only regular quantizers; a non-regular quantizer will never perform strictly better.

While regular quantizers are optimal for the standard lossy compression problem, non-regular quantizers are sometimes useful when some information aside from $q(x)$ is available when estimating $x$ . Two key examples are Wyner–Ziv coding and multiple description coding . One method for Wyner–Ziv coding is to apply Slepian–Wolf coding across a block of samples after regular scalar quantization ; the Slepian–Wolf coding is binning, but across a block rather than for a single scalar. In multiple description scalar quantization , two binning functions are used that together are invertible but individually are not. In these uses of non-regular quantizers, side information aids in recovering $x$ with resolution commensurate with $K^{\prime}$ while the rate is only commensurate with $K$ , with $K^{\prime}>K$ .

Optimization of a quantizer can rarely be done exactly or analytically. One standard way of optimizing $q$ is via the Lloyd algorithm, which iteratively updates the decision boundaries and output levels by applying necessary conditions for quantizer optimality.

II-B Compressed Sensing

Conventionally, one does not attempt to estimate an $n$ -dimensional signal $\mathbf{x}$ from fewer than $n$ scalar quantities; it would not seem to work from a simple counting of degrees of freedom. Compressed sensing (CS) encapsulates a variety of techniques for estimating $\mathbf{x}$ from $m<n$ scalar linear measurements, possibly including some noise, by exploiting knowledge that $\mathbf{x}$ is sparse or approximately sparse in some given transform domain. Measurements are of the form

In this paper, we simplify notation and expressions by assuming that $\mathbf{x}$ itself is sparse or approximately sparse without requiring the use of a transform domain. Also, since our focus is on estimation in the presence of degradation of measurements caused by quantization, we do not consider further the noiseless measurement model (1).

The most commonly-studied estimator for the measurement model (2) is the lasso estimator

where algorithm parameter $\gamma>0$ trades off data fidelity against sparsity of the solution. This may be interpreted as a Lagrangian form of the estimator

which could be justified heuristically by $\|\mathbf{d}\|_{2}^{2}\leq\epsilon$ .

Most of the CS literature has considered signal recovery with no noise or with $\|\mathbf{d}\|_{2}^{2}\leq\epsilon$ . However, in many practical applications, measurements have to be discretized to a finite number of bits. The effect of such quantization on the performance of CS reconstruction has been studied in . In , high-resolution functional scalar quantization theory was used to design quantizers for lasso estimation. Better yet is to change the reconstruction algorithm: In , the authors demonstrate that when $\mathbf{d}$ represents quantization error,

While convex optimization formulations are prominent in CS, estimation with generic convex program solvers often has excessively high computational cost. Thus, there is significant interest in greedy and iterative methods. The use of belief propagation for CS estimation was first proposed in ; however, as explained in Section II-C, belief propagation has high complexity for the estimation of continuous-valued quantities. Lower-complexity approximations to belief propagation were first proposed for CS estimation in . To handle the effects of quantization precisely, in this paper we use the generalization of the technique of developed by Rangan .

II-C Belief Propagation

where $Z$ is the normalization constant and $z_{i}=(\mathbf{A}\mathbf{x})_{i}$ . In principle, it is possible to estimate each $x_{j}$ by marginalizing this distribution.

Belief propagation replaces the computationally intractable direct marginalization of $p_{\mathbf{x}\mid\mathbf{y}}$ with an iterative algorithm. To apply BP, construct a bipartite factor graph $G=(V,F,E)$ from (3) and pass the following messages along the edges $E$ of the graph:

where $\propto$ means that the distribution is to be normalized so that it has unit integral and integration is over all the elements of $\mathbf{x}$ except $x_{j}$ . We refer to messages $\{\mu_{i\leftarrow j}\}_{(i,j)\in E}$ as variable updates and to messages $\{\mu_{i\rightarrow j}\}_{(i,j)\in E}$ as measurement updates. BP is initialized by setting $\mu_{i\leftarrow j}^{0}(x_{j})=p_{\mathbf{x}}(x_{j})$ .

Earlier works on BP reconstruction have shown that it is asymptotically MSE optimal under certain verifiable conditions. These conditions involve simple single-dimensional recursive equations called state evolution (SE), which predicts that BP is optimal when the corresponding SE admits a unique fixed point . However, direct implementation of BP is impractical due to the dense structure of $\mathbf{A}$ , which implies that the algorithm must compute the marginal of a high-dimensional distribution at each measurement node; i.e., the integration in (4b) is over many variables. Furthermore, integration must be approximated through some discrete quadrature rule.

BP can be simplified through various Gaussian approximations, including the relaxed BP method and approximate message passing (AMP) . Recent theoretical work and extensive numerical experiments have demonstrated that, in the case of certain large random measurement matrices, the error performance of both relaxed BP and AMP can also be accurately predicted by SE.

III Quantized Linear Expansions

This paper focuses on the general quantized measurement abstraction of

Commonly-used linear reconstruction forms estimate

where $\mathbf{A}^{\dagger}=(\mathbf{A}^{T}\mathbf{A})^{-1}\mathbf{A}^{T}$ is the pseudoinverse of $\mathbf{A}$ . Under several reasonable models, linear reconstruction has MSE inversely proportional to $m$ . For example, suppose the frame is uniform and tight and $\mathbf{x}$ is an unknown deterministic quantity. By modeling scalar quantization $y_{i}=q_{i}(z_{i})$ with an additive noise as

one can compute the MSE to be $n\sigma_{d}^{2}/m$ .

Even when the model (7) is accurate , the linear reconstruction (6) may be far from optimal. More sophisticated algorithms have focused on enforcing consistency of an estimate with the quantized samples. A nonlinear estimate may exploit the boundedness of the sets

which we call single-sample consistent sets. Assuming for now that scalar quantizer $q_{i}$ is regular and its cells are bounded, the boundary of $\mathcal{S}_{i}(y_{i})$ is two parallel hyperplanes. The full set of hyperplanes obtained for one index $i$ by varying $y_{i}$ over the output levels of $q_{i}$ is called a hyperplane wave partition , as illustrated for a uniform quantizer in Figure 1(a). The set enclosed by two neighboring hyperplanes in a hyperplane wave partition is called a slab; one slab is shaded in Figure 1(a). Intersecting $\mathcal{S}_{i}(y_{i})$ for $n$ distinct indexes specifies an $n$ -dimensional parallelotope as illustrated in Figure 1(b). Using more than $n$ of these single-sample consistent sets restricts $\mathbf{x}$ to a finer partition, as illustrated in Figure 1(c) for $m=3$ .

is called the consistent set. Since each $\mathcal{S}_{i}(y_{i})$ is convex, one may reach $\mathcal{S}(\mathbf{y})$ asymptotically through a sequence of projections onto $\mathcal{S}_{i}(y_{i})$ using each infinitely often .

Full consistency is not necessary for optimal MSE dependence on $m$ . It was shown in that $O(m^{-2})$ MSE is guaranteed for a simple algorithm that uses each $\mathcal{S}_{i}(y_{i})$ only once, recursively, under mild conditions on randomized selection of $\{\mathbf{a}_{i}\}_{i=1}^{m}$ . These results were strengthened and extended to deterministic frames in .

Quantized overcomplete expansions arise naturally in acquisition subsystems such as ADCs, where $m/n$ represents oversampling factor relative to Nyquist rate. In such systems, high oversampling factor may be motivated by a trade-off between MSE and power consumption or manufacturing cost: within certain bounds, faster sampling is cheaper than a higher number of quantization bits per sample . However, high oversampling does not give a good trade-off between MSE and raw number of bits produced by the acquisition system: combining the proportionality of bit rate $R$ to number of samples $m$ with the best-case $\Theta(m^{-2})$ MSE, we obtain $\Theta(R^{-2})$ MSE; this is poor compared to the exponential decrease of MSE with $R$ obtained with scalar quantization of Nyquist-rate samples.

Ordinarily, the bit-rate inefficiency of the raw output is made irrelevant by recoding, at or near Nyquist rate, soon after acquisition or within the ADC. An alternative explored in this paper is to combat this bit-rate inefficiency through the use of non-regular quantization.

III-B Non-Regular Quantization

The bit-rate inefficiency of the raw output with regular quantization is easily understood with reference to Figure 1(c). After $y_{1}$ and $y_{2}$ are fixed, $\mathbf{x}$ is known to lie in the intersection of the shaded strips. Only four values of $y_{3}$ are possible (i.e., the solid hyperplane wave breaks $\mathcal{S}_{1}(1)\cap\mathcal{S}_{2}(0)$ into four cells), and bits are wasted if this is not exploited in the representation of $y_{3}$ .

Recall the discussion of generating a non-regular quantizer by using a binning function $\lambda$ in Section II-A. Binning does not change the boundaries of the single-sample consistent sets, but it makes these sets unions of slabs that may not even be connected. Thus, while binning reduces the quantization rate, in the absence of side information that specifies which slab contains $\mathbf{x}$ (at least with moderately high probability), it increases distortion significantly. The increase in distortion is due to ambiguity among slabs. Taking $m>n$ quantized samples together may provide adequate information to disambiguate among slabs, thus removing the distortion penalty.

The key concepts in the use of non-regular quantization are illustrated in Figure 2. Suppose one quantized sample $y_{1}$ specifies a single-sample consistent set $\mathcal{S}_{1}(y_{1})$ composed of two slabs, such as the shaded region in Figure 2(a). A second quantized sample $y_{2}$ will not disambiguate between the two slabs. In the example shown in Figure 2(b), $\mathcal{S}_{2}(y_{2})$ is composed of two slabs, and $\mathcal{S}_{1}(y_{1})\cap\mathcal{S}_{2}(y_{2})$ is the union of four connected sets. A third quantized sample $y_{3}$ may now completely disambiguate; the particular example of $\mathcal{S}_{3}(y_{3})$ shown in Figure 2(c) makes $\mathcal{S}=\mathcal{S}_{1}(y_{1})\cap\mathcal{S}_{2}(y_{2})\cap\mathcal{S}_{3}(y_{3})$ a single convex set.

When the quantized samples together completely disambiguate the slabs as in the example, the rate reduction from binning comes with no increase in distortion. The price to pay comes in complexity of estimation.

The use of binned quantization of linear expansions was introduced in , where the only reconstruction method proposed is intractable in high dimensions because it is combinatorial over the binning functions. Specifically, using the notation from Section II-A, let the quantizer forming $y_{i}$ be defined by $(\alpha_{i},\beta_{i},\lambda_{i})$ . Then $\lambda_{i}^{-1}(\beta_{i}^{-1}(y_{i}))$ will be a set of possible values of $\alpha_{i}(z_{i})$ specified by $y_{i}$ . One can try every combination, i.e., element of

to seek a consistent estimate. If the binning is effective, most combinations yield an empty consistent set; if the slabs are disambiguated, exactly one combination yields a non-empty set, which is then the consistent set $\mathcal{S}$ . This technique has complexity exponential in $m$ (assuming non-trivial binning). The recent manuscript provides bounds on reconstruction error for consistent estimation with binned quantization; it does not address algorithms for reconstruction.

This paper provides a tractable and effective method for reconstruction from a quantized linear expansion with non-regular quantizers. To the best of our knowledge, this is the first such method.

III-C Undercomplete Expansions

Only one quantized measurement $y_{1}$ is not adequate to specify $\mathcal{J}$ , as shown in Figure 3(a) by the fact that a single shaded cell intersects all the subspaces.Intersections with two subspaces are shown within the range of the diagram. Two quantized measurements together will usually specify $\mathcal{J}$ , as shown in Figure 3(b) by the fact that only one subspace intersects the specified square cell; for fixed scalar quantizers, ambiguity becomes less likely as $k$ decreases, $n$ increases, $m$ increases, or $\|x\|$ increases. Figure 3(c) shows a case where non-regular (binned) quantization still allows unambiguous determination of $\mathcal{J}$ .

The naïve reconstruction method implied by Figure 3(c) is to search combinatorially over both $\mathcal{J}$ and the combinations in (8); this is extremely complex. While the use of binning for quantized undercomplete expansions of sparse signals has appeared in the literature, first in and later in , to the best of our knowledge this paper is the first to provide a tractable and effective reconstruction method.

IV Estimation from Quantized Samples: Bayesian Formulation

We now specify more explicitly the class of problems for which we derive new estimation algorithms. Generalizing (5), let

Our primary interest is in the case of $\sigma^{2}=0$ , but allowing a nontrivial distribution for $\mathbf{w}$ is not only more general but also makes the derivations more clear.

V Generalized Approximate Message Passing for a Quantizer Output Channel

The acquisition model (9) is suitable for GAMP estimation under the conditions in after one simple observation: the mapping from $\mathbf{z}$ to $\mathbf{y}$ is a separable probabilistic mapping with identical marginals. Specifically, quantized measurement $y_{i}$ indicates $s_{i}\in q_{i}^{-1}(y_{i})$ , so each component output channel can be characterized as

GAMP can be derived by approximating the updates in (4) by two scalar parameters each and introducing some first-order approximations, as discussed in . Then given the estimation functions $F_{\textrm{in}}$ , $\mathcal{E}_{\textrm{in}}$ , $D_{1}$ , and $D_{2}$ described below, for each iteration $t=0,\,1,\,2,\,\dots$ , the GAMP algorithm produces estimates $\widehat{\mathbf{x}}^{t}$ of the true signal $\mathbf{x}$ according to the following rules:

Note that in (10) the notation $\mathbf{A}^{2}$ denotes the element-wise product of a matrix with itself, i.e. $(\mathbf{A}^{2})_{ij}=(\mathbf{A}_{ij})^{2}$ . The estimation functions $F_{\textrm{in}}$ , $\mathcal{E}_{\textrm{in}}$ , $D_{1}$ , and $D_{2}$ described below are applied to their inputs component-by-component.

We refer to messages $\{\hat{x}_{j},\hat{\tau}_{j}\}_{j\in V}$ as variable updates and to messages $\{u_{i},\tau_{i}\}_{i\in F}$ as measurement updates. The algorithm is initialized by setting $\hat{x}^{0}_{j}=\hat{x}_{\textrm{init}}$ , $\hat{\tau}^{0}_{j}=\hat{\tau}_{\textrm{init}}$ , and $u_{i}^{-1}=0$ , where $\hat{x}_{\textrm{init}}$ and $\hat{\tau}_{\textrm{init}}$ are the mean and variance of the prior $p_{\mathbf{x}}$ . The nonlinear functions $F_{\textrm{in}}$ and $\mathcal{E}_{\textrm{in}}$ are the conditional mean and variance

where $q=x+v$ with $x\sim p_{\mathbf{x}}$ and $v\sim\mathcal{N}(0,\nu)$ . Note that these functions can easily be evaluated numerically for any given values of $q$ and $\sigma^{2}$ . Similarly, the functions $D_{1}$ and $D_{2}$ can be computed via

where the functions $F_{\textrm{out}}$ and $\mathcal{E}_{\textrm{out}}$ are the conditional mean and variance

VI State Evolution for GAMP

The equations (10) are easy to implement, however they provide us no insight into the performance of the algorithm. The goal of SE equations is to describe the asymptotic behavior of GAMP under large random measurement matrices $\mathbf{A}$ .

The SE for our setting in Figure 4 is given by the recursion

where $t\geq 0$ is the iteration number, $\beta={{n}/{m}}$ is a fixed number denoting the measurement ratio, and $\sigma^{2}$ is the variance of the additive white Gaussian noise (AWGN), which is also fixed. We initialize the recursion by setting $\bar{\tau}_{0}=\hat{\tau}_{\textrm{init}}$ , where ${\tau}_{\textrm{init}}$ is the variance of $x_{j}$ according to the prior $p_{\mathbf{x}}$ . We define the function $\bar{\mathcal{E}}_{\textrm{in}}$ as

where the expectation is taken over the scalar random variable $q=x+v$ , with $x\sim p_{\mathbf{x}}$ and $v\sim\mathcal{N}(0,\nu)$ . Similarly, the function $\bar{D}_{2}$ is defined as

where $D_{2}$ is given by (11b) and the expectation is taken over $p_{\mathbf{y}\mid\mathbf{z}}$ and $(z,\hat{z})\sim\mathcal{N}(0,P_{z}(\nu))$ , with the covariance matrix

One of the main results of , which is an extension of the analysis in , was to demonstrate the convergence of the error performance of the GAMP algorithm to the SE equations. Specifically, these works consider the case where $\mathbf{A}$ is an i.i.d. Gaussian matrix, $\mathbf{x}$ is i.i.d. with a prior $p_{X}$ and $m,n\rightarrow\infty$ with $n/m\rightarrow\beta$ . Then, under some further technical conditions, it is shown that for any fixed iteration number $t$ , the empirical joint distribution of the components $(x_{j},\widehat{x}^{t}_{j})$ of the unknown vector $\mathbf{x}$ and its estimate $\widehat{\mathbf{x}}^{t}$ converges to a simple scalar equivalent model parameterized by the outputs of the SE equations. From the scalar equivalent model, one can compute any asymptotic componentwise performance metric. It can be shown, in particular, that the asymptotic MSE is given simply by $\bar{\tau}_{t}$ . That is,

Thus, $\bar{\tau}_{t}$ can be used as a metric for the design and analysis of the quantizer, although other non-squared error distortions could also be considered. Details are provided in .

The analysis in and are for large i.i.d. Gaussian matrices. For certain large sparse random matrices, results in and show that the same SE equation holds and, in fact, additionally provide testable conditions under which GAMP is provably optimal. Specifically, it is shown that the SE recursion in (13) always admits at least one fixed point. As $t\rightarrow\infty$ the recursion decreases monotonically to its largest fixed point and, if the SE admits a unique fixed point, then GAMP is asymptotically mean-square optimal.

VII Quantizer Optimization

The SE description of GAMP performance facilitates the desired optimization. By modeling the quantizer as part of the channel and working out the resulting equations for GAMP and SE, we can make use of the convergence result (17) to recast our optimization problem to

where minimization is done over all $K$ -level regular scalar quantizers. Based on (17), the optimization is equivalent to finding the quantizer that minimizes the asymptotic MSE. In the optimization (18), we have considered the limit in the iterations, $t\rightarrow\infty$ . One can also consider the optimization with a finite $t$ , although our simulations exhibit close to the limiting performance with a relatively small number of iterations.

It is important to note that the SE recursion behaves well under quantizer optimization. This is due to the fact that SE is independent of actual output levels and small changes in the quantizer boundaries result in only minor change in the recursion (see (12b)). Although closed-form expressions for the derivatives of $\bar{\tau}_{t}$ for large $t$ ’s are difficult to obtain, we can approximate them by using finite difference methods. Finally, the recursion itself is fast to evaluate, which makes the scheme in (18) practically realizable under standard optimization methods.

VIII Experimental Results

Consider overcomplete expansion of $\mathbf{x}$ as discussed in Section III-A. We generate the signal $\mathbf{x}$ with i.i.d. elements from the standard Gaussian distribution $x_{j}\sim\mathcal{N}(0,1)$ . We form the measurement matrix $\mathbf{A}$ from i.i.d. zero-mean Gaussian random variables. To concentrate on the degradation due to quantization we assume noiseless measurement model (5); i.e., $\sigma^{2}=0$ in (9).

Figure 5 presents squared-error performance of three estimation algorithms while varying the oversampling ratio $m/n$ and holding $n=100$ . To generate the plot we considered estimation from measurements discretized by a $16$ -level regular uniform quantizer. We set the granular region of the quantizer to $[-3\sigma_{z},3\sigma_{z}]$ , where $\sigma_{z}^{2}=n/m$ is the variance of the measurements. For each value of $m/n$ , 200 random realizations of the problem were generated; the curves show the median-squared error performance over these 200 Monte Carlo trials. We compare error performance of GAMP against two other common reconstruction methods: linear MMSE and maximum a posteriori probability (MAP). The MAP estimator was implemented using quadratic programming (QP).

With Figure 6 we turn to a comparison among quantizers, all with GAMP reconstruction, $n=100$ , $m=200$ , and $\mathbf{x}$ and $\mathbf{A}$ distributed as above. To demonstrate the improvement in rate–distortion performance that is possible with non-regular quantizers, we consider simple uniform modulo quantizers

We compare three types of quantizers: those optimized for MSE of the measurements (not the overall reconstruction MSE) using Lloyd’s algorithm , regular uniform quantizers with loading factors optimized for reconstruction MSE using SE analysis, and (non-regular) uniform modulo quantizers with $\Delta$ optimized for reconstruction MSE using SE analysis. The last two quantizers were obtained by solving (18) via the standard SQP method found in MATLAB. The uniform modulo quantizer achieves the best rate–distortion performance, while the performance of the quantizer designed with Lloyd’s algorithm is comparatively poor. The stark non-optimality of the latter is due to the fact that it optimizes the MSE only between quantizer inputs and outputs, ignoring the nonlinear estimation algorithm following the quantizer.

It is important to point out that, without methods such as GAMP, estimation with a modulo quantizer such as (19) is not even computationally possible in works such as , since the consistent set is non-convex and consists of a disjoint union of convex sets. Beyond the performance improvements, we believe that GAMP provides the first computationally-tractable and systematic method for such non-convex quantization reconstruction problems.

VIII-B Compressive Sensing with Quantized Measurements

We next consider estimation of an $n$ -dimensional sparse signal $\mathbf{x}$ from $m<n$ random measurements—a problem considered in quantized compressed sensing . We assume that the signal $\mathbf{x}$ is generated with i.i.d. elements from the Gauss–Bernoulli distribution

where $\rho$ is the sparsity ratio that represents the average fraction of nonzero components of $\mathbf{x}$ . In the following experiments we assume $\rho=1/32$ . Similarly to the overcomplete case, we form the measurement matrix $\mathbf{A}$ from i.i.d. Gaussian random variables and we assume no additive noise ( $\sigma^{2}=0$ in (9)).

Figure 7 compares MSE performance of GAMP with three other standard reconstruction methods. In particular, we consider linear MMSE and the Basis Pursuit DeNoise (BPDN) program

We obtain the curves by varying the ratio $m/n$ and holding $n=1024$ . We perform estimation from measurements obtained from a $16$ -level regular uniform quantizer with granular region of length $2\|\mathbf{A}\mathbf{x}\|_{\infty}$ centered at the origin.

The figure plots the median of the squared error from 1000 Monte Carlo trials for each value of $m/n$ . For basis pursuit methods we optimize the parameter $\epsilon$ for the best squared error performance; in practice this oracle-aided performance would not be achieved. The top curve (worst performance) is for linear MMSE estimation; and middle curves are for the basis pursuit estimators BPDN and BPDQ with moment $p=4$ . As expected, BPDQ achieves a notable $2$ dB reduction in MSE compared to BPDN for high values of $m$ , however GAMP significantly outperforms both methods over the whole range of $m/n$ .

In Figure 8, we compare the performance of GAMP under three quantizers consider before: those optimized for MSE of the measurements using Lloyd’s algorithm, and regular and non-regular quantizers optimized for reconstruction MSE using SE analysis. We assume the same $\mathbf{x}$ and $\mathbf{A}$ distributions as above. We plot MSE of the reconstruction against the rate measured in bits per component of $\mathbf{x}$ . For each rate and for each quantizer, we vary the ratio $m/n$ for the best possible performance. We see that, in comparison to regular quantizers, binned quantizers with GAMP estimation achieve much lower distortions for the same rates. This indicates that binning can be an effective strategy to favorably shift rate–distortion performance of the estimation.

IX Conclusions

We have presented generalized approximate message passing as an effective and efficient algorithm for estimation from quantized linear measurements. The GAMP methodology is general, allowing essentially arbitrary priors and quantization functions. In particular, GAMP is the first tractable and effective method for high-dimensional estimation problems involving non-regular scalar quantization. In addition, the algorithm is computationally extremely simple and, in the case of large random transforms, admits a precise performance characterization using a state evolution analysis.

The problem formulation is Bayesian, with an i.i.d. prior over the components of the signal of interest $\mathbf{x}$ ; the prior may or may not induce sparsity of $\mathbf{x}$ . Also, the number of measurements may be more or less than the dimension of $\mathbf{x}$ , and the quantizers applied to the linear measurements may be regular or not. Experiments show significant performance improvement over traditional reconstruction schemes, some of which have higher computational complexity. Moreover, using extensions of GAMP such as hybrid approximate message passing , one may also in the future be able to consider quantization of more general classes of signals described by general graphical models. MATLAB code for experiments with GAMP is available online .

Despite the improvements demonstrated here, we are not advocating quantized linear expansions as a compression technique—for the oversampled case or the undersampled sparse case; thus, comparisons to rate–distortion bounds would obscure the contribution. For regular quantizers and some fixed oversampling $\beta=m/n>1$ , the MSE decay with increasing rate is $\sim 2^{-2R/\beta}$ , worse than the $\sim 2^{-2R}$ distortion–rate bound. For a discussion of achieving exponential decay of MSE with increasing oversampling, while the quantization step size is held constant, see . For the undersampled sparse case, discusses the difficulty of recovering the support from quantized samples and the consequent difficulty of obtaining near-optimal rate–distortion performance. Performance loss rooted in the use of a random transformation $\mathbf{A}$ is discussed in .