Off-grid Direction of Arrival Estimation Using Sparse Bayesian Inference
Zai Yang, Lihua Xie, Cishen Zhang
Introduction
Recent advancements in array signal processing include compressive (CS-) MUSIC and subspace-augmented (SA-) MUSIC . They are combinations of the conventional MUSIC technique and recent CS methods with guaranteed support recovery performance and can outperform MUSIC and standard CS approaches. Though existing CS-based approaches have shown their improvements in DOA estimation, e.g., their success in the case of limited snapshots, there are still difficulties in practical situations where the true DOAs are not on the sampling grid. On one hand, a dense sampling grid is necessary for accurate DOA estimation to reduce the gap between the true DOA and its nearest grid point since the estimated DOAs are constrained on the grid. On the other hand, a dense sampling grid leads to a highly coherent matrix that violates the condition for the sparse signal recovery. We refer to the model adopted in the standard CS methods as an on-grid model hereafter in the sense that the estimated DOAs are constrained on the fixed grid.
An off-grid model for DOA estimation is studied in where the estimated DOAs are no longer constrained in the sampling grid set. The model takes into account the basis mismatch in the measurement matrix caused by the off-grid DOAs. It has been shown in that the sparse total least squares (STLS) solver proposed in can yield an MAP optimal estimate if the matrix perturbation caused by the basis mismatch is Gaussian. However, we show in this paper that the Gaussian condition cannot be satisfied in the off-grid DOA estimation problem and hence a new solver is needed.
The rest of the paper is organized as follows. Section 2 studies the off-grid model used in this paper. Section 3 introduces the proposed OGSBI and OGSBI-SVD algorithms. Section 4 presents our simulation results. Section 5 concludes this paper.
Off-grid DOA Estimation Model
Consider narrowband far-field sources , , impinging on an array of omnidirectional sensors from directions , . Time delays at different sensors can be represented by simple phase shifts, leading to the observation model:
where , , , , and and , , are the output and measurement noise of the th sensor at time respectively. The matrix is an array manifold matrix and is called steering vector of the th source. The entry contains the delay information of the th source to the th sensor. In this paper, we assume that the number of sources is already known. Readers are referred to a preprint version for discussions on the case of unknown . So, the goal is to find the unknown DOAs given , and the mapping . In the following we re-derive the off-grid model proposed in using linear approximation and further show its relationship with the on-grid one.
which is the off-grid model to be used in this paper. This model will be empirically validated in Subsection 4.1 by showing that the total noise (approximation error plus measurement noise) follows the Gaussian distribution with high probability if the measurement noise is Gaussian.
It should be noted that the off-grid model in (4) is closely related to the on-grid one that can be obtained by setting in (4) (). In fact, the off-grid model can be considered as the first order approximation of the true observation model while the on-grid one is the zeroth order approximation. As a result, the off-grid model has a much smaller modeling error than the on-grid one. Such an advantage is twofold. First, by adopting the same sampling grid the off-grid model results in higher accuracy, especially in the case of a low measurement noise where the modeling error is the dominant modeling uncertainty. Second, a coarser sampling grid can be adopted in the off-grid model to achieve a considerably reduced computational workload with a comparable modeling accuracy.
To estimate the DOAs we need to find not only the support of the sparse signals , , but also the off-grid difference . In this paper, we formulate the problem based on a Bayesian perspective and develop an iterative algorithm to jointly estimate , , and in the following section.
OGSBI: Off-grid Sparse Bayesian Inference
We consider complex-valued signals throughout the paper since the matrix is complex-valued. We derive our algorithm in the MMV case. The SMV is a special case by simply setting . Denote , and . The off-grid DOA estimation model in (4) becomes
Under an assumption of white (circular symmetric) complex Gaussian noises, we have
where denotes the noise precision with being the noise variance, the probability density function (PDF) of a (circular symmetric) complex Gaussian distributed random variable with mean and covariance is
In this paper we assume that the noise precision is unknown. A Gamma hyperprior is assumed for since it is a conjugate prior of the Gaussian distribution:
where with being the Gamma function. We set as in to obtain a broad hyperprior.
1.2 Sparse signal model
It is easy to show that all columns of are independent and share the same prior. According to , for both and are Laplace distributed and share the same PDF that is strongly peaked at the origin. As a result, the two-stage hierarchical prior is a sparse prior that favors most rows of being zeros.
1.3 Off-grid distance model
We assume a uniform prior for :
The prior is noninformative in the sense that the only information of we use is its boundedness.
By combining the stages of the hierarchical Bayesian model, the joint PDF is
with the distributions on the right hand side as defined by (8), (10), (11), (9) and (12) respectively.
2 Bayesian Inference
An evidence procedure is exploited to perform the Bayesian inference since the exact posterior distribution cannot be explicitly calculated. Similar approaches have been used in standard Bayesian CS methods . First it is easy to show that the posterior distribution of is a complex Gaussian distribution:
Calculations of and , , need estimates of the hyperparameters , and . In an evidence procedure, they are estimated using an MAP estimate that maximizes . It can be easily observed that to maximize is equivalent to maximizing the joint PDF since is independent of the hyperparameters. An expectation-maximization (EM) algorithm is implemented that treats as a hidden variable and turns to maximizing , where is given in (13) and denotes an expectation with respect to the posterior of as given in (14) using the current estimates of the hyperparameters.
Denote , , , and . Following a similar procedure as in , it is easy to obtain the following updates of and :
where , with .
For , its estimate maximizes by (13) and thus minimizes
where is a constant term independent of , is a positive semi-definite matrix and
The detailed derivation of (19) is provided in Appendix for simplicity of exposition. As a result, we have
Though an explicit expression of cannot be given, by recognizing that is jointly sparse with , the dimension of can be reduced to in the computation and hence can be efficiently calculated. We provide the details in Subsection 3.5.
The proposed OGSBI algorithm is implemented as follows. After initializations of the hyperparameters , and , we calculate and , , using the current values of the hyperparameters according to (16) and (15) respectively. Then we update , and according to (17), (18) and (22) respectively. The process is repeated until some convergence criterion is satisfied. We note that OGSBI has guaranteed convergence since the function is guaranteed to increase at each iteration by the property of EM algorithm.
3 OGSBI-SVD
In (23), , and can be viewed as the new matrices of sensor measurements, source signals and measurement noises respectively. The joint sparsity still holds in . We do not exploit possible correlations that exist between columns of (and in ), i.e., we still assume that (and ) have independent columns.The correlations between columns of the signal matrix ( in our case) have recently been studied in . It is then straightforward to apply the proposed OGSBI algorithm to estimate , and then the DOAs. We use OGSBI-SVD to refer to the resulting algorithm.
Based on implementation details to be introduced in Subsection 3.5, it can be shown that OGSBI-SVD has a computational complexity of order per iteration while that for OGSBI is per iteration. An additional computational workload of order is for the SVD of in OGSBI-SVD. Since it is empirically found that OGSBI-SVD converges much faster than OGSBI, the whole computational workload of OGSBI-SVD is less than that of OGSBI in general.A possible exception happens in the case of where the computation for the SVD is quite heavy. A modified approach in such a case is to partition firstly into blocks with each of about columns, then operate the SVD on each block and keep the resulting signal subspaces, and finally do another SVD on the new signal matrix composed of all signal subspaces. A model similar to (23) can be cast.
4 Source Power and DOA Estimation
5 Implementation Details
By the fact that is jointly sparse with whose nonzero entries correspond to the locations of the sources, we calculate only entries of that correspond to locations of the maximum entries of and set others to zeros. As a result, , and can be truncated into dimension of or . We still use , and hereafter to denote their truncated versions for simplicity. By (22) and we have if is invertible and . Otherwise, we update elementwise, i.e., at each step we update one by fixing up the other entries of . For , first we let
where is without the th entry for a vector . Then by constraining we have
It is easy to show that the objective function is guaranteed to decrease at each step with defined in (26).
Numerical simulations
2 Comparison with STLS
The off-grid model has recently been used in for DOA estimation. In , a sparse total least-squares (STLS) approach is proposed. In the SMV case, STLS seeks to solve the nonconvex optimization problem
In our experiment, we consider two DOAs from and with dB. We consider and for both OGSBI and STLS. The parameter in (27) is tuned to our best such that STLS achieves the smallest error. Table 2 presents the averaged MSEs and CPU times of STLS and OGSBI over trials. OGSBI obtains more accurate DOA estimations than STLS in both the scenarios with remarkably less computational times. We also note that it is possible to accelerate STLS using state-of-the-art algorithms for CS.
3 Sensitivity to Measurement Outliers
The SVD procedure in OGSBI-SVD is related to the principal component analysis (PCA). As is known that the standard PCA is sensitive to outliers. Even a single corrupted measurement can deteriorate the quality of the approximation. In this subsection we carry out experiments to check whether the proposed OGSBI-SVD is sensitive to measurement outliers due to the SVD. The experimental setup is similar to that in Subsection 4.1 but with . After acquiring the noiseless measurements, we randomly choose 3 out of the measurements, multiply by a constant ratio and then save as the outliers. Beside the case of no outliers () we consider five other cases where is set to 5, 10, 20, 50 and 100 respectively. Table 3 presents our simulation results of the MSEs. It can be seen that the estimation accuracy of OGSBI-SVD can degrade significantly even with about measurements being corrupted due to the sensitivity of the SVD.
Note that the corrupted measurement matrix due to the outliers is a sum of a low-rank matrix (noiseless measurement matrix of rank ) and a sparse matrix (outliers). A robust PCA technique has recently been proposed in that can recover the original low-rank matrix from the sparse outliers. So, it is possible to combine the robust PCA technique in with the proposed OGSBI-SVD to improve its robustness to outliers, which, however, is beyond the scope of this paper.
Conclusion
Appendix: Derivation of (19)
Denote . Eq. (19) is based on the following two equalities:
where , are constants independent of , and the equality