Edited by: Jean-Luc Bouchot, Beijing Institute of Technology, China

Reviewed by: Nguyen Quang Tran, Aalto University, Finland; Shao-Bo Lin, Wenzhou University, China; Martin Lotz, University of Warwick, United Kingdom

This article was submitted to Mathematics of Computation and Data Science, a section of the journal Frontiers in Applied Mathematics and Statistics

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Compressed sensing is the art of effectively reconstructing structured signals from substantially fewer measurements than would naively be required for standard techniques like least squares. Although not entirely novel, rigorous treatments of this observation [


Typically, results for type (i) precede the results for type (ii). Phase retrieval via PhaseLift is a concrete example of such a development. Generic convergence guarantees [

Here, we try to close this gap by applying a method that is very well established in theoretical computer science: partial de-randomization.

There is also a didactic angle to this work: within the realm of signal processing, partial-derandomization techniques have been successfully applied to matrix reconstruction [

Finally, one may argue that compressed sensing has not fully lived up to the high expectations of the community yet (see e.g., Tropp [^{1}

Compressed sensing aims at reconstructing an (approximately) s-sparse vector x ∈ ℂ^{n} from m ≪ n linear measurements y_{i} = 〈a_{i}, x〉.

Since sparsity is a non-convex constraint, a direct reconstruction approach is computationally intractable. Instead, one convexifies the problem and minimizes the ℓ1-norm over all vectors that are compatible with the observed measurements:

minimize_{z ∈ ℂ^{n}} ||z||_{ℓ1} subject to 〈a_{i}, z〉 = y_{i} for i = 1, …, m. (1)
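For illustration, the convex program (1) can be solved with off-the-shelf software. The following sketch (real-valued for simplicity; the dimensions, random seed, and the linear-programming reformulation are our own choices, not the paper's experimental setup) recasts ℓ1-minimization as a linear program via the standard split z = u − v with u, v ≥ 0:

```python
import numpy as np
from scipy.optimize import linprog

# Sketch of the l1-minimization program (1) for real-valued data:
#   minimize ||z||_1  subject to  A z = y,
# recast as a linear program via the split z = u - v with u, v >= 0.
rng = np.random.default_rng(0)
n, m, s = 60, 25, 3

A = rng.standard_normal((m, n)) / np.sqrt(m)          # generic Gaussian measurements
x = np.zeros(n)
x[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)  # s-sparse signal
y = A @ x

c = np.ones(2 * n)                     # objective: sum(u) + sum(v) = ||z||_1
A_eq = np.hstack([A, -A])              # constraint: A (u - v) = y
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n))
z = res.x[:n] - res.x[n:]

recovery_error = np.linalg.norm(z - x)
print(recovery_error)                  # tiny in the successful-recovery regime
```

With these parameters the sampling rate is well above the phase transition, so the linear program returns the sparse signal exactly (up to solver precision).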

Strong mathematical proofs for correct recovery have been established. By and large, these statements require randomness in the sense that each measurement vector a_{i} is sampled independently from a certain distribution over ℂ^{n}. Prominent examples include (1) complex standard Gaussian vectors a_{g}, (2) signed Bernoulli vectors a_{sb} with independent random sign entries,

(3) uniformly random rows of the discrete Fourier matrix: a_{f} ~ {f_{1}, …, f_{n}},

and, for n = 2^{d}, (4) uniformly random rows of a Hadamard matrix: a_{h} ~ {h_{1}, …, h_{n}}.

A rigorous treatment of all these cases can be found in Foucart and Rauhut [. The structured random vectors a_{f}, a_{h} require exponentially fewer random bits to generate than generic random vectors, like a_{g}, a_{sb}: selecting one row uniformly at random consumes only log_{2}(n) random bits. Importantly, this transition from generic measurements to highly structured ones comes at a price. The number of measurements required in cases (3) and (4) scales poly-logarithmically in the ambient dimension n.

The following two subsections are devoted to introducing formalisms that allow for partially de-randomizing signed Bernoulli vectors and complex standard Gaussian vectors, respectively.

Throughout this work, we endow ℂ^{n} with the standard inner product 〈x, y〉 = ∑_{i} x̄_{i}y_{i}. Suppose that the entries of a_{sb} ∈ {±1}^{n} are random signs ϵ_{i} ~ {±1} chosen independently at random (Rademacher random variables). Then, a_{sb} is isotropic: E[a_{sb}a_{sb}^{T}] = 𝕀.

This feature is equivalent to

E|〈a_{sb}, x〉|^{2} = ||x||_{ℓ2}^{2}   (2)

for all x ∈ ℝ^{n}.
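Identity (2) can be verified exhaustively for small n by averaging over all 2^n equally likely sign patterns (a sanity check with an arbitrary test vector of our choosing):

```python
import itertools
import numpy as np

# Exhaustive check of the isotropy identity E|<a_sb, x>|^2 = ||x||_2^2:
# average over all 2^n equally likely sign patterns of a Rademacher vector.
n = 4
x = np.array([0.3, -1.2, 0.5, 2.0])    # arbitrary test vector

signs = np.array(list(itertools.product([1, -1], repeat=n)))  # all 2^n patterns
second_moment = np.mean((signs @ x) ** 2)

print(second_moment, np.dot(x, x))     # identical: isotropy holds exactly
```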

Independent sign entries are sufficient, but not necessary for this feature. Indeed, suppose that n = 2^{d} is a power of two. Then, the rows of a Sylvester Hadamard matrix correspond to a particular subset of all sign vectors in {±1}^{n}. Uniformly sampled rows a_{h} also obey (2), despite their entries ϵ_{i} ∈ {±1} not being independent instances of random signs. This feature is called t-wise independence.

A random vector ϵ = (ϵ_{1}, …, ϵ_{n})^{T} ∈ {±1}^{n} with random sign entries ϵ_{1}, …, ϵ_{n} is t-wise independent if every selection of k ≤ t different entries ϵ_{i_{1}}, …, ϵ_{i_{k}} with i_{1} < … < i_{k} ≤ n forms a collection of independent random variables.


What is more, explicit constructions of t-wise independent random sign vectors in {±1}^{n} are known for any t.

Consider, for instance, the sign array

+1 +1 +1
+1 −1 −1
−1 +1 −1
−1 −1 +1

The first two columns summarize all possible tuples (+1, +1), (+1, −1), (−1, +1), (−1, −1) exactly once, and the same is true for any other selection of two columns.

More generally, an N × n array with entries in {±1} is an orthogonal array of strength k if every selection of k columns contains each of the 2^{k} possible sign patterns equally often.

The example from above is a 4 × 3 orthogonal array of strength 2. Strength-k orthogonal arrays are in one-to-one correspondence with k-wise independent sign vectors: selecting a row uniformly at random produces a k-wise independent random vector ϵ ∈ {±1}^{n}.

This correspondence identifies orthogonal arrays as general-purpose seeds for pseudo-random behavior. What is more, explicit constructions of orthogonal arrays are known for any strength k. While a generic k-wise independent distribution may be supported on all 2^{n} possible elements of {±1}^{n} as its rows, these constructions typically only require a number of rows that scales polynomially in the dimension n.
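The defining property, and the resulting pairwise independence, can be checked mechanically for a 4 × 3 strength-2 example (the concrete array below is one standard instance):

```python
import itertools
import numpy as np

# Verify a strength-2 orthogonal array: every pair of columns contains each
# sign pattern in {+1,-1}^2 equally often (here exactly once, lambda = 1).
oa = np.array([
    [ 1,  1,  1],
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
])

for cols in itertools.combinations(range(oa.shape[1]), 2):
    patterns = [tuple(row) for row in oa[:, cols]]
    for p in itertools.product([1, -1], repeat=2):
        assert patterns.count(p) == 1

# A uniformly random row therefore has pairwise independent entries:
# E[eps_i] = 0 and E[eps_i eps_j] = delta_ij.
col_means = oa.mean(axis=0)
pair_moments = (oa.T @ oa) / oa.shape[0]
print(col_means, pair_moments)         # zero means, identity second moments
```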

Let us now discuss another general-purpose tool for (partial) de-randomization. Concentration of measure implies that

for all x ∈ ℂ^{n}.

Here, dμ denotes the uniform measure on the complex unit sphere 𝕊^{n−1} ⊂ ℂ^{n}. The concept of k-designs captures finite vector configurations that reproduce the low-order moments of this uniform measure.

A finite configuration of unit vectors w_{1}, …, w_{N} ∈ ℂ^{n} is a k-design if its empirical moments up to order k reproduce those of the uniform measure:

(1/N) ∑_{i=1}^{N} (w_{i}w_{i}^{*})^{⊗k} = ∫ (ww^{*})^{⊗k} dμ(w).

(Spherical) designs were originally introduced for the real-valued unit sphere 𝕊^{n−1}; their complex projective analogue gives rise to Definition 3. Closely related is the notion of mutual unbiasedness: two orthonormal bases {b_{1}, …, b_{n}} and {c_{1}, …, c_{n}} of ℂ^{n} are called mutually unbiased if |〈b_{i}, c_{j}〉|^{2} = 1/n for all i, j.

A prominent example for such a basis pair is the standard basis and the Fourier, or Hadamard, basis, respectively. One can show that at most n + 1 orthonormal bases of ℂ^{n} can be pairwise mutually unbiased. For prime dimensions n, an explicit collection of n + 1 such bases

forms a MMUB (maximal set of mutually unbiased bases). Importantly, MMUBs are always (proportional to) 2-designs [

form a MMUB. Here, the set of n^{2} vectors corresponds to all time-frequency shifts of a discrete Alltop sequence.
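The construction can be verified numerically. The sketch below (our parameter choice n = 7; any prime n ≥ 5 works, since 3 must be invertible modulo n) builds the time-frequency shifts of the Alltop sequence and checks orthonormality within one shift family as well as unbiasedness across families:

```python
import numpy as np

# Alltop sequence alpha_k = omega^(k^3)/sqrt(n) with omega = exp(2*pi*i/n).
# For a fixed time shift tau, the n frequency shifts form an orthonormal
# basis; bases belonging to different time shifts are mutually unbiased.
n = 7                                  # prime, n >= 5
omega = np.exp(2j * np.pi / n)
k = np.arange(n)
alltop = omega ** (k ** 3) / np.sqrt(n)

def basis(tau):
    """Columns: frequency shifts f = 0..n-1 of the tau-translated sequence."""
    shifted = np.roll(alltop, tau)
    return np.array([omega ** (f * k) * shifted for f in range(n)]).T

B0, B1 = basis(0), basis(1)
gram = B0.conj().T @ B0                    # identity: orthonormal basis
overlaps = np.abs(B0.conj().T @ B1) ** 2   # constant 1/n: mutually unbiased
print(np.round(np.abs(gram), 12), np.round(n * overlaps, 12))
```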

In the following

Theorem 1. Suppose that the rows of A ∈ {±1}^{m×n} are sampled independently and uniformly from an orthogonal array of strength 4 and that m ≥ Cs log(n). Then, with probability at least 1 − e^{−cm}, every s-sparse x ∈ ℂ^{n} is the unique solution of (1) with y = Ax. Here, c, C > 0 denote absolute constants.

Theorem 2. Suppose that n is prime, that the rows of A ∈ ℂ^{m×n} are sampled independently and uniformly from the (re-scaled) n^{2} time-frequency shifts of the Alltop sequence, and that m ≥ Cs log(n). Then, with probability at least 1 − e^{−cm}, every s-sparse x ∈ ℂ^{n} is the unique solution of (1) with y = Ax.

This result readily generalizes to measurements that are sampled from a maximal set of mutually unbiased bases (excluding the standard basis). Time-frequency shifts of the Alltop sequence are one concrete construction that applies to prime dimensions only.

Note that the cardinality of all Alltop shifts is n^{2}. Hence, 2log_{2}(n) random bits suffice to sample a single measurement vector, and a total of

2m log_{2}(n)

random bits are required for sampling a complete measurement matrix A ∈ ℂ^{m×n}.

Highly structured families of vectors – such as rows of a Fourier, or Hadamard matrix – require even less randomness to sample from: only log_{2}(n) random bits per measurement vector. However, as mentioned above, the associated recovery guarantees require a number of measurements that scales poly-logarithmically in the ambient dimension.

The recovery guarantees in Theorem 1 and Theorem 2 can be readily extended to ensure stability with respect to noise corruption in the measurements and robustness with respect to violations of the model assumption of sparsity. We refer to section 3 for details.

We also emphasize that there are results in the literature that establish compressed sensing guarantees with comparable, or even less, randomness. Obviously, deterministic constructions are the extreme case in this regard. Early deterministic results suffer from a “quadratic bottleneck”: the number of measurements must scale quadratically in the sparsity, m ≳ s^{2}. Although this obstacle was overcome, existing progress is still comparatively mild. Bourgain et al. [ improved this scaling to m ≳ s^{2−ϵ}, where ϵ > 0 is a (very) small constant.

Closer in spirit to this work is Bandeira et al. [^{2}(

To date, the strongest de-randomized reconstruction guarantees hail from a close connection between

However, this strong result follows from “reducing” the problem of

This section is devoted to summarizing an elegant argument, originally by Rudelson and Vershynin [

In this work we are concerned with ℓ1-regularization (1). A necessary pre-requisite for uniform recovery is the demand that the kernel, or nullspace, of the measurement matrix A does not contain vectors that are well-approximated by sparse ones,

where σ_{s}(x)_{ℓ1} = min {||x − z||_{ℓ1}: z ∈ ℝ^{n} is s-sparse} denotes the approximation error (in ℓ1-norm) one incurs when approximating x with s-sparse vectors.

A matrix A obeys the nullspace property of order s if its kernel avoids the set T_{s}: ker(A) ∩ T_{s} = ∅.

The set T_{s} is a subset of the unit sphere that contains all normalized s-sparse vectors.

The following powerful statement allows for exploiting generic randomness in order to establish nullspace properties. It is originally due to Gordon [

Theorem 3 (Gordon's escape through a mesh). Fix a subset T ⊂ 𝕊^{n−1} ⊂ ℝ^{n} of the unit sphere and let A ∈ ℝ^{m×n} be a standard Gaussian matrix. Then,

ker(A) ∩ T = ∅

with probability at least 1 − e^{−t²/2}, provided that √m ≥ w(T) + t + 1. Here, w(T) = E sup_{v ∈ T} 〈g, v〉 denotes the Gaussian width of T, where g ∈ ℝ^{n} is a standard Gaussian vector.

This is a deep statement that connects random matrix theory to geometry: the Gaussian width is a rough measure of the size of the set T ⊂ 𝕊^{n−1}. Setting T = T_{s} allows us to conclude that a matrix A obeys the nullspace property of order s as soon as √m exceeds the Gaussian width w(T_{s}). In order to bound w(T_{s}), we may use the following inclusion

see e.g., Kabanava and Rauhut [

because the linear function v ↦ 〈g, v〉 achieves its maximum at an extreme point of the convex set conv(Σ_{s}). The right-hand side of (10) is the expected supremum of a Gaussian process indexed by Σ_{s}.

where N(Σ_{s}, ε) denotes the covering numbers of Σ_{s}. They are defined as the smallest cardinality of an ε-net of Σ_{s}, i.e., the minimal number of points y_{1}, …, y_{N} ∈ Σ_{s} such that every element of Σ_{s} is at distance at most ε from some y_{i}.

where C > 0 denotes an absolute constant. Combining this estimate with Theorem 3 reveals that a standard Gaussian matrix A ∈ ℝ^{m×n} with

m ≥ C′s log(n/s)

rows obeys the nullspace property of order s with high probability.

This argument is exemplary for generic proof techniques: strong results from probability theory allow for establishing close-to-optimal recovery guarantees in a relatively succinct fashion.
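The Gaussian width of the set of normalized s-sparse vectors is easy to estimate numerically, because the supremum admits a closed form per realization: sup over Σ_s of 〈g, v〉 equals the ℓ2-norm of the s absolutely largest entries of g. A Monte Carlo sketch (dimensions and seed are our own choices):

```python
import numpy as np

# Monte Carlo estimate of w(Sigma_s) = E sup_{v in Sigma_s} <g, v>.
# For each draw, the supremum equals the l2-norm of the s absolutely
# largest entries of the Gaussian vector g.
rng = np.random.default_rng(1)
n, s, trials = 1000, 10, 200

samples = []
for _ in range(trials):
    g = rng.standard_normal(n)
    top = np.sort(np.abs(g))[-s:]      # s largest magnitudes
    samples.append(np.linalg.norm(top))
w_est = np.mean(samples)

print(w_est, np.sqrt(2 * s * np.log(np.e * n / s)))  # estimate vs. sqrt(s log(n/s)) scaling
```

The estimate lands comfortably between √s and the √(2s log(en/s)) bound, illustrating the s log(n/s) scaling of the squared width.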

The extended arguments presented here are largely due to Dirksen, Lecue and Rauhut [

Gordon's escape through a mesh is only valid for Gaussian random matrices

Theorem 4 (Mendelson's small ball method). Fix a set T ⊂ 𝕊^{n−1} ⊂ ℝ^{n} and let a_{1}, …, a_{m} be independent copies of a random vector a ∈ ℝ^{n}. For ξ > 0, define the marginal tail function

Q_{ξ}(a, T) = inf_{v ∈ T} Pr[|〈a, v〉| ≥ ξ],

as well as the mean empirical width

W_{m}(a, T) = E sup_{v ∈ T} 〈h, v〉 with h = (1/√m) ∑_{i=1}^{m} ϵ_{i}a_{i}, where ϵ_{i} ~ {±1} are independent random signs.

Then, for any ξ, t > 0,

inf_{v ∈ T} (∑_{i=1}^{m} |〈a_{i}, v〉|^{2})^{1/2} ≥ ξ√m Q_{2ξ}(a, T) − 2W_{m}(a, T) − ξt

with probability at least 1 − e^{−t²/2}.

It is worthwhile to point out that for real-valued Gaussian vectors this result recovers Theorem 3 up to constants. Fix ξ > 0 of appropriate size. Then, 〈a_{g}, v〉 ~ N(0, 1) for every v ∈ 𝕊^{n−1} ensures that ξ Q_{2ξ}(a_{g}, T) is bounded below by a positive constant, while h is itself a standard Gaussian vector, so that W_{m}(a_{g}, T) = w(T).

Mendelson's small ball method can be used to establish the nullspace property for independent random measurements a_{i} ∈ ℝ^{n} that exhibit subgaussian marginals.

Signed Bernoulli vectors are a concrete example. Such random vectors are isotropic (3) and direct computation also reveals E〈a_{sb}, x〉^{4} ≤ 3||x||_{ℓ2}^{4},

because there are 3 possible pairings of four indices. Next, these two moment bounds allow us to invoke the Paley-Zygmund inequality in order to bound the marginal tail function Q_{2ξ}(a_{sb}, T_{s}) in Mendelson's small ball method from below:

This lower bound is constant for any ξ ∈ (0, 1/2).
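For signed Bernoulli vectors, the Paley-Zygmund lower bound on the marginal tail function can be validated exactly by enumerating all 2^n outcomes (the small dimension and the test vector below are our own choices):

```python
import itertools
import numpy as np

# Exact check of the Paley-Zygmund bound for Z = |<a_sb, x>|^2:
#   P(Z >= theta * E[Z]) >= (1 - theta)^2 * (E[Z])^2 / E[Z^2],  theta in (0,1),
# with theta = 4*xi^2. The 4th moment bound gives E[Z^2] <= 3 (E[Z])^2.
n = 6
x = np.array([1.0, -0.7, 0.2, 0.9, -1.5, 0.4])
signs = np.array(list(itertools.product([1, -1], repeat=n)))
Z = (signs @ x) ** 2                   # all 2^n equally likely outcomes

EZ, EZ2 = Z.mean(), (Z ** 2).mean()
print(EZ, np.dot(x, x))                # E[Z] = ||x||^2 (isotropy)

for xi in [0.1, 0.25, 0.4]:            # any xi in (0, 1/2)
    tail = np.mean(Z >= 4 * xi ** 2 * EZ)
    pz_bound = (1 - 4 * xi ** 2) ** 2 * EZ ** 2 / EZ2
    print(xi, tail, pz_bound)
```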

Next, note that W_{m}(a_{sb}, T_{s}) is the expected supremum of the stochastic process X_{z} = 〈h, z〉 indexed by z ∈ T_{s} ⊂ ℝ^{n}. This process is centered (E X_{z} = 0) and Equation (11) implies that it is also subgaussian. Moreover,

Fixing ξ > 0 sufficiently small, setting t proportional to √m and inserting these bounds into Mendelson's small ball method establishes the nullspace property for signed Bernoulli measurements, provided that m ≳ s log(n/s).

A similar result remains valid for other classes of independent measurements with subgaussian marginals (11).

The nullspace property, as well as its connection to uniform recovery via (1), readily extends to the complex case.

Theorem 5. Fix a set T ⊂ 𝕊^{n−1} ⊂ ℂ^{n} and let a_{1}, …, a_{m} ∈ ℂ^{n} be independent copies of a random vector a. Then, for any ξ, t > 0,

inf_{v ∈ T} (∑_{i=1}^{m} |〈a_{i}, v〉|^{2})^{1/2} ≥ ξ√m Q_{2ξ}(a, T) − 2W_{m}(a, T) − ξt

with probability at least 1 − e^{−t²/2}. Here, Q_{2ξ}(a, T) and W_{m}(a, T) denote the complex-valued analogues of the marginal tail function and the mean empirical width.

Such a generalization was conjectured by Tropp [

Let us now turn to the main scope of this work: partial de-randomization. Effectively, Mendelson's small ball method reduces the task of establishing nullspace properties to bounding the two parameters Q_{2ξ}(a, T_{s}) and W_{m}(a, T_{s}) in an appropriate fashion. A lower bound on the former readily follows from the Paley-Zygmund inequality, provided that the following relations hold for any x ∈ ℂ^{n}:

E|〈a, x〉|^{2} = ||x||_{ℓ2}^{2} (isotropy) and E|〈a, x〉|^{4} ≤ C_{4}||x||_{ℓ2}^{4} (4th moment bound).

Here, C_{4} > 0 is a constant.

Indeed, inserting these bounds into the Paley-Zygmund inequality yields

In contrast, establishing an upper bound on W_{m}(a, T_{s}) via Dudley's inequality requires subgaussian marginals (11) (with constants that must not depend on the ambient dimension). This implicitly imposes stringent constraints on the measurement ensemble.

Here, e_{1}, …, e_{n} denotes the standard basis of ℂ^{n}. Incoherence has long been identified as a key ingredient for developing compressed sensing guarantees. Here, it allows for establishing an upper bound on W_{m}(a, T_{s}) that does not rely on subgaussian marginals.

For isotropic and incoherent random vectors a ∈ ℂ^{n}, the mean empirical width obeys W_{m}(a, T_{s}) ≤ C√(s log(n)).

This bound only requires an appropriate scaling of the first two moments (isotropy) but comes at a price. The bound scales logarithmically in the ambient dimension n.

Theorem 8. Suppose that a ∈ ℂ^{n} is isotropic, incoherent and obeys the 4th moment bound. Then, m ≥ Cξ^{−2}s log(n) measurements sampled independently from a ensure that, with probability at least 1 − e^{−cm}, the associated measurement matrix A obeys the complex nullspace property of order s.

In complete analogy to the real-valued case, the complex nullspace property ensures uniform recovery of every s-sparse x ∈ ℂ^{n} from y = Ax via (1).

Suppose that the entries ϵ_{i} of a_{oa} are 4-wise independent random signs. Then, E[ϵ_{i}ϵ_{j}] = δ_{ij}, because 4-wise independence necessarily implies 2-wise independence. Isotropy then readily follows from (3). Finally, 4-wise independence suffices to establish the 4th moment bound. By assumption, the moments E[ϵ_{i}ϵ_{j}ϵ_{k}ϵ_{l}] coincide with the corresponding moments of independent random signs and we may thus infer

In summary: a_{oa} meets all the requirements of Theorem 8. Theorem 1 then follows from the fact that the complex nullspace property ensures uniform recovery of all s-sparse vectors via (1).
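The proof step above (4-wise independence implies isotropy and the 4th moment bound) can be checked on a minimal concrete instance of our own choosing, not the array used in the paper's experiments: the 16 even-parity bitstrings of length 5 form a tight strength-4 orthogonal array, so a uniformly random row is 4-wise, but not 5-wise, independent:

```python
import itertools
import numpy as np

# The 16 even-parity vectors in {0,1}^5, mapped to signs via v -> (-1)^v,
# form a tight orthogonal array OA(16, 5, 2, 4): a uniformly random row is
# 4-wise independent, yet the product of all five entries always equals +1,
# so the entries are not fully (5-wise) independent.
rows = [v for v in itertools.product([0, 1], repeat=5) if sum(v) % 2 == 0]
oa = 1 - 2 * np.array(rows)            # 0 -> +1, 1 -> -1; shape (16, 5)

# strength 4: any 4 columns exhibit each of the 16 sign patterns exactly once
for cols in itertools.combinations(range(5), 4):
    assert len({tuple(r) for r in oa[:, cols]}) == 16

# 4-wise independence reproduces the 4th moment bound E|<eps, x>|^4 <= 3||x||^4
x = np.array([0.5, -1.0, 2.0, 0.1, -0.3])
fourth = np.mean((oa @ x) ** 4)
print(fourth, 3 * np.dot(x, x) ** 2)
print(np.unique(oa.prod(axis=1)))      # always +1: not 5-wise independent
```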

Suppose that the dimension n is prime and set a_{mub} = √n b_{mub}.

The vector b_{mub} is chosen uniformly from a union of n mutually unbiased bases of ℂ^{n}. Each orthonormal basis {b_{1}, …, b_{n}} contributes ∑_{i=1}^{n} b_{i}b_{i}^{*} = 𝕀, so that E[a_{mub}a_{mub}^{*}] = (n/n^{2}) ∑_{b} bb^{*} = 𝕀,

which implies isotropy. Finally, a maximal set of (n + 1) mutually unbiased bases forms a complex projective 2-design. For any x ∈ ℂ^{n} this property ensures

which establishes the 4th moment bound. In summary, the random vector a_{mub} also meets all the requirements of Theorem 8, and Theorem 2 follows.
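Both properties can be confirmed numerically for the Alltop family (a sketch with n = 7, our parameter choice; the constant 3 below is the generic 4th moment bound, the exact design computation gives an even smaller value):

```python
import numpy as np

# Check isotropy and the 4th moment bound for a_mub = sqrt(n) * b, where b
# is drawn uniformly from the n^2 Alltop time-frequency shifts (prime n >= 5).
n = 7
omega = np.exp(2j * np.pi / n)
k = np.arange(n)
alltop = omega ** (k ** 3) / np.sqrt(n)
shifts = np.array([omega ** (f * k) * np.roll(alltop, tau)
                   for tau in range(n) for f in range(n)])  # n^2 unit vectors

# isotropy: E[a a^*] = (n / n^2) * sum_b b b^* = Id
cov = (n / n ** 2) * sum(np.outer(b, b.conj()) for b in shifts)
print(np.round(np.abs(cov), 12))       # identity matrix

# 4th moment: E|<a, x>|^4 = sum_b |<b, x>|^4 <= 3 ||x||^4
rng = np.random.default_rng(2)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
fourth = np.mean(np.abs(shifts.conj() @ x) ** 4) * n ** 2
print(fourth, 3 * np.linalg.norm(x) ** 4)
```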

The nullspace property may be generalized to address two imperfections in the measurement process: (i) the signal x ∈ ℂ^{n} may only be approximately sparse in the sense that it is well-approximated by an s-sparse vector, and (ii) the measurements y = Ax + ε may be corrupted by additive noise ε ∈ ℂ^{m}.

To state this generalization, we need some additional notation. For x ∈ ℂ^{n} and an index set S ⊆ {1, …, n}, we let x_{S} denote the vector that coincides with x on S and vanishes identically on the complement of S.

see e.g., Foucart and Rauhut [

Here, η > 0 denotes an upper bound on the strength of the noise corruption: ||ε||_{ℓ2} ≤ η. Provided that A obeys the robust nullspace property, any solution x^{♯} ∈ ℂ^{n} to (16) is guaranteed to obey

where C_{2} = (3 + ρ)τ/(1 − ρ). The first term on the r.h.s. vanishes if x is exactly s-sparse, while the second term is proportional to the noise strength measured in ℓ2-norm and vanishes in the absence of noise corruption.

In the previous section, we have established the classical nullspace property for measurements that are chosen independently from a vector distribution that is isotropic, incoherent and obeys a bound on the 4th moments. This argument may readily be extended to establish the robust nullspace property with relatively little extra effort. To this end, define the set

A moment of thought reveals that the matrix A obeys the robust nullspace property whenever the infimum of ||Av||_{ℓ2} over this set is sufficiently large.

What is more, the following inclusion formula is also valid:

see Kabanava and Rauhut [

Now, suppose that m ≥ Cξ^{−2}s log(n) measurements are sampled independently from an isotropic, incoherent random vector a ∈ ℂ^{n}. Then, Theorem 7 readily asserts that with probability at least

where

Suppose that m ≥ Cξ^{−2}s log(n) measurements are sampled independently from an isotropic, incoherent random vector with bounded 4th moments. Then, with probability at least 1 − e^{−cm}, every x ∈ ℂ^{n} and every solution x^{♯} of (16) with y = Ax + ε, ||ε||_{ℓ2} ≤ η, obey

||x − x^{♯}||_{ℓ2} ≤ (C_{1}/√s)σ_{s}(x)_{ℓ1} + C_{2}η.

Here, C_{1}, C_{2} > 0 denote constants.

In this part we demonstrate the performance that can be achieved with our proposed derandomized constructions and compare it to generic measurement matrices (Gaussian, signed Bernoulli). Since the orthogonal array construction is more involved, we first provide additional details relevant for the numerical experiments.

An orthogonal array OA(N, n, σ, k) with N runs, n factors, σ levels and strength k is an N × n array with entries from an alphabet of size σ such that, in every selection of k columns, each k-tuple occurs in exactly λ = N/σ^{k} rows. Arrays with λ = 1 are called simple. A comprehensive treatment can be found in the book [. We identify the alphabet with ℤ_{σ} = {0, …, σ − 1} for concreteness, so that such arrays can be represented as a matrix in ℤ_{σ}^{N×n}. An array is called linear if σ = p^{t} is a prime power and the rows of the matrix form a vector space over the finite field F_{q} with q = p^{t} elements. The runs of an orthogonal array (the rows of the corresponding matrix) can also be interpreted as codewords of a code and vice versa. The array is linear if and only if the corresponding code is linear [

In this work we propose to generate sampling matrices by selecting m rows uniformly at random from a two-level orthogonal array OA(λ2^{4}, n, 2, 4) of strength k = 4. Sampling a single row consumes only log_{2}(λ2^{4}) random bits, so arrays with small index λ are preferable. However, the Rao bound imposes a fundamental lower limit on the number of runs: every strength-4 array necessarily obeys N ≥ 1 + n + n(n − 1)/2.

Arrays that saturate this bound are called tight (or complete). In summary, an order of n^{2} runs is unavoidable, so that roughly 2log_{2}(n) random bits suffice for sampling a single row of a tight strength-4 orthogonal array.


We report the normalized ℓ2-recovery error (NMSE) ||x − x^{♯}||_{ℓ2}^{2}/||x||_{ℓ2}^{2}, where x^{♯} is the solution of (1). To construct the orthogonal array, algorithm [


The proof is based on rather straightforward modifications of Tropp's proof for Mendelson's small ball method [. Let a ∈ ℂ^{n} be a complex-valued random vector and suppose that a_{1}, …, a_{m} are independent copies of it. The goal is to obtain a lower bound on inf_{v ∈ T} (∑_{i=1}^{m} |〈a_{i}, v〉|^{2})^{1/2}, where T ⊂ ℂ^{n} is an arbitrary set. First, note that the ℓ_{1} and ℓ_{2} norms on ℝ^{2m} are related via

Next, fix ξ > 0 and introduce the indicator function

Also, note that the expectation value of each summand obeys

according to the union bound. The last line follows from a simple observation. Let

and note that the estimate from above ensures

Adding and subtracting

Here we have applied Equation (22) to bound the contribution of the first term. Since the marginal tail function Q_{2ξ}(a, T) is defined as an infimum over all v ∈ T, it uniformly lower-bounds each of these expectation values.

and the vectors a_{1}, …, a_{m} are independent copies of a single random vector a ∈ ℂ^{n}. The bounded difference inequality [

Therefore, the union bound grants a transition from these suprema to their expectation values with probability at least 1 − 2e^{−t²/2}. These expectation values can be further simplified. Define the soft indicator function

which admits the following bounds: 1_{[2ξ,∞)}(t) ≤ ψ_{ξ}(t) ≤ 1_{[ξ,∞)}(t). Moreover, ψ_{ξ} is Lipschitz continuous with constant 1/ξ and obeys ψ_{ξ}(0) = 0. Rademacher symmetrization [

where

with probability at least 1 − 2e^{−t²/2}. Setting

The inclusion T_{s} ⊂ 2conv(Σ_{s}) remains valid in the complex case. Moreover, every z ∈ conv(Σ_{s}) necessarily obeys

because the maximum value of a convex function is achieved at the boundary. Hoelder's inequality therefore implies

where

and we may bound both expressions on the r.h.s. independently. For the first term, fix θ > 0 and use Jensen's inequality (the logarithm is a concave function) to obtain

Monotonicity and non-negativity of the exponential function then imply

where we have also used that all ϵ_{i}'s and a_{i}'s are independent. The remaining moment generating functions can be bounded individually. Fix 1 ≤ i ≤ m:

because σ^{2} = 1. Incoherence moreover ensures

for any

and inserting this bound into (24) ensures
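The individual bound invoked in this step is the standard Hoeffding estimate for the Rademacher moment generating function (a sketch in the surrounding notation, written for real-valued inner products; complex-valued contributions are handled component-wise):

```latex
\mathbb{E}_{\epsilon_i}\left[\exp\left(\theta\,\epsilon_i\,\langle a_i, w\rangle\right)\right]
  = \cosh\left(\theta\,\langle a_i, w\rangle\right)
  \le \exp\left(\tfrac{1}{2}\,\theta^{2}\,\langle a_i, w\rangle^{2}\right).
```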

RK developed the technical aspects of the paper. PJ is responsible for the numerical aspects and the discussion of orthogonal arrays. Some of the main conceptual ideas are due to DM. These in particular include the idea of employing orthogonal arrays to partially derandomize signed Bernoulli matrices and the insight that a partial de-randomization of Gordon's escape through a mesh is essential for achieving the results. All authors contributed to the introduction and general presentation.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The authors want to thank David Gross for providing the original ideas that ultimately led to this publication.

ℓ1-recovery with frames and Gaussian measurements


^{1}Existing deterministic constructions (see e.g., Bandeira et al.[

^{2}For comparison, a complex standard Gaussian vector obeys

^{3}This is a

^{4}For example