On the singularity probability of discrete random matrices
Jean Bourgain, Van Vu, Philip Matchett Wood
Introduction
Let be a large integer and be an by random matrix whose entries are independent (but not necessarily identically distributed) discrete random variables taking values in the complex numbers. The problem of estimating the probability that is singular is a basic problem in the theory of random matrices and combinatorics. The goal of this paper is to give a bound that applies to a large variety of distributions. The general statement (Theorem 2.2) is a bit technical, so we will first discuss a few corollaries concerning special cases.
The most famous special case is when the entries of are independent identically distributed (i.i.d.) Bernoulli random variables (taking values with probability ). The following conjecture has been open for quite some time:
For an by matrix with each entry an i.i.d. Bernoulli random variable taking the values and each with probability ,
It is easy to verify that the singularity probability is at least by considering the probability that there are two equal rows (or columns).
Even in the case of i.i.d. Bernoulli random variables, proving that the singularity probability is is not trivial. It was first done by Komlós in 1967 (see also ; generalizes Komlós’s bound to other integer distributions). The first exponential bound was proven by Kahn, Komlós, and Szemerédi , who showed that . This upper bound was improved upon by Tao and Vu in to . A more significant improvement was obtained by the same authors in :
This improvement was made possible through the discovery of a new theorem [11, Theorem 5.2] (which was called the Structure Theorem in ), which gives a complete characterization of a set with certain additive properties. The Structure Theorem (to be more precise, a variant of it) will play a critical role in the current paper as well.
Our general result has the following corollary in the Bernoulli case:
which gives a slight improvement over Inequality (1) (since ).
Let us now discuss a more general class of random matrices. Consider the random variable defined by
and let be an by matrix with each entry an independent copy of . The random variable plays an important role in , and the matrices are of interest in their own right. In fact, giving zero a large weight is a natural thing to do when one would like to (randomly) sparsify a matrix, a common operation used in randomized algorithms (the values of , as the reader will see, are not so critical). Our general result implies the following upper bounds:
Note that Inequality (5) implies Inequality (1) and that Inequality (6) implies Inequality (2) (in both cases setting ).
Figure 1 summarizes the upper bounds from Inequalities (4), (5), and (6) and also includes the following lower bounds:
These lower bounds can be derived by computing the probability that one row is all zeros (Inequality (7)) or that there is a dependency between two rows (Inequality (8)). Note that in the case where , the upper bound in Inequality (4) asymptotically equals the lower bound in Inequality (7), and thus our result is the best possible in this case. We also used a Maple program to derive the formulas for lower bounds resulting from a dependency between three, four, or five rows; however, these lower bounds were inferior to those in Inequality (7) and Inequality (8).
We will now present another corollary of the main theorem that has a somewhat different flavor. In this corollary, we treat partially random matrices, which may have many deterministic rows. Our method allows us to obtain exponential bounds so long as there are still at most random rows, where is a particular constant.
Notice that the case and also implies Inequality (2).
The focus of this paper is optimizing the base of the exponent in bounds on the singularity probability for discrete random matrices. One main tool in this optimization is the use of a structure theorem similar to [11, Theorem 5.2] (see Theorem 6.1 below); however, using such a theorem requires additional assumptions to be placed on the values that can appear as entries, and in particular, this is why we assume in Corollary 1.2 that the set has cardinality and that . If one is interested in an exponential bound where there are no conditions on or on the set (at the expense of having an unspecified constant for the base of the exponential), one can follow the analysis in , which does not make use of a structure theorem, along with ideas in this paper to get a result of the following form:
For every there exists such that the following holds. Let be an by complex matrix in which rows contain fixed, non-random entries and where the other rows contain entries that are independent discrete random variables. If the fixed rows have co-rank and if for every random entry , we have , then for all sufficiently large
Note that Theorem 1.4 holds for any and , and so in particular, an exponential bound on the singularity probability is achieved whenever and , where is a constant. Also note that the theorem allows the random entries to have discrete distributions taking infinitely many values. Corollary 3.6 proves a version of Theorem 1.4 with a much better exponential bound, given some additional conditions.
The structure of the rest of the paper is as follows. In Section 2 we define -bounded of exponent and state the main theorem of this paper. In Section 3, we discuss some corollaries of Theorem 2.2. In particular, we will:
prove general bounds on the singularity probability for discrete random matrices with entries that have symmetric distributions and with entries that have asymmetric distributions;
Prove a version of Corollary 1.2 (namely, Corollary 3.5) that holds for up to fixed rows, assuming that the entries in the fixed rows take integer values between and for any positive constant ; and
prove that the probability that random matrices with integer entries have a rational eigenvalue is exponentially small.
The general theorem
To prove the results in Inequalities (1) and (2) (and also the results in and ), one basic idea is to replace entries of a random matrix with independent copies of the random variable or (see Equation (3)). One key idea in proving the more general results of the current paper is replacing the entries of a random matrix with more complicated symmetric discrete random variables.
The following definition lies at the heart of our analysis.
Let be a positive constant such that and let be a positive integer constant. A random variable taking values in the integers (or, respectively, the integers modulo some large prime ) is -bounded of exponent if
and , and
We will define -bounded of exponent for collections of random variables below, but first we note that the conditions above are easy to verify in practice. In particular, if we have a symmetric random variable
where the equality on the right-hand side is a simple expected value computation.
We say that a collection of random variables is -bounded of exponent if each is -bounded of exponent with the same constants , , and ; and, importantly, the same value of . We also make the critical assumption that the set of all values that can be taken by the has cardinality (a relaxation of this assumption is discussed in Remark 8.5). However, the definition of is otherwise allowed to vary with and . Also, we will use to denote the set of all possible values taken by the random variables , and we will assume that the cardinality of is at most .
Let be a positive constant such that , let be a positive integer constant, and let be a generalized arithmetic progression in the complex numbers with rank (independent of ) and with cardinality at most . Let be an by matrix with entries , each of which is an independent random variable taking values in . If the collection of random variables is -bounded of exponent , then
In the motivating examples of Section 1 (excluding Corollary 1.2), we discussed the case where the entries of the matrix are i.i.d.; however, in general the distributions of the entries are allowed to differ (and even depend on ), so long as the entries all take values in the same structured set described above. The condition that has additive structure seems to be an artifact of the proof (in particular, at certain points in the proof of Theorem 6.1, we need the set to have cardinality at most ). The easiest way to guarantee that has the required structure is to assume that the set of values taken by all the has cardinality at most , and this is the approach we take for the corollaries in Section 3, since it also makes it easy to demonstrate that the collection of entries is -bounded of exponent .
Some corollaries of Theorem 2.2
In each corollary, we will use the definition of -bounded of exponent 1 and of exponent 2. The definition of -bounded of exponent 2 is particularly useful, since then the absolute value on the left-hand side of Inequality (10) is automatically dealt with; however, when is small (for example whenever ), one can get better bounds by using -bounded of exponent 1. We have not yet found an example where the best possible bound from Theorem 2.2 is found by using -bounded of an exponent higher than 2.
To prove Inequality (4), we note for that (using the definition in Equation (3) of )
and thus is -bounded of exponent 1 (i.e., take ), and so Inequality (4) follows from Theorem 2.2.
To prove Inequality (5), we note for that
(the inequality above may be checked by squaring both sides and expanding as polynomials in ). Thus, we can take
to see that is -bounded of exponent 1, and so Inequality (5) follows from Theorem 2.2.
To prove Inequality (6), we note for that
to see that is -bounded of exponent 2, and so Inequality (6) follows from Theorem 2.2.
2 Matrices with entries having symmetric distributions
In this subsection, we will prove a singularity bound for an by matrix for which each entry is a symmetric discrete random variable taking the value 0 with probability .
Let be a set of complex numbers with cardinality . If is an by matrix in which each entry is an independent symmetric complex random variable taking values in and taking the value 0 with probability , then
In particular, the same upper bounds as in Inequalities (4), (5), and (6) (which are shown in Figure 1) apply to the singularity probability for .
Let be an entry of . Since is symmetric and takes the value 0 with probability , we may write , where is an independent copy of as defined in Equation (3) and is a random variable that shares no values with . This description of was inspired by , and it allows us to condition on and then use the remaining randomness in to get a bound on the singularity probability. In particular,
where the sum runs over all -tuples of possible values taken by random variables . Since , we can complete the proof by proving an exponential bound on , and we will use Theorem 2.2 for this task.
We have thus shown that the entries of are
Applying Theorem 2.2 completes the proof. ∎
Corollary 3.1 is tight for , since the probability of a row of all zeroes occurring is ; however, for any specific case, Theorem 2.2 can usually prove better upper bounds than those given by Corollary 3.1.
For example, consider the case of a matrix with each entry an independent copy of the symmetric random variable
For as defined above, we have
By the definition of we have
(i.e., the right-hand side of the equation above is non-negative for such ), which proves the first bound.
for , which proves the second bound. ∎
We also have the following lower bounds for the singularity probability of :
The results of Corollary 3.2 and the corresponding lower bounds are shown in Figure 2, and one should note that the upper bounds are substantially better than those guaranteed by Corollary 3.1.
3 Random matrices with entries having arbitrary distributions
A useful feature of the definition of -bounded of exponent 2 is that it lets one bound the singularity probability of matrices with independent discrete random variables that are asymmetric.
We will need the following slightly more general corollary in Section 3.4. For a set and an integer , we will use the notation and .
Note that that Corollary 3.4 implies Corollary 3.3 by taking to be the matrix of all zeroes.
Let be an entry in . Our goal is to describe in a two-step random process, condition on one of the steps, and then use the randomness in the other step to bound the singularity probability. The conditioning approach is the same as that used in the symmetric case (Corollary 3.1) and was inspired by . The conditioning argument is useful since some entries of the random matrix may take some values with very small probability (i.e. probability less than any constant); recall that while the entries of the random matrix always take values in a fixed set of cardinality , the distributions of those random variables within are allowed to vary with . (Note that making use of Remark 8.5 would provide an alternate way of dealing with entries that take some values with very small probability.)
Say that takes the values with probabilities , respectively, where . Define new random variables such that for some and , the values taken by are with corresponding probabilities , where . Thus, we can write
As in the proof of Corollary 3.1, we will condition on the values taken by the in order to prove a bound on the singularity probability. We have that
where the sum runs over all possible values that the can take. Thus, it is sufficient to prove a bound on the singularity probability for the random matrix which has random entries
where and the are constants.
Combining Case 1 and Case 2, we have that the collection is -bounded of exponent 2, and so by and by Theorem 2.2 we have that .
The constant was chosen arbitrarily, and so letting tend to zero, we get that
4 Partially random matrices
In this subsection, we prove a bound on the singularity probability for partly random matrices where many rows are deterministic.
Corollary 3.5 applies to partly random matrices with fixed, non-random rows containing integers bounded by a constant and with random entries taking at most values in the complex numbers. Corollary 1.2, on the other hand, holds with the fixed entries also allowed to take values in the complex numbers and gives a sligtly better bound, but additionally requires (which is far smaller in general than ). Proving Corollary 1.2 requires mirroring the entire argument used to prove the main theorem (Theorem 2.2) in the case where rows contain fixed, non-random entires, and we discuss this argument in Section 9. Proving Corollary 3.5, however, can be done directly from Theorem 2.2, as we will show below. First, we will state a generalization of Corollary 3.5.
To obtain Corollary 3.6 from Corollary 3.5, find a collection of linearly independent rows among the deterministic rows. Replace the rest of the deterministic rows with a collection of rows containing integer values between and such that is linearly independent from . Finally, apply Corollary 3.5 to the new partially random matrix whose deterministic rows are from , thus proving Corollary 3.6.
By reordering the rows and columns, we may write
where is an by non-random invertible matrix, is an by non-random matrix, is an by random matrix, and is an by random matrix. Note that is singular if and only if there exists a vector such that . Let be the first coordinates of and let be the remaining coordinates. Then if and only if
Since is invertible, these two equations are satisfied if and only if , that is, if and only if the random matrix is singular.
We want to show that every entry that can appear in is an element of . By the cofactor formula for , we know that the entry of is , where is the by matrix formed by deleting the -th row and -th column of . Thus, , where the entry of is . By the volume formula for the determinant, we know that is at most the product of the lengths of the row vectors of ; and thus (here we need that has integer entries between and , where is a constant, and that ). Similarly, we have . Every entry of is thus in , every entry of is in , and every entry of is in ; thus, every entry of is an element of .
Conditioning on the values taken by all the entries in , we have
where the sum runs over all possible matrices that can produce. Considering the entries in to be fixed (note that and are fixed by assumption), we now need to bound
Note that every entry of is an element of and that the random matrix has entries that take values in the fixed set having cardinality . Thus, by Corollary 3.4, we have that
Plugging this bound back into Equation (14) completes the proof. ∎
5 Integer matrices and rational eigenvalues
Let be the random variable taking the values each with equal probability, and let be the by matrix where each entry is an independent copy of . In , Martin and Wong show that for any ,
where is a constant depending on and . (One goal in is to study this bound as goes to while is fixed, which is why is allowed to depend on .)
Below, we prove a similar result for random integer matrices with entries between and (with fixed), where we allow each entry to have a different (independent) distribution and we also allow the distributions to be very general.
Fix a positive integer , and let be a random integer matrix with independent entries, each of which takes values in the set . Let be a constant such that for every entry , we have . Then
where the term goes to zero as goes to .
For example, in the case where each independent entry has the uniform distribution on (as in ), one can set in the corollary above.
The proof given below follows the same outline as the main theorem of , with Corollary 1.2 replacing an appeal to [7, Lemma 1].
The characteristic polynomial for is monic with integer coefficients, and thus the only possible rational eigenvalues are integers (by the rational roots theorem). Every eigenvalue of has absolute value at most (see [7, Lemma 4]); thus, the only possible integer eigenvalues are between and .
The matrix has as an eigenvalue if and only if is singular (where is the by identity matrix). By Corollary 1.2 (with ), we have
Random matrices with complex entries: A reduction technique
the map is injective on , and
for any by matrix with entries , we have
Let be a constant such that and let be a characteristic zero integral domain. Let have cardinality . If is an by matrix with independent random entries, each taking values in , such that for every entry , we have , then
Proof of the main theorem (Theorem 2.2)
Let denote the -th row of . We note that has determinant zero if and only if there is a linear dependency among the rows . It has been shown (see [10, Lemma 5.1] and also ) that the dominant contribution to the singularity probability comes from the spanning a hyperplane (of dimension ). In particular,
where denotes the event that span , and non-trivial means that contains the origin, is spanned by vectors in (where is the set of all possible values that can occur in ), and for all .
As in , we will divide the non-trivial hyperplanes into classes, since it is then sufficient to show that the sum of over all in a particular class is at most .
For , we define to be the set of all non-trivial hyperplanes such that
We will refer to as the combinatorial dimension of .
Note that for (by Lemma B.1). We will consider hyperplanes with combinatorial dimension in three main regions: small, medium-sized, and large. The two lemmas and the proposition below suffice to prove Theorem 2.2.
The reasoning here is the same as in [11, Lemma 2.3], making use of fact that . In particular,
which completes the proof since the summing the right-hand side over all is at most (note that an instance of the vectors can span at most one hyperplane). ∎
where is the number of nonzero coordinates in the normal vector to . Combining the two inequalities above shows that .
(Lemma A.2 is a natural generalization of [4, Section 3.1]; see also , [10, Lemma 5.1], and [2, Lemma 14.10].) ∎
The results in and were derived using the ideas that we will use for the unexceptional medium combinatorial dimension case. The idea of considering the exceptional case separately in (and using tools from additive combinatorics in the exceptional case) is what lead to the improvement of Inequality (1), which gives a bound of asymptotically , over the bound in .
2 Proof of the medium combinatorial dimension
Before defining exceptional and unexceptional hyperplanes, we will need some new notation. By assumption, the collection of random variables is -bounded of exponent with a constant , with random variables corresponding to each , and with a constant (see Definition 2.1). We also define a constant slightly smaller than , namely . We will let denote another row vector that corresponds to the row vector ( comes from the definition of -bounded of exponent ). Also, we will let
Consider a hyperplane of medium combinatorial dimension (that is, satisfies the condition in Proposition 5.4). We say is unexceptional if there exists an where and there exists a where such that
We say is exceptional if for every where and for every where we have
Proposition 5.4 follows from the two lemmas below, so long as is chosen suitably small with respect to , , and .
We will prove Lemma 5.6 in Section 5.3, and we will prove Lemma 5.7 in Section 6.
3 The unexceptional medium combinatorial dimension case
The general idea for the case of an unexceptional hyperplane is to replace some of the rows in the matrix with rows that concentrate more sharply on the subspace . In the case where the exponent , replacing a row with is successful; however, in the exponent case, for example, replacing the entire row results in a bound that is off by an exponential factor. We solve this problem by replacing with only half of (with the other half of the entries being zero). This idea easily extends to any integer and is the motivation for defining the vectors to have all zeros except for roughly coordinates, as is done in Equation (17). The basic utility of (from Definition 5.5) is that it concentrates more sharply on the unexceptional subspace than the vector for any .
Let be the vector from the definition of unexceptional (Definition 5.5) such that for every , and set . Let be the closest integer to , where is a small positive absolute constant (for example, in , is taken to be ). Finally, let be copies of , independent of each other and of .
Let be the event that are linearly independent and lie in . Then,
The argument follows the same reasoning as [11, Lemma 4.4], however, the quantity in should be replaced by . Details are provided in Appendix B. ∎
To conclude the proof of Lemma 5.6, we follow the “row-swapping” argument at the end of [11, Section 4], with the small change of bounding by , which we use in place of the quantity . Details are provided in Appendix B.
Analyzing the exceptional medium combinatorial dimension case
The approach for exceptional in is very different from that used in the unexceptional case or in the large or small combinatorial dimension cases. Using some powerful tools from additive combinatorics, the general idea is to put exceptional hyperplanes in correspondence with a particular additive structure called a generalized arithmetic progression, and then to show that the number of the particular generalized arithmetic progressions that arise in this way is exceedingly small. The key to this approach is a structure theorem—namely, [11, Theorem 5.3]. In this section, we state a slightly modified structure theorem (Theorem 6.1), and then we show how to use Theorem 6.1 to prove Lemma 5.7. In the beginning of Section 7, we outline the changes needed to prove the the structure theorem for our current context, and in Sections 7 and 8 we provide details.
Before stating the structure theorem, we need some definitions and notation. A generalized arithmetic progression of rank is a set of the form
We will use the notation , where is a positive integer, to denote the set and the notation , where is a positive integer, to denote the set . If is a generalized arithmetic progression of rank , then so is , while , on the other hand, is a generalized arithmetic progression of rank at most . Also note that and that .
Given an exceptional hyperplane , there exists a representation of the form
and with the volume bound
(i) (Scaled defining coordinates lie in a progression) The symmetric generalized arithmetic progression
We will discuss the proof of the structure theorem in Sections 7 and 8. In the remainder of this section, we will discuss how to use the structure theorem to prove Lemma 5.7.
Fix of medium combinatorial dimension (see Proposition 5.4). Using independence of the rows, we have
In [11, Section 5], it is shown using Theorem 6.1(i) and (ii) and Gaussian-type methods (and the fact that is bounded by a constant) that
Plugging the volume bound on into the previous displayed inequality, we have
Halász-type arguments
The proof of the structure theorem has two main ingredients: tools from additive combinatorics, and Halász-type arguments using discrete Fourier analysis. Our proof of Theorem 6.1 will follow the proof of [11, Theorem 5.2] very closely. We will use results about additive combinatorics from [11, Section 6] directly, and we will discuss below the extent to which the Halász-type arguments of [11, Section 7] need to be modified to work for our current context. The proof of Theorem 6.1 will be given in Section 8 using results from the current section, [11, Section 6], [11, Section 7], and [11, Section 8]. Our Section 8 follows [11, Section 8] closely, with a few modifications to prove rational -commensurability instead of only rational commensurability.
In this section we discuss modifications to the lemmas in [11, Section 7] that are needed in order to prove Theorem 6.1.
We will use to denote the primitive character
where the last line is an application of Hölder’s inequality.
where , as defined in Section 5.2. Note that .
We will need the following analog of [11, Lemma 7.1]:
This inequality may be proven pointwise (for each after expanding out the definition of ) using the convexity of the function, just as in the proof of [11, Lemma 7.1] (see also [10, Lemma 7.1]. ∎
( since by Definition 2.1).
Thus, there is a constant such that
for every . (E.g., the constant suffices.)
There exists a constant depending on , and such that
Furthermore, for every integer we have
Our goal is to bound from above and below, and then pass to bounds on using the fact that for all .
We can choose sufficiently small with respect to and so that, for example,
for some constant . This completes the proof of Lemma 7.2. ∎
We now state and prove a lemma showing that is at least a constant for all . In , the lemma below is unnecessary because an inequality following from [11, Inequality (30)] (which corresponds to Inequality (30)) and the triangle inequality suffices.
Let and be defined as in Equation (28) and Equation (25), respectively. If , then
Note that is a constant.
Note that Inequality (29) implies that for any we have
Thus, by the triangle inequality, we have for any that
Fix . Let be the number of indices such that
and without loss of generality, say that these indices are . Squaring Inequality (34), we see that , and so we have
which is a constant. Thus, for the vast majority of the indices , namely , we have
Note that for all and that the triangle inequality holds: . We also have that
Thus, squaring Inequality (30) and summing over all , we have
In the next section, we will complete the proof of the structure theorem using the lemma above.
Proof of the Structure Theorem (Theorem 6.1)
The key to proving the structure theorem is an application of Freiman’s Theorem for finite fields.
The symmetric generalized arithmetic progression is close to what is needed for Theorem 6.1, since it satisfies the required volume and rank bounds. We will show below that can be altered in ways that preserve Inequalities (37) and (38) (except possibly for changing the implicit constants) so that satisfies conditions (i), (ii), and (iii) of Theorem 6.1.
There is an absolute constant such that the following holds. Let be a symmetric progression of rank in a abelian group , such that every nonzero element of has order at least . Then there exists a proper symmetric generalized arithmetic progression of rank at most containing such that
Furthermore, if is not proper and , then can be chosen to have rank an most
One can conclude Lemma 8.2 from the proof of [11, Lemma 9.3] (the only difference is noting that the rank can be reduced by at least 1 if is not proper to begin with). Note that we can always choose larger than .
Furthermore, if and if there is some for which is not ()-proper, then can be chosen to have rank at most .
We proceed by induction on the rank . For the base case, let and consider such that . Since has rank 1 in this case, we have that and . Combining these two equations we have , and dividing by (note that we may assume that ), we see that . Thus is -proper for every .
For , we may assume that there is some such that (i.e., we assume that is not -proper). We may assume that has the form . Let , and let denote the box .
Let be the largest integer such that , so and . Since and , we know that and ; and thus, . This shows that , which has dimensions , is not proper, since it has two distinct representations for .
We can now apply Lemma 8.2 to , thus finding a proper symmetric generalized arithmetic progression of rank at most containing (which contains ) such that
Since has rank at most , we have by induction that there exists a proper symmetric generalized arithmetic progression of rank at most containing and such that
and such that is ()-proper for every . Choosing (for example) guarantees that , which completes the induction. ∎
Since not all the are zero, we may assume that . Setting so that , we see that is contained in the symmetric generalized arithmetic progression
with rank , dimensions (which are the same as the corresponding dimensions for ), and basis vectors . By construction . ∎
If is proper, then do nothing; otherwise apply Lemma 8.2.
If satisfies the three properties given in steps 1, 2, and 3, halt; otherwise, return to step 1.
Thus, all that is left to prove is part (iii), the claim of rational -commensurability. Though we will not need it in the current section, one should recall that Theorem 6.1 is only useful when , where is the symmetric generalized arithmetic progression containing and all possible values taken by the and the (see Section 6).
We say that a set economically -spans a set if each can be represented as a highly -rational linear combination of elements in , where each coefficient may be expressed as where and where the implicit constants in the and notation are uniform over .
Comparing our definitions with those from [11, Section 8], we note that “highly rational” means the same thing as “highly -rational”, and “economically spans” means the same thing as “economically -spans”. Thus, it is clear that any highly rational number is also highly -rational for any containing , and also the statement “ economically spans ” implies “ economically -spans ” for any set containing . The remainder of this section paraphrases (with some notational changes) the latter portion of [11, Section 8].
Let be the smallest integer such that there exists a subset of cardinality of (by renumbering, say the set is ) so that for some nonzero and some we have
Note that the line above is the only place in the proof where we use the assumption from the definition of -bounded of exponent that the take values in a set with cardinality . As is evidenced here, the following weaker assumption suffices instead: say that for each there exists a set such that and such that each take a nonzero value in with probability at least . In fact, this weaker assumption also replaces the assumption in the definition of -bounded of exponent that for every : It suffices for each to take one value in with probability at least , instead of taking every value with probability at least .
Plugging this last equation into Equation (39), we arrive at
Thus, we have completed the proof of the structure theorem (Theorem 6.1).
A generalization: 𝔣𝔣\mathfrak{f} rows have fixed, non-random values
In this section, we will give a generalization of Theorem 2.2 to the case where the random matrix has rows that are assumed to be linearly independent and contain fixed, non-random entries. The proof of the generalized result is very similar to the proof of Theorem 2.2, and we will sketch the main differences in the two proofs below.
Let be an integer between 1 and , let be a subset of a ring, and let be an by matrix defined as follows. For and , let the entries of be fixed (non-random) elements of such that the rows for are linearly independent. For and , let the entries of be discrete finite random variables taking values in . Thus,
Let be a positive constant such that , let be a positive integer constant, and let be a generalized arithmetic progression in the complex numbers with rank (independent of ) and with cardinality at most . Consider the matrix with entries in (see Definition 9.1 above), where . If the collection of random variables is -bounded of exponent , then
Note that the bound on the singularity probability of for is the same as in Theorem 2.2 (since for , we have ). This is a reflection of the fact that only the large dimension case uses the randomness in all the rows simultaneously, and in that case the exponential bound does not depend on . Generally speaking, the best known lower bounds on the singularity probability of a discrete random matrix come from a dependency among at most two random rows, and since certainly has more than two random rows, the upper bounds given in Theorem 9.2 seem reasonable.
Theorem 9.2 leads to Corollary 1.2 by following a conditioning argument very similar to that given in Section 3.3.
The proof of Theorem 9.2 follows the same lines of reasoning as that of Theorem 2.2. In this subsection, we will state the main lemmas with the necessary modifications, and we will mention a few important considerations when making the modifications.
Note that Equation (15), which reduces the question of singularity to one of the rows spanning non-trivial hyperplane of dimension holds in the current context, using the same definition of and “non-trivial hyperplane” (both are defined after Equation (15) in Section 5.1).
For , we define to be the set of all non-trivial hyperplanes such that
We will refer to as the combinatorial dimension of .
The proof is the same as that for Lemma 5.2; also see , , . ∎
The proof is the same as that for Lemma 5.3, except now we appeal to Lemma A.2 with . Note that we must assume in order to apply Lemma A.2. See also ,,. ∎
Consider a hyperplane of medium combinatorial dimension (that is, satisfies the condition in Proposition 9.6). We say is unexceptional if there exists an where and there exists a where such that
We say is exceptional if for every where and for every where we have
If for some positive constant , then we have
The proof follows in the same way as that for Lemma 5.6; however, when replacing rows of with rows that concentrate more sharply on , we must take care to only replace random rows of (i.e., rows must not be replaced by ). See Appendix B for details. ∎
In the exceptional case, The same structure theorem (Theorem 6.1) holds, leading to the following lemma.
If , then
Note that this upper bound is dramatically worse than the analogous upper bound in Lemma 5.7 of .
As in Lemma 5.7, the main step in the proof is applying the structure theorem (Theorem 6.1). In the current context, Inequality (20) holds with as the exponent instead of (since there are only random rows). If we combine this modified version of Inequality (20) with Inequality (21), then we have the bound
Acknowledgments
We would like to thank Kevin Costello for helpful conversations on the conditioning argument in Subsections 3.2 and 3.3. Also the third author would like to thank the National Defense Science and Engineering Fellowship and the National Science Foundation Graduate Research Fellowship for helping fund this work.
Appendix A Two background results
For sufficiently large with respect to , , and , it is clear that we have
Our proof is closely modeled on the proof of [12, Corollary 7.13]. Let be the symmetric random variables from the definition of -bounded of exponent corresponding to (see Equation (9)). Then, we can compute
Combining the above inequalities with Inequality (44) and following the proof of [12, Corollary 7.13] to bound the integral, we have
A.2 A generalization of a lemma due to Komlós [6]
This lemma is a generalization of the result in (see also [2, Lemma 14.10], [4, Section 3.1], and [10, Lemma 5.3]).
where the constant can be taken to be , and where denotes the zero vector.
Let E_{k}=\{\mbox{there exists }v\in\Omega_{1}\mbox{ with at mostknonzero coordinates such that }N_{\mathfrak{f},n}\cdot v=0\}. Clearly,
Let be the set of all possible values that could appear as entries in , and let be the by matrix consisting of columns of . Following [6, Lemma 2] (see also [2, Lemma 14.10] and [10, Lemma 5.3]) we can write
Let be a uniform upper bound for \Pr(\mbox{rowiH}), where and is the constant from Definition 2.1 (here, we mean uniform with respect to the index sets and ). Then we have
since fixed rows of can span at most 1 hyperplane of dimension .
Summing the bounds in Inequalities (45) and (46) completes the proof. ∎
Appendix B The unexceptional case with 𝔣𝔣\mathfrak{f} fixed rows
This section is adapted from the proof of [11, Lemma 4.1], and proves Lemma 5.6 by setting . Assume that , and let be the closest integer to . Let be i.i.d. copies of the unexceptional row vector from Definition 9.7, so for all . We will need the following version of the Weighted Odlyzko Lemma:
[cf. [11, Lemma 4.3] or [4, Section 3.2]] For , let be an ()-dimensional subspace containing (which are fixed, linearly independent row vectors). Then
Since has dimension , there exists a set of “determining” coordinates such that if a vector , then the “determining” coordinates determine the values of the remaining coordinates. Since the maximum probability that any of the random coordinates in takes a given value is at most , and since there are at least of the random coordinates in that are not among the “determining” coordinates, we have the desired upper bound. ∎
Let , the space spanned by the fixed rows, and for let be the event that are linearly independent in . We have the following analog of Lemma 5.8 (and also [11, Lemma 4.4]):
Let , , and be as defined above. Then,
where denotes the full space of the . Conditioning on a particular instance of in , we have that
where denotes the ()-dimensional space spanned by and . We will now establish a uniform bound that does not depend on which particular instance of in that we fixed by conditioning. By the definition of unexceptional, we have
and by the Weighted Odlyzko Lemma (see Lemma B.1), we have
Using Taylor’s Theorem with remainder (for example), one can show that
and plugging this estimate back into Inequality (47) we get
To conclude Lemma 9.8 (which implies Lemma 5.6 by setting ), we will proceed as in the proof for [11, Lemma 4.1].
Let be i.i.d. copies of that are independent of the random rows . Using independence and Bayes’ Identity we have
Because the are linearly independent in , we know that there is a subset of cardinality , such that spans . Let be the event that spans . Then we have
Summing the above inequality over all unexceptional (note that ) and combining with the bound for above gives us
This completes the proof of the estimate for unexceptional .