Constructive noncommutative rank computation is in deterministic polynomial time

Gábor Ivanyos, Youming Qiao, K. V. Subrahmanyam

Introduction

This paper builds on the work reported in our previous paper [IQS17]. In the interest of keeping this paper self contained we introduce the problem again, recall its connections to invariant theory and operator theory, and describe recent progress on this problem including our work, [IQS17], the work of Garg, Gurvits, Oliviera and Wigderson [GGOW16], and that of Derksen and Makam [DM17b]. As a result this introduction overlaps with the introduction in [IQS17]. Readers who are familiar with [IQS17] can skip straight to 1.2 where we describe the new results in this paper.

Cohn showed that the non-commutative rank is not full if and only if there is a shrunk subspace [Coh95]. This was generalized by Fortin and Reutenauer [FR04, Theorem 1], where the authors showed

It follows that the non-commutative rank of the operator $T$ is a property of the matrix space $\mathcal{B}$ and does not depend upon its presentation $T$ . So it is natural to consider the problem of determining the maximum $c$ such that $\mathcal{B}$ has a $c$ -shrunk subspace.

Rank decreasing operator:

The null cone for the left right action:

Blow-ups:

We describe this algorithm in the next section. After that we describe further progress on the non-commutative rank from the works of Garg et al [GGOW16] and Derksen and Makam [DM17b]. We then state the main theorem of the paper.

1 Outline of the algorithm in [IQS17]

The algorithm in [IQS17] can be viewed as an analogue of the augmenting path algorithm for the bipartite maximum matching problem. However, due to the failure of the analogue of Hall’s marriage theorem in the matrix space setting, there are a couple of new and sophisticated components.

Let us briefly review some features of the augmenting path algorithm. Given a matching $T$ for the input bipartite graph $G=(L\cup R,E)$ , the algorithm tries to find an augmenting path for $T$ . If an augmenting path is found, $T$ is replaced by a larger matching $T^{\prime}$ . If no augmenting paths can be found, the algorithm can output a shrunk subset as the certificate of the maximality of $T$ .

The main ingredient of the algorithm in [IKQS15] was a method to find such a $B\in\mathcal{B}$ in certain special cases. The idea in [IQS17] is that, if we relax ourselves to work with ${\mathcal{B}}^{\{d\}}$ , then this can be achieved for every matrix space $\mathcal{B}$ .

We reproduce the constructive regularity lemma from [IQS17] below.

B: Iterating over.

2 Progress on non-commutative rank since 2015.

Recall that an important question was to upper bound $\sigma(R(n,m))$ , and exponential bounds were established in [Der01] and [IQS17]. These turned out to be sufficient for [GGOW16] to compute the non-commutative rank in deterministic polynomial time, over fields of characteristic zero, by a more refined analysis of Gurvits’ algorithm in [Gur04]. After [GGOW16], the following problems were still open:

a polynomial-time algorithm for the problem over finite fields, and

a search version of the problem, that is, explicitly exhibiting a matrix of rank $rd$ in the $d$ -th blow-up and a proof that the non-commutative rank is at most $r$ , even over fields of characteristic 0.

Recently, Derksen and Makam[DM17b] proved that it suffices to take the maximum over $d$ between $1$ and $n-1$ , for sufficiently large fields, by discovering a concavity property of blow-ups, and using the regularity lemma of blow-ups from [IQS17]. In the first version of this note, by showing that the concavity property can be made constructive, and building on the techniques from [IQS17], we obtained a deterministic polynomial-time algorithm for the non-commutative rank problem, which is constructive and works over large enough fields regardless of the characteristic. This answers the two open problems just mentioned.

After the first version of this note appeared on the arXiv, we discovered that a very simple observation already gives us the result, without having to use the results from Derksen and Makam. This argument also gives a different proof that the nullcone of the matrix semi-invariants is generated by polynomials in $R(n,m)$ of degree less than or equal to $O(n^{2})$ . We should point out that recently Derksen and Makam [DM17a] also gave a second proof of the regularity lemma. However their proof is not known to be constructive.

We now state our main result and the contributions of this paper.

If the constructivized version of Derksen and Makam [DM17b] is used, then in the above theorem we can improve the parameter slightly to $d\leq r-1$ instead of $d\leq r+1$ .

In particular, the non-commutative rank can be computed in deterministic polynomial time in positive characteristic as well, assuming that the ground field is sufficiently large.

Our result also settles a question of Gurvits [Gur04], asking if it is possible to decide efficiently, over fields of positive characteristic, whether or not there exists a non-singular matrix in a matrix space having the Edmonds-Rado property. Recall that a matrix space has the Edmonds-Rado property if it satisfies the promise that it either contains a non-singular matrix, or it shrinks some subspace. Since the algorithm in 1.5 efficiently tells whether the given matrix space has a shrunk subspace (e.g. the non-commutative rank is not full), it settles Gurvits’ question, when the field size is as stated in the hypothesis.

From the above, we have seen a polynomial upper bound on $\sigma(R(n,m))$ , and settled the non-commutative rank problem as well as SDIT for the Edmonds-Rado class, provided that the underlying field is large enough. However we can say more, even when the base field is a “too small” finite field.

Techniques.

As described in the the iterating over step in Section1.1, the algorithm in [IQS17] takes exponential time because we increase the blow-up size in an iterative way, and in each iteration the blow-up size is increased multiplicatively by the “scaled” rank. The key new insight is that we can keep the blow-up size small: when the scaled rank is $r$ , then the blow-up size can be brought back to $O(r)$ . As mentioned, we offer two methods to realize this reduction idea: a simpler method from us, and a method based on the technique of Derksen and Makam [DM17b].

Organization.

In Section 2 we first discuss algorithmic issues that arise when working over finite extensions of fields and how they are solved. Since all this appears with detailed proofs in our previous paper we only provide pointers to these issues and refer to [IQS17] for details. In Section 3 we give an efficient construction of cyclic field extensions of arbitrary degrees. In Section 4 we use this to prove the full regularity lemma. In Section 5.1 we prove the main Theorem 1.5 using our blow-up reduction method. In Section 5.2 we give the proof for Corollaryr̃efcor:small. Finally in Section 6 we show that the Derksen–Makam technique can be constructivized to provide another blow-up reduction method.

Preparations on certain algorithmic issues

In this section we highlight algorithmic issues which need to be addressed to ensure that our algorithms run in polynomial time. All these issues have been addressed in our earlier paper. So we only indicate briefly where these issues arise and what needs to be done. For details and proofs the really interested reader should refer to [IQS17].

Dealing with the need for a primitive root of unity.

Computing the rank of matrices over a rational function field in few variables.

Efficient construction of cyclic field extensions of arbitrary degrees

We consider the following $K_{0}$ -basis for $K_{s}$ :

The complete constructive regularity lemma

The proof makes use of the following two results from [IQS17].

Proof of the main theorem

In Section 5.1 we prove Theorem 1.5, and in Section 5.2 we deal with the small field case. The main drawback of our earlier algorithm discussed in Section 1.1 was that the blow-up size increases exponentially. However, a simple reduction procedure as described in Lemma 5.2 below readily implies that, once we find $A^{\prime}$ of rank $r^{\prime}d$ in ${\mathcal{B}}^{\{d\}}$ , we can efficiently reduce $d$ to be no more than $r^{\prime}+1$ . This means that we can always ensure that the blow-up factor is small, which is the key to reducing the complexity of the algorithm from exponential time to polynomial time. We shall make the above idea rigorous in the next subsection.

We first recall some preparation material from [IQS17].

The main technical ingredient of our algorithm is an improvement of [IQS17, Theorem 5.10], discussed in Section 2 . It states that either a shrunk subspace witnessing that the (scaled-down) rank of a matrix in a blow-up reaches the non-commutative rank or a matrix in a larger blow-up having larger scaled-down rank can be efficiently constructed. For completeness we give all the details and also the proof even though it is identical to that in our earlier paper excepting for the last step.

Starting with the kernel $V_{0}$ of the linear map $A$ we compute the image $W_{1}$ of $V_{0}$ under $\mathcal{A}$ . If $W_{1}$ is not in the image of $A$ we stop and declare $W^{*}=W_{1}$ . Otherwise we define $V_{1}$ to be the preimage of $W_{1}$ under $A$ and define $W_{2}$ to be the image of $V_{1}$ under $\mathcal{A}$ . We continue doing so, at each step checking if $W_{i}$ is in the image of $A$ or not. Since at each step the dimension of $W_{i}$ increases by $d$ it is clear that we halt in $l$ steps with $l$ at most $r+1$ , obtaining the limit subspace $W^{*}=W_{l}$ . If $W^{*}$ is in the image of $A$ , it follows from Fact 1.3 that the preimage of $W_{l}$ under $A$ is an $(n-r)d$ -shrunk subspace. In either case in at most $r+1$ steps we find a shrunk subspace or find that $W^{*}$ is not in the image of $A$ .

It is clear that the matrices $B_{1},\ldots,B_{l}$ as well as $\mu$ can be determined in the given polynomial time. ∎

To obtain the algorithm for Theorem 1.5, the regularity lemma needs to be accompanied with a reduction procedure that keeps the blow-up parameter small. We mentioned in the introduction that there are two methods for this purpose, and in this section we use our method. The method based on the Derksen-Makam technique is presented in Section 6.

2 Proof of Corollary 1.7: the case of small finite fields

We only need to prove Corollary 1.7 (2), from which (1) and (3) are immediate.

Constructivizing the result of Derksen and Makam

When (5) is violated then we can apply 4.1.

And here is essentially Proposition 2.10 of [DM17b]. We include a proof (which is almost literally the same as the proof in [DM17b]) here for completeness. We note that this lemma deals only with the property of certain families of functions, without referring to matrices.

By $r(d+1,d+1)=n(d+1)$ , for $1\leq a<d+1$ ,

As by assumption $r(d+1,a)\leq an$ , we have $r(d+1,a)=an$ . Similarly $r(a,d+1)=an$ for $1\leq a<d+1$ .

From $d\geq n-1$ it is inferred easily that $-n+\frac{n}{d}>-d$ . Therefore $nd-n+\frac{n}{d}>(n-1)d$ . By (5) we conclude that $r(d,d)=nd$ . ∎

We finally remark that, if we use Lemma 6.1 in the proof of Theorem 1.5, then $n$ in the statement of the lemma will be $r+1$ , $N$ will be $d^{\prime}$ , $M_{0}(d^{\prime},d^{\prime})$ is the nonsingular $(r+1)d^{\prime}\times(r+1)d^{\prime}$ block of $A$ and $M_{0}(p,q)$ can be actually even the zero matrix for $(p,q)\neq(d^{\prime},d^{\prime})$ . It will prepare matrices in several not necessarily square blow-ups, among others, most importantly, one in an $(r,r)$ -blow-up with a similar content as described in the proof of Theorem 1.5.

We would like to thank the authors of [GGOW16] and of [DM17b] for sharing their ideas with us and making us possible to read early versions of their manuscripts. Part of the work was done when Gábor and Youming were visiting the Centre for Quantum Technologies at the National University of Singapore. Research of the first author was also supported in part by the Hungarian National Research, Development and Innovation Office – NKFIH Grant 115288. Youming’s research was supported by the Australian Research Council DECRA DE150100720. KV’s research was supported by a grant from the Infosys foundation.