Randomized Composable Core-sets for Distributed Submodular Maximization

Vahab Mirrokni, Morteza Zadimoghaddam

Introduction

An effective way of processing massive data is to first extract a compact representation of the data and then perform further processing only on the representation itself. This approach significantly reduces the cost of processing, communicating and storing the data, as the representation size can be much smaller than the size of the original data set. Typically, the representation provides a smooth tradeoff between its size and the representation accuracy. Examples of this approach include techniques such as sampling, sketching, (composable) core-sets and mergeable summaries. Among these techniques, the concept of composable core-sets has been employed in several distributed optimization models such as nearest neighbor search , and the streaming and MapReduce models . Roughly speaking, the main idea behind this technique is as follows: First partition the data into smaller parts. Then compute a representative solution, referred to as a core-set, from each part. Finally, obtain a solution by solving the optimization problem over the union of core-sets for all parts. While this technique has been successfully applied to diversity maximization and clustering problems , for coverage and submodular maximization problems, impossibility bounds are known for this technique .

In this paper, we focus on efficient construction of a randomized variant of composable core-sets where the above idea is applied on a random clustering of the data. We employ this technique for the coverage, monotone and non-monotone submodular problems. Our results significantly improve upon the hardness results for non-randomized core-sets, and imply improved results for submodular maximization in a distributed and streaming settings. The effectiveness of this technique has been confirmed empirically for several machine learning applications , and our proof provides a theoretical foundation to this idea. Let us first define this concept, and then discuss its applications, and our results.

Here, we discuss the formal problem definition, and the distributed model motivating it.

Given an integer size constraint kk, we let fkf_{k} be

where the expectation is taken over the random choice of {T1,T2,,Tm}\{T_{1},T_{2},\ldots,T_{m}\}. For brevity, instead of saying that ALG\mathsf{ALG} implements a composable core-set, we say that ALG\mathsf{ALG} is an α\alpha-approximate randomized composable core-set.

For ease of notation, when it is clear from the context, we may drop the term composable, and refer to composable core-sets as core-sets. Throughout this paper, we discuss randomized composable core-sets for the submodular maximization problem with a cardinality constraint kk.

Distributed Approximation Algorithm. Note that we can use a randomized α\alpha-approximate composable core-set algorithm ALG\mathsf{ALG} to design the following simple distributed (11e)α(1-{1\over e})\alpha-approximation algorithm for monotone submodular maximization:

Each machine ii computes a randomized composable core-set SiTiS_{i}\subseteq T_{i} of size kk^{\prime}, i.e., Si=ALG(Ti)S_{i}=\mathsf{ALG}(T_{i}) for each 1im1\leq i\leq m.

In the second phase, first collect the union of all core-sets, U=1imSiU=\cup_{1\leq i\leq m}S_{i}, on one machine. Then apply a post-processing (11e)(1-{1\over e})-approximation algorithm (e.g., algorithm Greedy\mathsf{Greedy}) to compute a solution SS to the submodular maximization problem over the set UU. Output SS.

It follows from the definition of the α\alpha-approximate randomized composable core-set that the above algorithm is a distributed (11e)α(1-{1\over e})\alpha-approximation algorithm for submodular maximization problem. We refer to this two-phase algorithmic approach as the distributed algorithm, and the overall approximation factor of the distributed algorithm as the distributed approximation factor. For all our algorithms in this paper, in addition to presenting an algorithm that achieves an approximation factor α\alpha as a randomized composable core-set, we propose a post-processing algorithm for the second phase, and present an improved analysis that achieves much better than (11e)α(1-{1\over e})\alpha-approximation as the distributed approximation factor.

Note that the above algorithm can be implemented in a distributed manner only if kk^{\prime} is small enough such that mkmk^{\prime} items can be processed on one machine. In all our results the size of the composable core-set, kk^{\prime}, is a function of the cardinality constraint, kk: In particular, in Section 2, we apply a composable core-set of size k=kk^{\prime}=k. In Section 4, we apply a composable core-set of size k<4kk^{\prime}<4k, and as a result, achieve a better approximation factor. We call a core-set, a small-size core-set, if its size kk^{\prime} is less than kk (See Section 5). As we will see, the hardness results for small-size core-sets are much stronger than that of core-sets of size kk or larger.

Non-randomized Composable Core-sets. The above definition for randomized composable core-sets is introduced in this paper. Prior work define a non-randomized variant of composable core-sets where the above property holds for any (arbitrary) partitioning {T1,T2,,Tm}\{T_{1},T_{2},\ldots,T_{m}\} of data into mm parts It is not hard to see that for non-randomized composable core-set, the multiplicity parameter CC is not relevant., i.e., an algorithm ALG\mathsf{ALG} as described above is a α\alpha-approximate (non-randomized) composable core-set of size kk^{\prime} for ff, if for any cardinality constraint kk, and any arbitrary partitioning {T1,T2,,Tm}\{T_{1},T_{2},\ldots,T_{m}\} of the items into mm sets, we have fk(ALG(T1)ALG(Tm))αfk(T1Tm)f_{k}(\mathsf{ALG}(T_{1})\cup\ldots\cup\mathsf{ALG}(T_{m}))\geq{\alpha}\cdot f_{k}(T_{1}\cup\ldots\cup T_{m}).

2 Applications and Motivations

An α\alpha-approximate randomized composable core-set of size k=O(k)k^{\prime}=O(k) for a problem can be applied in three types of applications These results assume kn1ϵk\leq n^{1-\epsilon} for a constant ϵ\epsilon.: (i) in distributed computation , where it implies an α\alpha-approximation in one or two rounds of MapReduces using the total communication complexity of O(n)O(n), (ii) in the random-order streaming model, where it implies an α\alpha-approximation algorithm in one pass using sublinear memory, (iii) in a class of approximate nearest neighbor search problems, where it implies an α\alpha-approximation algorithm based on the locally sensitive hashing (under an assumption). Here, we discuss the application for the MapReduce and Streaming framework, and for details of the approximate nearest neighbor application, we refer to .

We first show how to use a randomized composable core-set of size O(k)O(k) to design a distributed algorithm in one or two rounds of MapReduces The straightforward way of applying the ideas will result in two rounds of MapReduce. However, if we assume that the data is originally sharded randomly and each part is in a single shard, and the memory for each machine is more than the size of each shard, then it can be implemented via one round of MapReduce computation. using linear total communication complexity: Let m=n/km=\sqrt{n/k}, and let (T1,,Tm)(T_{1},\ldots,T_{m}) be a random partitioning where TiT_{i} has kn\sqrt{kn} items. In the distributed algorithm, we assume that the random partitioning is produced in one round of MapReduce where each of mm reducers receives TiT_{i} as input, and produces a core-set SiS_{i} for the next round. Alternatively, we may assume that the data (or the items) are distributed uniformly at random among machines, or similarity each of mm mappers receives TiT_{i} as input, and produces a core-set SiS_{i} for the reducer. In either case, the produced core-sets are passed to a single reducer in the first or the second round. The total input to the reducer, i.e., the union of the core-sets, is of size at most mk=O(k)n/k=O(kn)mk^{\prime}=O(k)\sqrt{n/k}=O(\sqrt{kn}). The solution computed by the reducer for the union of the core-sets is, by definition, a good approximation to the original problem. It is easy to see that the total communication complexity of this algorithm is O(n)O(n), and this computation can be performed in one or two rounds as formally defined in the MapReduce computation model .

Next, we elaborate on the application for a streaming computation model: In the random-order data stream model, a random sequence of nn data points needs to be processed “on-the-fly” while using only limited storage. An algorithm for a randomized composable core-set can be easily used to obtain an algorithm for this setting The paper introduced this approach for the special case of kk-median clustering. More general formulation of this method with other applications appeared in .. In particular, if a randomized composable core-set for a given problem has size kk, we start by dividing the random stream of data into n/k\sqrt{n/k} blocks of size s=nks=\sqrt{nk}. This way, each block will be a random subset of items. The algorithm then proceeds block by block. Each block is read and stored in the main memory, its core-set is computed and stored, and the block is deleted. At the end, the algorithm solves the problem for the union of the core-sets. The whole algorithm takes only O(kn)O(\sqrt{kn}) space. The storage can be reduced further by utilizing more than one level of compression, at the cost of increasing the approximation factor.

Variants of the composable core-set technique have been applied for optimization under MapReduce framework . However, none of these previous results formally study the difference between randomized and non-randomized variants and in most cases, they employ non-randomized composable core-sets. Indyk et al. observed that the idea of non-randomized composable core-sets cannot be applied to the coverage maximization (or more generally submodular maximization) problems. In fact, all our hardness results also apply to a class of submodular maximization problems known as the maximum kk-coverage problems, i.e., given a number kk, and a family of subsets A2X{\cal A}\subset 2^{X}, find a subfamily of kk subsets A1,,AkA_{1},\ldots,A_{k} whose union j=1kAj\cup_{j=1}^{k}A_{j} is maximized. Solving max kk-coverage and submodular maximization in a distributed manner have attracted a significant amount of research over the last few years . Other than the importance of these problems, one reason for the popularity of this problem in this context is the fact that its approximation algorithm is algorithm Greedy\mathsf{Greedy} which is naturally sequential and it is hard to parallelize or implement in a distributed manner.

3 Our Contributions

Our results are summarized in Table 1. As our first result, we prove that a family of efficient algorithms including a variant of algorithm Greedy\mathsf{Greedy} with a consistent tie-breaking rule leads to an almost 1/31/3-approximate randomized composable core-set of size kk for any monotone submodular function and cardinality constraint kk with multiplicity of 11 (see Section 2). This is in contrast to a known O(logkk)O({\log k\over\sqrt{k}}) hardness result for any (non-randomized) composable core-set , and shows the advantage of using the randomization here. Furthermore, by constructing this randomized core-set and applying algorithm Greedy\mathsf{Greedy} afterwards, we show a 0.270.27 distributed approximation factor for the monotone submodular maximization problem in one or two rounds of MapReduces with a linear communication complexity. Previous results lead to algorithms with either much larger number of rounds of MapReduce , and/or larger communication complexity . This improvement is important, since the number of rounds of MapReduce computation and communication complexity are the most important factors in determining the performance of a MapReduced-based algorithm . The effectiveness of using this technique has been confirmed empirically by Mirzasoleiman et al who studied a similar algorithm on a subclass of submodular maximization problems. However, they only provide provable guarantees for a subclass of submodular functions satisfying a certain Lipchitz condition . Our result not only works for monotone submodular functions, but also extends to non-monotone (non-negative) submodular functions, and leads to the first constant-round MapReduce-based constant-factor approximation algorithm for non-monotone submodular maximization (with O(n)O(n) total communication complexity and approximation factor of 0.180.18). It also leads to the first constant-factor approximation algorithm for non-monotone submodular maximization in a random-order streaming model in one pass with sublinear memory.

Our next goal is to improve the approximation factor of the above algorithm for monotone submodular functions. To this end, we first observe that one cannot achieve a better than the 1/21/2 factor via core-sets of size kk using algorithm Greedy\mathsf{Greedy} or any algorithm in a family of local search algorithms. In Section 4, we show how to go beyond the 1/21/2-approximation by applying core-sets of size higher than kk but still of size O(k)O(k), and prove that algorithm Greedy\mathsf{Greedy} with a consistent tie-breaking rule provides a 0.5850.585-approximate randomized composable core-set of size k<4kk^{\prime}<4k for our problem. We then present algorithm PseudoGreedy\mathsf{PseudoGreedy} that can be applied as a post-processing step to design a distributed 0.5450.545-approximation algorithm in one or two rounds of MapReduces, and with linear total communication complexity. For monotone submodular maximization, this result implies the first distributed approximation algorithm with approximation factor better than 1/21/2 that runs in a constant number of rounds. We achieve this approximation factor using one or two rounds of MapReduces and with the total communication complexity of O(n)O(n). In addition, this result implies the first approximation algorithm beating the 1/21/2 factor for the random-order streaming model with constant number of passes on the data and sublinear memory. To complement this result, we first show that our analysis for algorithm Greedy\mathsf{Greedy} is tight. Moreover, we show that it is information theoretically impossible to achieve an approximation factor better than 11/e1-{1/e} using a core-set with size polynomial in kk.

Finally, we consider the construction of small-size core-sets, i.e., a core-set of size k<kk^{\prime}<k. Studying such core-sets is important particularly for cases with large parameter kk, e.g., k=Ω(n)k=\Omega(n) or k=nlognk={n\over\log n} For such large kk, a core-set of size kk may not be as useful since outputting the whole core-set may be impossible. For example, in the formal MapReduce model , outputting a core-set of size kk for k=Ω(n)k=\Omega(n) is not feasible.. For our problem, we first observe a hardness bound of O(kk)O({k^{\prime}\over k}) for non-randomized core-sets. On the other hand, in Subsection 5.2, we show an Ω(kk)\Omega(\sqrt{k^{\prime}\over k})-approximate randomized composable core-set for this problem, and accompany this result by a matching hardness bound of O(kk)O(\sqrt{k^{\prime}\over k}) for randomized composable core-set. The hardness result is presented in Subsection 5.1.

4 Other Related Work.

Submodular Maximization in Streaming and MapReduce: Solving max kk-coverage and submodular maximization in a distributed manner have attracted a significant amount of research over the last few years . From theoretical point of view, for the coverage maximization problem, Chierchetti et al. present a (11/e)(1-1/e)-approximation algorithm in polylogarithmic number of MapReduce rounds, and Belloch et al improved this result and achieved log2n\log^{2}n number of rounds. Recently, Kumar et al. present a (11/e)(1-1/e)-approximation algorithm using a logarithmic number of rounds of MapReduces. They also derive (1/2ϵ)(1/2-{\epsilon})-approximation algorithm that runs in O(1δ)O({1\over\delta}) number of rounds of MapReduce (for a constant δ\delta), but this algorithm needs a logn\log n blowup in the communication complexity. As observed in various empirical studies , the communication complexity and the number of MapReduce rounds are important factors in determining the performance of a MapReduce-based algorithm and a logn\log n blowup in the communication complexity can play a crucial role in applicability of the algorithm in practice. Our algorithm on the other hand runs only in (one or) two rounds, and can run on any number of machines as long as they can store the data, i.e, it needs mm machines each with memory proportional to 1m{1\over m} of the size of the input. One previous attempt to apply the idea of core-sets for submodular maximization is by Indyk et al. who rule out the applicability of non-randomized core-sets by showing a hardness bound of O(logkk)O({\log k\over\sqrt{k}}) for non-randomized core-sets. The most relevant previous attempt in applying randomized core-sets to submodular maximization is by Mirzasoleiman et al , where the authors study a class of algorithms similar to the ones discussed here, and show the effectiveness of applying algorithm Greedy\mathsf{Greedy} over a random partitioning empirically for several machine learning applications. The authors also prove theoretical guarantees for algorithm Greedy\mathsf{Greedy} for special classes of submodular functions satisfying a certain Lipschitz condition . Here, on the other hand, we present guaranteed approximation results for all monotone and non-monotone submodular functions. In fact, Ashwinkumar and Karbasi observed that if one applies the greedy algorithm without a consistent tie-breaking rule, the approximation factor of the algorithm is not bounded for coverage functions. In this paper, we prove that a class of algorithms including greedy with a consistent tie-breaking rule provide a guaranteed 0.270.27-approximation algorithm for all monotone submodular functions. We also show how to achieve an improved 0.540.54-approximation using slightly larger core-sets of size O(k)O(k). Finally, there is a recent paper in the streaming model in which the authors present a streaming 1/21/2-approximation algorithm with one pass and linear memory. Our results lead to improved results for random-order streaming model for monotone and non-monotone submodular maximization.

Core-sets: The notion of core-sets has been introduced in . In this paper, we use the term core-sets to refer to “composable core-sets” which was formally defined in a recent paper . This notion has been also implicitly used in Section 5 of Agarwal et al. where the authors specify composability properties of ϵ\epsilon-kernels (a variant of core-sets). The notion of (composable) core-sets are also related to the concept of mergeable summaries that have been studied in the literature . As discussed before, the idea of using core-sets has been applied either explicitly or implicitly in the streaming model and in the MapReduce framework . Moreover, notions similar to randomized core-sets have been studied for random-order streaming models . Finally, the idea of random projection in performing algebraic projections can be viewed as a related topic but it does not discuss the concept of composability over a random partitioning.

5 More Notation

Consider a submodular set function ff. Let ALG\mathsf{ALG} be an algorithm that given any subset TT returns subset ALG(T)T{\mathsf{ALG}}(T)\subseteq T with size at most kk^{\prime}. We say that this Algorithm ALG\mathsf{ALG} is a β\beta-nice algorithm for function ff and some parameter β\beta iff for any set TT and any item xTALG(T)x\in T\setminus{\mathsf{ALG}}(T) (item xx is in set TT but is not selected in the output of algorithm ALG\mathsf{ALG}), then the following two properties hold:

Set ALG(T{x}){\mathsf{ALG}}(T\setminus\{x\}) is equal to ALG(T){\mathsf{ALG}}(T), i.e., intuitively the output of the algorithm should not depend on the items it does not select, and

Δ(x,ALG(T))\Delta(x,{\mathsf{ALG}}(T)) is at most βf(ALG(T))k\beta\frac{f({\mathsf{ALG}}(T))}{k^{\prime}}. In other words, the marginal ff value of any not-selected item cannot be more than β\beta times the average contribution of selected items.

Randomized Core-sets for Submodular Maximization

In this section, we show that a family of β\beta-nice algorithms, introduced in Section 1.5, leads to a constant-factor approximate randomized composable core-set, and a constant-factor distributed approximation algorithm for monotone and non-monotone submodular maximization problems with cardinality constraints. Later, in Subsection 2.2, we show that several efficient algorithms in the literature of submodular maximization are β\beta-nice for some β[1,1+ϵ]\beta\in[1,1+\epsilon] (for ϵ=o(1)\epsilon=o(1)) including some variant of algorithm Greedy\mathsf{Greedy} with a consistent tie-breaking rule, and also an almost linear-time algorithm in . Before stating the theorem, we emphasize that in this section, we apply a composable core-set of multiplicity 1 which corresponds to a random partitioning of items into mm disjoint pieces.

For any β>0\beta>0, any β\beta-nice algorithm ALG\mathsf{ALG} is a 12+β\frac{1}{2+\beta}-approximate randomized composable core-set of multiplicity 1 and size kk for the monotone, and 11m2+β\frac{1-\frac{1}{m}}{2+\beta}-approximate for non-monotone submodular maximization problems with cardinality constraint kk.

We want to show that there exists a subset of i=1mSi\cup_{i=1}^{m}S_{i} with size at most kk, and at least an expected ff value of f(\textscOPT)2+β\frac{f(\textsc{OPT}{})}{2+\beta} for monotone ff, and (11/m)f(\textscOPT)2+β\frac{(1-1/m)f(\textsc{OPT}{})}{2+\beta} for non-monotone ff. Toward this goal, we take the maximum of max1imf(Si)\max_{1\leq i\leq m}f(S_{i}) and f(\textscOPTS)f(\textsc{OPT}{}^{S}) as a candidate solution. We define some notation to simplify the rest of the proof. Consider an arbitrary permutation π\pi on items of OPT, and for each item x\textscOPTx\in\textsc{OPT}{} let \textscOPTx\textsc{OPT}{}^{x} be the set of items in π\pi that appear before xx. We will first lower bound f(\textscOPTS)f(\textsc{OPT}{}^{S}) using the submodularity property in Lemma 2.2.

For any set of selected items, f(\textscOPTS)f(\textscOPT)x\textscOPT(i=1mSi)Δ(x,\textscOPTx)f(\textsc{OPT}{}^{S})\geq f(\textsc{OPT}{})-\sum_{x\in\textsc{OPT}{}\setminus(\cup_{i=1}^{m}S_{i})}\Delta(x,\textsc{OPT}{}^{x}) for a monotone or non-monotone submodular function ff.

First we note that f(\textscOPT)f(\textscOPTS)=x\textscOPT\textscOPTSΔ(x,\textscOPTx\textscOPTS)f(\textsc{OPT}{})-f(\textsc{OPT}{}^{S})=\sum_{x\in\textsc{OPT}{}\setminus\textsc{OPT}{}^{S}}\Delta(x,\textsc{OPT}{}^{x}\cup\textsc{OPT}{}^{S}). Using submodularity property, we know that Δ(x,\textscOPTx\textscOPTS)\Delta(x,\textsc{OPT}{}^{x}\cup\textsc{OPT}{}^{S}) is at most Δ(x,\textscOPTx)\Delta(x,\textsc{OPT}{}^{x}) because \textscOPTx\textsc{OPT}{}^{x} is a subset of \textscOPTx\textscOPTS\textsc{OPT}{}^{x}\cup\textsc{OPT}{}^{S}. Therefore f(\textscOPT)f(\textscOPTA)x\textscOPT\textscOPTSΔ(x,\textscOPTx)f(\textsc{OPT}{})-f(\textsc{OPT}{}\cap A)\leq\sum_{x\in\textsc{OPT}{}\setminus\textsc{OPT}{}^{S}}\Delta(x,\textsc{OPT}{}^{x}) which is equal to x\textscOPT(i=1mSi)Δ(x,\textscOPTx)\sum_{x\in\textsc{OPT}{}\setminus(\cup_{i=1}^{m}S_{i})}\Delta(x,\textsc{OPT}{}^{x}) by definition of \textscOPTS\textsc{OPT}{}^{S}. This concludes the proof. ∎

Lemma 2.2 suggests that we should upper bound x\textscOPT(i=1mSi)Δ(x,\textscOPTx)\sum_{x\in\textsc{OPT}{}\setminus(\cup_{i=1}^{m}S_{i})}\Delta(x,\textsc{OPT}{}^{x}) which is done in the next lemma.

The sum i=1mx\textscOPTTiSiΔ(x,\textscOPTx)\sum_{i=1}^{m}\sum_{x\in\textsc{OPT}{}\cap T_{i}\setminus S_{i}}\Delta(x,\textsc{OPT}{}^{x}) is at most β(max1imf(Si))+\beta\left(\max_{1\leq i\leq m}f(S_{i})\right)+ i=1mx\textscOPTTiSi\sum_{i=1}^{m}\sum_{x\in\textsc{OPT}{}\cap T_{i}\setminus S_{i}} (Δ(x,\textscOPTx)Δ(x,\textscOPTxSi))\left(\Delta(x,\textsc{OPT}{}^{x})-\Delta(x,\textsc{OPT}{}^{x}\cup S_{i})\right) for a monotone or non-monotone submodular function ff.

The sum i=1mx\textscOPTTiSiΔ(x,\textscOPTx)\sum_{i=1}^{m}\sum_{x\in\textsc{OPT}{}\cap T_{i}\setminus S_{i}}\Delta(x,\textsc{OPT}{}^{x}) can be written as:

The first term in the sum is upper bounded by βf(Si)k\beta\frac{f(S_{i})}{k} using the second property of β\beta-nice algorithms. To conclude the proof, we apply inequality f(Si)max1imf(Si)f(S_{i})\leq\max_{1\leq i^{\prime}\leq m}f(S_{i^{\prime}}), and use the fact that there are at most kk items in \textscOPT\textscOPTS=i=1m(\textscOPTTiSi)\textsc{OPT}{}\setminus\textsc{OPT}{}^{S}=\cup_{i=1}^{m}(\textsc{OPT}{}\cap T_{i}\setminus S_{i}). ∎

At this stage of the analysis, we use randomness of partition {Ti}i=1m\{T_{i}\}_{i=1}^{m} to upper bound the expected value of these differences in Δ\Delta values with the expected value of average of f(Si)f(S_{i}). This is stated in the following lemma:

Before proving the lemmas, we observe that putting these three lemmas together, we can finish the proof of the theorem. In particular, for a monotone or non-monotone submodular function ff, we have that:

Proof of Lemma 2.4. The main part of the proof is to show that the sum of the Δ\Delta differences in the statement of the lemma is in expectation at most 1m\frac{1}{m} fraction of sum of Δ\Delta differences for a larger set of pairs (i,x)(i,x). In particular, we show that

In Theorem 2.1, we prove that if on each part (set TiT_{i}) of the partitioning, we run a β\beta-nice algorithm ALG\mathsf{ALG}, the union of output sets of ALG\mathsf{ALG} will contain a set of size at most kk that preserves at least 12+β\frac{1}{2+\beta} fraction of value of optimum set. If we run algorithm Greedy\mathsf{Greedy} on the union of output sets i=1mSi\cup_{i=1}^{m}S_{i}, using the classic analysis of algorithm Greedy\mathsf{Greedy}, we can easily claim that the overall value of the output set at the end is at least 11/e2+β\frac{1-1/e}{2+\beta} fraction of f(\textscOPT)f(\textsc{OPT}{}). In Theorem 2.5, we show an improved distributed approximation factor.

Let SS be the output of algorithm Greedy\mathsf{Greedy} over i=1mSi\cup_{i=1}^{m}S_{i}, i.e. S=Greedy(i=1mSi)S=\mathsf{Greedy}(\cup_{i=1}^{m}S_{i}) for a monotone submodular function ff. Also let SS be the output of non-monotone submodular maximization algorithm of Buchbinder et al. on i=1mSi\cup_{i=1}^{m}S_{i} when ff is a non-monotone submodular function. The expected value of max{f(S),maxi=1m{f(Si)}}\max\{f(S),\max_{i=1}^{m}\{f(S_{i})\}\} is at least (11/e)f(\textscOPT)1+(11/e)(1+β)\frac{(1-1/e)f(\textsc{OPT}{})}{1+(1-1/e)(1+\beta)} for monotone and (11m)f(\textscOPT)/e1+(1+β)/e\frac{(1-\frac{1}{m})f(\textsc{OPT}{})/e}{1+(1+\beta)/e} for non-monotone submodular ff. In particular, for β=1\beta=1, the distributed approximation factors are 0.27\geq 0.27, and 0.2114m\geq 0.21-\frac{1}{4m} for monotone and non-monotone ff respectively.

By applying lemmas 2.2, 2.3, and 2.4, we have that: f(\textscOPTS)f(\textscOPT)βmaxi=1mf(Si)i=1mf(Si)mf(\textscOPT)(1+β)maxi=1mf(Si)f(\textsc{OPT}{}^{S})\geq f(\textsc{OPT}{})-\beta\max_{i=1}^{m}f(S_{i})-\frac{\sum_{i=1}^{m}f(S_{i})}{m}\geq f(\textsc{OPT}{})-(1+\beta)\max_{i=1}^{m}f(S_{i}) for a monotone submodular function ff. Using the classic analysis of Greedy\mathsf{Greedy} on submodular maximization in , one can prove that f(S)(11/e)f(\textscOPTS)f(S)\geq(1-1/e)f(\textsc{OPT}{}^{S}) when ff is monotone. By taking the expectation of the two sides of this inequality, and using Lemma 2.4, we have that:

If ff is non-monotone, we get a weaker inequality f(S)f(\textscOPTS)ef(S)\geq\frac{f(\textsc{OPT}{}^{S})}{e} by applying the algorithm of Buchbinder et al. . Using lemmas 2.2, 2.3, and 2.4, we have that: f(\textscOPTS)f(\textscOPT)βmaxi=1mf(Si)i=1mf(Si)mf(\textscOPT)m(11m)f(\textscOPT)(1+β)maxi=1mf(Si)f(\textsc{OPT}{}^{S})\geq f(\textsc{OPT}{})-\beta\max_{i=1}^{m}f(S_{i})-\frac{\sum_{i=1}^{m}f(S_{i})}{m}-\frac{f(\textsc{OPT}{})}{m}\geq(1-\frac{1}{m})f(\textsc{OPT}{})-(1+\beta)\max_{i=1}^{m}f(S_{i}). Similarly, we can claim that ρ(11m)/e1+(1+β)/e\rho\geq\frac{(1-\frac{1}{m})/e}{1+(1+\beta)/e}. This implies the desired lower bound of (11m)/e1+(1+β)/e\frac{(1-\frac{1}{m})/e}{1+(1+\beta)/e} on ρ\rho. ∎

2 Examples of β𝛽\beta-Nice Algorithms

In this section, we show that several existing algorithms for submodular maximization in the literature belong to the family of β\beta-nice algorithms.

Algorithm Greedy\mathsf{Greedy} with Consistent Tie-breaking: First, we observe that algorithm Greedy\mathsf{Greedy} is 11-nice if it has a consistent tie breaking rule: while selecting among the items with the same marginal value, Greedy\mathsf{Greedy} can have a fixed strict total ordering (Π\Pi) of the items, and among the set of items with the maximum marginal value chooses the one highest rank in Π\Pi. The consistency of the tie breaking rule implies the first property of nice algorithms. To see the second property, first observe that (i) Greedy\mathsf{Greedy} always adds an item with the maximum marginal ff value, and (ii) using submodularity of ff, the marginal ff values are decreasing as more items are added to the selected items. Therefore, after kk iterations, the marginal value of adding any other item, is less than each of the kk marginal ff values we achieved while adding the first kk items. This implies the 2nd property, and concludes that Greedy\mathsf{Greedy} with a consistent tie-breaking rule is a 11-nice algorithm.

An almost linear-time (1+ϵ)(1+\epsilon)-nice Algorithm: Badanidiyuru and Vondrak present an almost linear-time (11eϵ)(1-\frac{1}{e}-\epsilon)-approximation algorithm for monotone submodular maximization with a cardinality constraint. We observe that this algorithm is (1+2ϵ)(1+2\epsilon)-nice. The algorithm is a relaxed version of Greedy\mathsf{Greedy} where in each iteration, it adds an item with almost maximum marginal value (with at least 1ϵ1-\epsilon fraction of the maximum marginal). As a result, similar to the proof for Greedy\mathsf{Greedy}, one can show that this linear-time algorithm 11ϵ\frac{1}{1-\epsilon}-nice and consequently (1+2ϵ)(1+2\epsilon)-nice for ϵ0.5\epsilon\leq 0.5.

Hardness Results for Randomized Core-sets

In Section 2, we showed that a family of β\beta-nice algorithms are 12+β\frac{1}{2+\beta}-approximate randomized core sets (e.g., 13\frac{1}{3}-approximate for algorithm Greedy\mathsf{Greedy}). Here we show what kinds of randomized core-sets are not achievable. In particular, we prove, in Theorem 3.1 that if we restrict our attention to core-sets of size kk, algorithm Greedy\mathsf{Greedy} or any local search algorithm does not achieve an approximation factor better than 12\frac{1}{2} even if each item is sent to multiple machines (up to multiplicity C=o(m)C=o(\sqrt{m})). This leads to the following question: does increasing the output size of core-sets, kk^{\prime}, help with the approximation factor? In other words, can we get a better than 1/21/2 approximation factor if we allow the algorithm to select more than kk items on each machine? To answer this question, we first prove, in Theorem 3.3 that it is not possible to achieve a randomized composable core-set of size k=o(nCm)k^{\prime}=o(\frac{n}{Cm}) with approximation better than 11e1-\frac{1}{e} even when we allow for multiplicity C=o(mk)C=o(\sqrt{\frac{m}{k}}). We then show in Section 4 that although it is not possible to beat the 11e1-\frac{1}{e} barrier, we can slightly increase the output sizes, apply algorithm Greedy\mathsf{Greedy} to achieve an approximation factor 22>12\approx 2-\sqrt{2}>\frac{1}{2} with a constant multiplicity.

Following we show a limitation on core-sets of size kk. In particular, we introduce a family of instances for which algorithm Greedy\mathsf{Greedy} and any algorithm that returns a locally optimum solution of size at most kk do not achieve a better than 12+ϵ\frac{1}{2}+\epsilon-approximate core-set for any ϵ>0\epsilon>0. This lower-bound result applies to a coverage valuation (and therefore submodular) function ff and it holds even if we send each item to multiple machines.

For any ϵ>0\epsilon>0, assuming each item is sent to at least one random machine (to set TiT_{i} for a random 1in1\leq i\leq n), and at most Cϵm2C\leq\sqrt{\frac{\epsilon m}{2}} random machines, and the number of items an algorithm is allowed to return is at most kk, there exists a family of instances for which algorithm Greedy\mathsf{Greedy} and any other local search algorithm returns an at most (12+1k+ϵ)(\frac{1}{2}+\frac{1}{k}+\epsilon)-approximate composable core-set.

We say a machine is a good machine if for each 1ik1\leq i\leq k, it receives at least a set Bi,jB_{i,j} for some 1jL1\leq j\leq L, and we call it is a bad machine otherwise. At first, we show that the output of algorithm Greedy\mathsf{Greedy} or any local search algorithm on a good machine that has not received set A1A_{1} is exactly one set Bi,jB_{i,j} for each 1ik1\leq i\leq k, and nothing else. In other words, these algorithms do not return any of the sets A2,A3,,AkA_{2},A_{3},\cdots,A_{k} unless they have a set A1A_{1} as part of their input, or they are running on a bad machine. It is not hard to see that if A1A_{1} is not part of the input each set Bi,jB_{i,j} has marginal value kk if no other set Bi,jB_{i,j^{\prime}} for jjj^{\prime}\neq j has been selected before. On the other hand, the marginal value of each of the sets {Ai}i=2k\{A_{i}\}_{i=2}^{k} is k1k-1. So Greedy\mathsf{Greedy} or any local search algorithm does not select any of the sets {Ai}i=2k\{A_{i}\}_{i=2}^{k} unless some set Bi,jB_{i,j} has been selected for each 1ik1\leq i\leq k. The fact that the output sizes are limited to kk implies that sets {Ai}i=2k\{A_{i}\}_{i=2}^{k} are not selected.

Now it suffices to prove that most machines are good, and most sets in {Ai}i=2k\{A_{i}\}_{i=2}^{k} are not sent to a machine that has set A1A_{1} as well. To prove this, we show that each machine is good with probability at least 1ϵ2C1-\frac{\epsilon}{2C}. To see this, note that for each ii, there are LL identical sets Bi,jB_{i,j}, and the probability that a machine does not receive any of these LL copies is at most (11m)LeL/mϵ2Ck(1-\frac{1}{m})^{L}\leq e^{-L/m}\leq\frac{\epsilon}{2Ck}. So the probability that a machine is good is at least (1ϵ2Ck)k1ϵ2C(1-\frac{\epsilon}{2Ck})^{k}\geq 1-\frac{\epsilon}{2C}. Since each set is sent to at most CC machines, for each 1ik1\leq i\leq k, we know that set AiA_{i} is sent to only good machines with probability at least (1ϵ2C)C1ϵ2(1-\frac{\epsilon}{2C})^{C}\geq 1-\frac{\epsilon}{2}. We also know that the probability that for each 2ik2\leq i\leq k, the probability of set AiA_{i} sharing a machine with set A1A_{1} is at most C2mϵ2\frac{C^{2}}{m}\leq\frac{\epsilon}{2} since each set is sent to at most CC random machines. As a result, at most ϵ2+ϵ2=ϵ\frac{\epsilon}{2}+\frac{\epsilon}{2}=\epsilon fraction of sets {Ai}i=2k\{A_{i}\}_{i=2}^{k} are selected by at least one machine in expectation, and therefore in expectation the size of the union of all selected sets (not only the best kk of them) is at most k2+ϵ(k1)2k^{2}+\epsilon(k-1)^{2} which is less than 12+1k+ϵ\frac{1}{2}+\frac{1}{k}+\epsilon fraction of the value of the optimum kk sets ({Ai}i=1k\{A_{i}\}_{i=1}^{k}). ∎

Following, we prove that it is not possible to achieve a better than 11e+ϵ1-\frac{1}{e}+\epsilon approximation factor for submodular maximization subject to a cardinality constraint even if each item is sent to at most CC machines where CϵmC\leq\sqrt{\epsilon m}, and each machine is allowed to return k=ϵ(1ϵ)n8Cmk^{\prime}=\frac{\epsilon(1-\epsilon)n}{8Cm} items. We note that in this hardness result, the size of the output sets can be arbitrarily large in terms of kk, i.e. for instance kk^{\prime} could be Ω(2k)\Omega(2^{k}) for some values of n,m,Cn,m,C, and kk. This is an information theoretic hardness result that does not use any complexity theoretic assumption. In fact, the instance itself can be optimally solved on a single machine, but distributing the items among several machines makes it hard to preserve the optimum solution. Before presenting the hardness result, we state the following version of Chernoff bound (which we use in the proof) as given on page 267, Corollary A.1.10A.1.10 and Theorem A.1.13A.1.13 in :

Suppose X1,X2,,XnX_{1},X_{2},\cdots,X_{n} are 010-1 random variables such that Pr[Xi=1]=piPr[X_{i}=1]=p_{i}, and let μ=i=1npi\mu=\sum_{i=1}^{n}p_{i}, and X=i=1nXiX=\sum_{i=1}^{n}X_{i}. Then for any a>0a>0

For any ϵ>0\epsilon>0, k8ϵk\geq\frac{8}{\epsilon} and Cϵm4kC\leq\sqrt{\frac{\epsilon m}{4k}}, assuming each item is sent to at most CC machines randomly, and each machine can output at most k=ϵ(1ϵ)8C×nmk^{\prime}=\frac{\epsilon(1-\epsilon)}{8C}\times\frac{n}{m} items, there exists a family of instances for which no algorithm can guarantee a core-set of expected value more than (11e+ϵ)(1-\frac{1}{e}+\epsilon) fraction of the optimum solution.

So among the optimum solution items that are alone in their machines, at most ϵ4C\frac{\epsilon}{4C} fraction of them will be selected. We note that each item is sent to at most CC machines, therefore in total ϵk4\frac{\epsilon k}{4} optimum items will be selected in this category in expectation.

On the other hand, the total number of optimum items that share a machine with some other optimum item is at most (Ck)2mϵk4\frac{(Ck)^{2}}{m}\leq\frac{\epsilon k}{4} (an upper bound on the expected number of collisions). We conclude that in total at most ϵk2\frac{\epsilon k}{2} optimum items will be selected in expectation. Therefore the total value of the selected optimum items does not exceed ϵ2\frac{\epsilon}{2} fraction of the optimum solution.

Better Randomized Core-sets for Monotone Submodular Maximization

In this section, we prove that although it is not possible to beat the 11e1-\frac{1}{e} barrier, we can slightly increase the output sizes (to k=(2+1)kk^{\prime}=(\sqrt{2}+1)k), and apply algorithm Greedy\mathsf{Greedy} to achieve an approximation factor 22>12\approx 2-\sqrt{2}>\frac{1}{2} with a constant multiplicity. Furthermore, we show in Theorem 4.8 that our analysis is tight for algorithm Greedy\mathsf{Greedy} even if we increase the core-set sizes significantly. Finally, we present in Subsection 4.1 a post-processing algorithm PseudoGreedy\mathsf{PseudoGreedy} that achieves an overall distributed approximation factor better than 1/21/2. In particular, after the first phase, we show how to find a size kk subset of the union of selected items with expected value at least (0.545o(1))f(\textscOPT)(0.545-o(1))f(\textsc{OPT}{}). Since in this section, we are dealing with a monotone submodular function ff, we can assume WLOG that f()=0f(\emptyset)=0.

For any integer C1C\geq 1, any cardinality constraint k=o(m)k=o(m), algorithm Greedy\mathsf{Greedy} is a (22O(1k+ln(C)C))\left(2-\sqrt{2}-O\left(\frac{1}{k}+\frac{\ln(C)}{C}\right)\right)-approximate randomized composable core-set of multiplicity CC and size k=(22+1)kk^{\prime}=(2\sqrt{2}+1)k for any monotone submodular function ff. By letting C=1ϵC={1\over\epsilon}, this leads to a randomized composable core-set of approximation factor 0.58570.5857.

Let D=def22+1D\stackrel{{\scriptstyle\text{def}}}{{=}}2\sqrt{2}+1, and k=Dkk^{\prime}=Dk. Following our notation from Section 1.5, let Si=defGreedy(Ti)S_{i}\stackrel{{\scriptstyle\text{def}}}{{=}}\mathsf{Greedy}(T_{i}) for 1im1\leq i\leq m, where TiT_{i} is the set of items sent to the machine ii. Note that we can let Si=Dk|S_{i}|=Dk, since if there are less than DkDk items in TiT_{i}, WLOG we can assume the algorithm returns some extra dummy items just for the sake of analysis.

Consider an item x\textscOPTx\in\textsc{OPT}{}. We say that xx survives from machine ii, if, when we send xx to machine ii in addition to items of TiT_{i}, algorithm Greedy\mathsf{Greedy} would choose this item xx in its output of size kk^{\prime}, i.e., if xGreedy(Ti{x})}x\in\mathsf{Greedy}(T_{i}\cup\{x\})\}. For the sake of analysis, we partition the optimum solution into two sets as follows: let \textscOPT1\textsc{OPT}{}_{1} be the set of items in the optimum solution that would survive the first machine, i.e., \textscOPT1=def{xx\textscOPTGreedy(T1{x})}\textsc{OPT}{}_{1}\stackrel{{\scriptstyle\text{def}}}{{=}}\{x|x\in\textsc{OPT}{}\cap\mathsf{Greedy}(T_{1}\cup\{x\})\}. Let \textscOPT2=def\textscOPT\textscOPT1\textsc{OPT}{}_{2}\stackrel{{\scriptstyle\text{def}}}{{=}}\textsc{OPT}{}\setminus\textsc{OPT}{}_{1}, and k1=def\textscOPT1k_{1}\stackrel{{\scriptstyle\text{def}}}{{=}}|\textsc{OPT}{}_{1}|, and k2=def\textscOPT2k_{2}\stackrel{{\scriptstyle\text{def}}}{{=}}|\textsc{OPT}{}_{2}| (note that k1+k2=kk_{1}+k_{2}=k). We also define \textscOPT1=def\textscOPT1\textscOPTS\textsc{OPT}{}_{1}^{\prime}\stackrel{{\scriptstyle\text{def}}}{{=}}\textsc{OPT}{}_{1}\cap\textsc{OPT}{}^{S}, and \textscOPT=def\textscOPT1\textscOPT2\textsc{OPT}{}^{\prime}\stackrel{{\scriptstyle\text{def}}}{{=}}\textsc{OPT}{}_{1}^{\prime}\cup\textsc{OPT}{}_{2} where \textscOPTS\textsc{OPT}{}^{S} is defined in Subsection 1.5.

The optimum value f(\textscOPT)f(\textsc{OPT}{}) is equal to x\textscOPTΔ(x,πx)\sum_{x\in\textsc{OPT}{}}\Delta(x,\pi^{x}), and the term f(\textscOPT)f(\textscOPT)f(\textsc{OPT}{})-f(\textsc{OPT}{}^{\prime}) is at most x\textscOPT1\textscOPTSΔ(x,πx)\sum_{x\in\textsc{OPT}{}_{1}\setminus\textsc{OPT}{}^{S}}\Delta(x,\pi^{x}).

By definition of Δ\Delta values, we have that: x\textscOPTΔ(x,πx)=f(\textscOPT)f()=f(\textscOPT)\sum_{x\in\textsc{OPT}{}}\Delta(x,\pi^{x})=f(\textsc{OPT}{})-f(\emptyset)=f(\textsc{OPT}{}). Similarly, we have that f(\textscOPT)f(\textsc{OPT}{}^{\prime}) is equal to x\textscOPTΔ(x,πx\textscOPT)\sum_{x\in\textsc{OPT}{}^{\prime}}\Delta(x,\pi^{x}\cap\textsc{OPT}{}^{\prime}). By submodularity, we know that Δ(x,πx\textscOPT)\Delta(x,\pi^{x}\cap\textsc{OPT}{}^{\prime}) is at least Δ(x,πx)\Delta(x,\pi^{x}) because πx\textscOPT\pi^{x}\cap\textsc{OPT}{}^{\prime} is a subset of πx\pi^{x}. Therefore, we have:

where the last equality holds by definition of \textscOPT\textsc{OPT}{}^{\prime}. ∎

We note that Δ(x,πx)\Delta(x,\pi^{x}) is a fixed (non-random) term and therefore we can take it out of the expectation. Since f(\textscOPT)f(\textsc{OPT}{}) is equal to x\textscOPTΔ(x,πx)\sum_{x\in\textsc{OPT}{}}\Delta(x,\pi^{x}), we just need to prove that Pr[x\textscOPT1\textscOPTS]Pr[x\in\textsc{OPT}{}_{1}\setminus\textsc{OPT}{}^{S}] is at most O(ln(C)C)O\left(\frac{\ln(C)}{C}\right) for any item x\textscOPTx\in\textsc{OPT}{}.

We note that the first machine is just a random machine, and the distribution of set of items sent to it, T1T_{1}, is the same as any other set TiT_{i} for any 2im2\leq i\leq m. We consider two cases for an item x\textscOPTx\in\textsc{OPT}{}:

The probability of xx being chosen when added to a random machine is at most ln(C)C\frac{\ln(C)}{C}, i.e. Pr[xGreedy(T1{x})]=ln(C)CPr[x\in\mathsf{Greedy}(T_{1}\cup\{x\})]=\frac{\ln(C)}{C}.

The probability Pr[xGreedy(T1{x})]Pr[x\in\mathsf{Greedy}(T_{1}\cup\{x\})] is at least ln(C)C\frac{\ln(C)}{C}.

In the first case, we know that Pr[x\textscOPT1]ln(C)CPr[x\in\textsc{OPT}{}_{1}]\leq\frac{\ln(C)}{C}, and therefore Pr[x\textscOPT1\textscOPTS]ln(C)CPr[x\in\textsc{OPT}{}_{1}\setminus\textsc{OPT}{}^{S}]\leq\frac{\ln(C)}{C} which concludes the proof.

Using Lemma 4.2, it is sufficient prove that ratio fk(S1\textscOPT1)f(\textscOPT)\frac{f_{k}(S_{1}\cup\textsc{OPT}{}_{1}^{\prime})}{f(\textsc{OPT}{}^{\prime})} is at least 22O(1k)2-\sqrt{2}-O\left(\frac{1}{k}\right). In order to lower bound the ratio fk(S1\textscOPT1)f(\textscOPT)\frac{f_{k}(S_{1}\cup\textsc{OPT}{}_{1}^{\prime})}{f(\textsc{OPT}{}^{\prime})}, we write the following factor-revealing linear program LPk,k2LP^{k,k_{2}}, and prove in Lemma 4.4 that the solution to this LP is a lower bound on the aforementioned ratio.

For any integer k>0k>0, the ratio fk(S1\textscOPT1)f(\textscOPT)\frac{f_{k}(S_{1}\cup\textsc{OPT}{}_{1}^{\prime})}{f(\textsc{OPT}{}^{\prime})} is lower bounded by the optimum solution of linear program LPk,k2LP^{k,k_{2}} for some integer 1k2k1\leq k_{2}\leq k.

We want to prove that ratio fk(S1\textscOPT1)f(\textscOPT)\frac{f_{k}(S_{1}\cup\textsc{OPT}{}_{1}^{\prime})}{f(\textsc{OPT}{}^{\prime})} is lower bounded by the solution of minimization linear program LPk,k2LP^{k,k_{2}}. It suffices to construct one feasible solution with objective value β\beta equal to fk(S1\textscOPT1)f(\textscOPT)\frac{f_{k}(S_{1}\cup\textsc{OPT}{}_{1}^{\prime})}{f(\textsc{OPT}{}^{\prime})}. At first, we construct this solution for every instance of the problem, and then prove its feasibility in Claim 4.5.

We remind that \textscOPT\textsc{OPT}{}^{\prime} is the union of two disjoint sets \textscOPT1\textsc{OPT}{}_{1}^{\prime}, and \textscOPT2\textsc{OPT}{}_{2}. Fix a permutation π\pi on the items of \textscOPT\textsc{OPT}{}^{\prime} such that every item of \textscOPT1\textsc{OPT}{}_{1}^{\prime} appears before every item of \textscOPT2\textsc{OPT}{}_{2} in π\pi. In other words, π\pi is an arbitrary permutation on items of \textscOPT1\textsc{OPT}{}_{1}^{\prime} followed by an arbitrary permutation on items of \textscOPT2\textsc{OPT}{}_{2}. For any item xx in \textscOPT\textsc{OPT}{}^{\prime}, define πx\pi^{x} to be the set of items in \textscOPT\textsc{OPT}{}^{\prime} that appear prior to xx in permutation π\pi. For any 1jDk1\leq j\leq Dk, we define set SjS^{j} to be the first jj items of S1S_{1}. We set the linear program variables as follows:

The above assignment forms a feasible solution of LPk,k2LP^{k,k_{2}}, and its solution is equal to fk(S1\textscOPT1)f(\textscOPT)\frac{f_{k}(S_{1}\cup\textsc{OPT}{}_{1}^{\prime})}{f(\textsc{OPT}{}^{\prime})}.

The claim on the objective value of the solution is evident by definition of β\beta. To prove that the constraints hold, we show some simple and useful facts about the marginal values of items in OPT. We note that x\textscOPTΔ(x,πx)=f(\textscOPT)\sum_{x\in\textsc{OPT}{}^{\prime}}\Delta(x,\pi^{x})=f(\textsc{OPT}{}^{\prime}) by definition of Δ\Delta values and the fact that f()=0f(\emptyset)=0. Similarly for any 1jDk1\leq j\leq Dk, we have that:

We are ready to prove that all constraints (1),(2),,(5)(1),(2),\cdots,(5) one by one. We start with constraint (1)(1). For any set J[DK]J\subset[DK] with size J=k2|J|=k_{2}, we define S(J)S(J) to be {yjjJ}\{y_{j}|j\in J\} where yjy_{j} is the jjth item selected by algorithm Greedy\mathsf{Greedy} in S1S_{1}. We also define S(J)S^{\prime}(J) to be \textscOPT1S(J)\textsc{OPT}{}_{1}^{\prime}\cup S(J). Set S(J)S^{\prime}(J) is a subset of \textscOPT1S1\textsc{OPT}{}_{1}^{\prime}\cup S_{1} with size at most k2+k1=kk_{2}+k_{1}=k. Therefore f(S(J))f(S^{\prime}(J)) is a lower bound on fk(\textscOPT1S1)f_{k}(\textsc{OPT}{}_{1}^{\prime}\cup S_{1}). We can also lower bound f(S(J))f(S^{\prime}(J)) as follows:

The first equality holds by definition of Δ\Delta. The first inequality holds by submodularity of ff, and knowing that Sj1S(J)Sj1S^{j-1}\cap S(J)\subseteq S^{j-1}. The second equality holds by definition of α\alpha, and the last equality holds by definition of cjc_{j}. We claim that \Big{(}f(\textsc{OPT}{}_{1}^{\prime}\cup S^{j})-f(S^{j})\Big{)}-\Big{(}f(\textsc{OPT}{}_{1}^{\prime}\cup S^{j-1})-f(S^{j-1})\Big{)} (which is part of the right hand side of the last equality) is equal to bjf(\textscOPT)-b_{j}f(\textsc{OPT}{}^{\prime}). We note that \Big{(}f(\textsc{OPT}{}_{1}^{\prime}\cup S^{j})-f(S^{j})\Big{)} is equal to x\textscOPT1Δ(x,πxSj)\sum_{x\in\textsc{OPT}{}_{1}^{\prime}}\Delta(x,\pi^{x}\cup S^{j}), and similarly \Big{(}f(\textsc{OPT}{}_{1}^{\prime}\cup S^{j-1})-f(S^{j-1})\Big{)} is equal to x\textscOPT1Δ(x,πxSj1)\sum_{x\in\textsc{OPT}{}_{1}^{\prime}}\Delta(x,\pi^{x}\cup S^{j-1}). By taking the difference of them, we have:

which is (by definition of bjb_{j}) equal to bjf(\textscOPT)-b_{j}f(\textsc{OPT}{}^{\prime}). We conclude that f(S)f(S^{\prime}), and consequently fk(S1\textscOPT1)f_{k}(S_{1}\cup\textsc{OPT}{}_{1}^{\prime}) are both at least (1α+jJaj+cj)f(\textscOPT)(1-\alpha+\sum_{j\in J}a_{j}+c_{j})f(\textsc{OPT}{}^{\prime}) which concludes the proof of constraint (1)(1).

We prove constraint (2)(2) using the fact that algorithm Greedy\mathsf{Greedy} selects the item with maximum marginal value in each step. We note that the right hand side of constraint (2)(2) is aj+bj+cja_{j}+b_{j}+c_{j} which is by definition the marginal gain of item yjy_{j} divided by f(\textscOPT)f(\textsc{OPT}{}^{\prime}), i.e. Δ(yj,Sj1)f(\textscOPT)\frac{\Delta(y_{j},S^{j-1})}{f(\textsc{OPT}{}^{\prime})}. We know that any item x\textscOPT2x\in\textsc{OPT}{}_{2} will not be selected by algorithm Greedy\mathsf{Greedy} if it is part of the input set which means that the marginal gain (aj+bj+cj)f(\textscOPT)=Δ(yj,Sj1)(a_{j}+b_{j}+c_{j})f(\textsc{OPT}{}^{\prime})=\Delta(y_{j},S^{j-1}) is at least the marginal gain Δ(x,Sj1)\Delta(x,S^{j-1}) for any x\textscOPT2x\in\textsc{OPT}{}_{2}, and it is also greater than the average of these marginal gains. In other words, we have:

To finish the proof of constraint (2)(2), it suffices to prove the following inequality:

By definition of α\alpha, and aa values, we have that:

which completes the proof of Equation 2, and consequently constraint (2)(2).

Now, we prove that constraint (3)(3) holds. By definition of α\alpha, and the fact that f(\textscOPT)=f(\textsc{OPT}{}^{\prime})= x\textscOPT\sum_{x\in\textsc{OPT}{}^{\prime}} Δ(x,πx)\Delta(x,\pi^{x}), we know that 1α1-\alpha is equal to x\textscOPT1Δ(x,πx)f(\textscOPT)\frac{\sum_{x\in\textsc{OPT}{}_{1}^{\prime}}\Delta(x,\pi^{x})}{f(\textsc{OPT}{}^{\prime})}. We also know that:

where the inequality holds because valuation function ff is monotone, and therefore all Δ\Delta values are non-negative. This proves that constraint (3)(3) holds.

To prove constraint (4)(4), we should show that variables aj,bja_{j},b_{j}, cjc_{j}, and α\alpha are all in range $foranyfor any1\leq j\leq DK.Weknowthat. We know thatf(\textsc{OPT}{}^{\prime})isequaltois equal to\sum_{x\in\textsc{OPT}{}^{\prime}}\Delta(x,\pi^{x}).Thereforebydefinition,. Therefore by definition,\alphaisequaltois equal to\frac{\sum_{x\in\textsc{OPT}{}_{2}}\Delta(x,\pi^{x})}{\sum_{x\in\textsc{OPT}{}^{\prime}}\Delta(x,\pi^{x})}.Since. Since\textsc{OPT}{}_{2}isasubsetofis a subset of\textsc{OPT}{}^{\prime},weimplythat, we imply that\alphaisatmostis at most1.Wealsoknowthat. We also know that\Deltavaluesareallnonnegative,andthereforevalues are all non-negative, and therefore\alphaisnonnegative.Now,weshouldprovethatvariablesis non-negative. Now, we should prove that variablesa_{j},b_{j},and, andc_{j}areallinrangeare all in range.Bydefinition,. By definition,a_{j}+b_{j}+c_{j}isequaltois equal to\frac{\Delta(y_{j},S^{j-1})}{f(\textsc{OPT}{}^{\prime})}\leq\frac{f(\{y_{j}\})-f(\emptyset)}{f(\textsc{OPT}{}^{\prime})}=\frac{f(\{y_{j}\})}{f(\textsc{OPT}{}^{\prime})}wheretheinequalityandequalityareimpliedbythesubmodularityofwhere the inequality and equality are implied by the submodularity off,andthefact, and the factf(\emptyset)=0respectively.Ifrespectively. Iff(\{y_{j}\})isatleastis at leastf(\textsc{OPT}{}^{\prime}),theproofofLemma4.4canbecompletedasfollows.Weknowthat, the proof of Lemma 4.4 can be completed as follows. We know thatf_{k}(S_{1}\cup\textsc{OPT}{}_{1}^{\prime})\geq f(\{y_{j}\}),andthereforetheratio, and therefore the ratio\frac{f_{k}(S_{1}\cup\textsc{OPT}{}_{1}^{\prime})}{f(\textsc{OPT}{}^{\prime})}isatleastis at least1.Ontheotherhand,thereexistsaverysimplesolutionfor. On the other hand, there exists a very simple solution forLP^{k_{2},k}withobjectivevaluewith objective value\beta=1byjustsettingallvariablestozero,andby just setting all variables to zero, and\betaequaltoonewhichcompletestheproofinthecaseequal to one which completes the proof in the casef(\{y_{j}\})\geq f(\textsc{OPT}{}^{\prime}).Sowecanfocusonthecase,. So we can focus on the case,f(\{y_{j}\})\leq f(\textsc{OPT}{}^{\prime})inwhichwehavein which we havea_{j}+b_{j}+c_{j}\leq\frac{f(\{y_{j}\})}{f(\textsc{OPT}{}^{\prime})}\leq 1.Soitsufficestoprovethatthesethreevariablesarenonnegativetoprovethatconstraint. So it suffices to prove that these three variables are non-negative to prove that constraint(4)holds.Variablesholds. Variablesa_{j}andandb_{j}arenonnegativebecauseare non-negative becausefissubmodular,andis submodular, andS^{j-1}isasubsetofis a subset ofS^{j}.WeuseEquation1toprovenonnegativityof. We use Equation 1 to prove non-negativity ofc_{j}.Bydefinition,. By definition,a_{j}+b_{j}isequaltois equal to\frac{\sum_{x\in\textsc{OPT}{}^{\prime}}\Delta(x,\pi^{x}\cup S^{j-1})-\Delta(x,\pi^{x}\cup S^{j})}{f(\textsc{OPT}{}^{\prime})}$. By applying Equation 1, we have:

where the last inequality holds because of monotonicity of ff.

We prove constraint (5)(5) as follows. At first, we show that the right hand side of constraint (5)(5) is simply equal to f(Sk)f(\textscOPT)\frac{f(S^{k})}{f(\textsc{OPT}{}^{\prime})}. We know that aj+bj+cj=f(Sj)f(Sj1)f(\textscOPT)a_{j}+b_{j}+c_{j}=\frac{f(S^{j})-f(S^{j-1})}{f(\textsc{OPT}{}^{\prime})} for each 1jk1\leq j\leq k. By a telescopic summation, we have that the right hand side of constraint (5)(5), j=1kaj+bj+cj\sum_{j=1}^{k}a_{j}+b_{j}+c_{j}, is equal to f(Sk)f()f(\textscOPT)=f(Sk)f(\textscOPT)\frac{f(S^{k})-f(\emptyset)}{f(\textsc{OPT}{}^{\prime})}=\frac{f(S^{k})}{f(\textsc{OPT}{}^{\prime})}. By definition of β\beta, and the fact that fk(S1\textscOPT1)f_{k}(S_{1}\cup\textsc{OPT}{}_{1}^{\prime}) is at least f(Sk)f(S^{k}), we conclude that constraint (5)(5) holds.

To prove constraint (6)(6), we note that by definition, aj+bj+cja_{j}+b_{j}+c_{j} is Δ(yj,Sj1)\Delta(y_{j},S^{j-1}). Since algorithm Greedy\mathsf{Greedy} chooses the item with maximum marginal value at each step, we have Δ(yj,Sj1)Δ(yj+1,Sj1)\Delta(y_{j},S^{j-1})\geq\Delta(y_{j+1},S^{j-1}). By submodularity, we have Δ(yj+1,Sj1)Δ(yj+1,Sj)=aj+1+bj+1+cj+1\Delta(y_{j+1},S^{j-1})\geq\Delta(y_{j+1},S^{j})=a_{j+1}+b_{j+1}+c_{j+1}. We conclude that constraint (6)(6) holds. Therefore the proofs of Claim 4.5, and Lemma 4.4 are also complete. ∎

Finally, we show that the solution of LPk,k2LP^{k,k_{2}} is at least 22O(1k)2-\sqrt{2}-O(\frac{1}{k}) for any possible value of k2k_{2} which concludes the proof of Theorem 4.1.

The optimum solution of linear program LPk,k2LP^{k,k_{2}} is at least 22O(1k)2-\sqrt{2}-O(\frac{1}{{k}}) for any 0k2k0\leq k_{2}\leq k.

We consider two cases: a) k2k10k_{2}\leq\frac{k}{10}, and b) k2>k10k_{2}>\frac{k}{10}. We first consider the former case which is easier to prove, and then focus on the latter case. If k2k_{2} is at most k10\frac{k}{10}, we prove that the objective function of linear program LPk,k2LP^{k,k_{2}} (which is β\beta) cannot be less than 0.60.6 which concludes the proof of this lemma. Since all variables are non-negative, we can apply constraint (2)(2) for each jj in range [1,k][1,k], and imply that j=1kaj+bj+cj\sum_{j=1}^{k}a_{j}+b_{j}+c_{j} is at least (1(11k2)k)α\left(1-\left(1-\frac{1}{k_{2}}\right)^{k}\right)\alpha. Using constraint (5)(5), we know that j=1kaj+bj+cj\sum_{j=1}^{k}a_{j}+b_{j}+c_{j} is a lower bound for β\beta. We are also considering the case k2k10k_{2}\leq\frac{k}{10}, therefore β\beta is at least (1(11k2)k)α(1e10)α0.9999α\left(1-\left(1-\frac{1}{k_{2}}\right)^{k}\right)\alpha\geq\left(1-e^{-10}\right)\alpha\geq 0.9999\alpha. If α\alpha is at least 0.5860.586, the claim of Lemma 4.6 is proved. So we assume α\alpha is at most 0.5860.586. We define three sets of indices: J1=def{1,2,,k2}J_{1}\stackrel{{\scriptstyle\text{def}}}{{=}}\{1,2,\cdots,k_{2}\}, J2=def{k2+1,k2+2,,2k2}J_{2}\stackrel{{\scriptstyle\text{def}}}{{=}}\{k_{2}+1,k_{2}+2,\cdots,2k_{2}\}, and J3=def{2k2+1,2k2+2,,3k2}J_{3}\stackrel{{\scriptstyle\text{def}}}{{=}}\{2k_{2}+1,2k_{2}+2,\cdots,3k_{2}\}. We note that these sets have size k2k_{2}, and therefore constraint (1)(1) should hold for them. If there exists some set JDkJ\subset Dk with size k2k_{2} such that jJaj+cj\sum_{j\in J}a_{j}+c_{j} is at least 0.3α0.3\alpha, we can use constraint (1)(1) to lower bound β\beta with 1α+0.3α=10.7α>221-\alpha+0.3\alpha=1-0.7\alpha>2-\sqrt{2}. Therefore we can assume that jJiaj+cj\sum_{j\in J_{i}}a_{j}+c_{j} is at most 0.3α0.3\alpha for i{1,2,3}i\in\{1,2,3\}. Using constraint (2)(2), we imply that:

We also know that J1J2J3{1,2,,k}J_{1}\cup J_{2}\cup J_{3}\subset\{1,2,\cdots,k\}. We apply constraint (5)(5) to imply that β\beta is at least (0.7+0.4+0.1)α=1.2α(0.7+0.4+0.1)\alpha=1.2\alpha. This yields a stronger upper bound on α\alpha. If α\alpha is at least 221.2<0.49\frac{2-\sqrt{2}}{1.2}<0.49, the claim of Lemma 4.6 is proved. So we assume α0.49\alpha\leq 0.49. We follow the above argument one more time, and the proof is complete. If for some i{1,2,3}i\in\{1,2,3\}, the sum jJiaj+cj\sum_{j\in J_{i}}a_{j}+c_{j} is at least 0.16α0.16\alpha, using constraint (1)(1), we can lower bound β\beta with 10.84α>221-0.84\alpha>2-\sqrt{2}. Therefore we have jJiaj+cj0.16α\sum_{j\in J_{i}}a_{j}+c_{j}\leq 0.16\alpha for each i{1,2,3}i\in\{1,2,3\} which yields the following stronger inequalities:

By applying constraint (5)(5), we have β(0.84+0.68+0.52)α>2α\beta\geq(0.84+0.68+0.52)\alpha>2\alpha. We can also use constraint (1)(1), and conclude that βmax{2α,1α}>22\beta\geq\max\{2\alpha,1-\alpha\}>2-\sqrt{2} which completes the proof for the former case k2k10k_{2}\leq\frac{k}{10}.

In the rest of the proof, we consider the latter case k2>k10k_{2}>\frac{k}{10}. The structure of the proof is as follows: In Claim 4.7, we first show that without loss of generality, one can assume a special structure in an optimum solution of LPk,k2LP^{k,k_{2}}, and then exploit this structure to show that any solution of LPk,k2LP^{k,k_{2}} is lower bounded by a simple system of two equations with some O(1k2)=O(1k)O(\frac{1}{k_{2}})=O(\frac{1}{k}) error. We can explicitly analyze this system of equations, and achieve a lower bound of 222-\sqrt{2} on β\beta (the key variable of the system of equations, and also the objective function of linear program LPk,k2LP^{k,k_{2}}).

There exists an optimum solution for linear program LPk,k2LP^{k,k_{2}} with the following three properties:

constraint (2)(2) is tight for all 1jDk1\leq j\leq Dk

It suffices to show that every optimum solution of LPk,k2LP^{k,k_{2}} without changing the objective β\beta can be transformed to a feasible solution with the above three properties. Consider an optimum solution (β,α,{aj,bj,cj}j=1Dk)\left(\beta^{*},\alpha^{*},\{a^{*}_{j},b^{*}_{j},c^{*}_{j}\}_{j=1}^{Dk}\right). We start by showing how cjc^{*}_{j}s can be set to zero. Suppose cj>0c^{*}_{j}>0 for some 1jDk1\leq j\leq Dk. We can increase the value of aja^{*}_{j} by cjc^{*}_{j}, and then set cjc^{*}_{j} to zero. This update keeps aj+cja^{*}_{j}+c^{*}_{j} intact, and therefore does not change anything in constraints (1)(1), (3)(3), (5)(5), and (6)(6). It also makes it easier to satisfy constraint (2)(2) since it (possibly) reduces the right hand side, and keeps the left hand side intact. Constraint (4)(4) remains satisfied since aj+cj1a^{*}_{j}+c^{*}_{j}\leq 1 (otherwise β\beta is also at least 11 which proves the claim of Lemma 4.6 directly). Therefore we can assume cc variables are equal to zero, and exclude them to have a simpler linear program:

Now we prove how to make aa^{*} variables monotone decreasing. Suppose for some j1<j2j_{1}<j_{2}, we have aj1=aj22δa^{*}_{j_{1}}=a^{*}_{j_{2}}-2\delta for some positive δ\delta. We set both variables aj1a^{*}_{j_{1}} and aj2a^{*}_{j_{2}} to their average, i.e. increase aj1a^{*}_{j_{1}} by δ\delta, and decrease aj2a^{*}_{j_{2}} by δ\delta.

We also decrease bj1b^{*}_{j_{1}} and increase bj2b^{*}_{j_{2}} by δ\delta:

Now, we show that all constraints holds one by one. We note that it suffices to consider constraint (1)(1) only for set JJ with maximum jJaj\sum_{j\in J}a^{*}_{j}. We can assume that this set JJ with maximum jJaj\sum_{j\in J}a^{*}_{j} cannot contain j1j_{1} without having j2j_{2} (either before of after the update) because aj1a^{*}_{j_{1}} is at most aj2a^{*}_{j_{2}} in both cases. Therefore, the right hand side of constraint (1)(1) is intact, and it still holds.

Constraints (2)(2), (5)(5), and (6)(6) are all intact because aj+bja^{*}_{j}+b^{*}_{j} is invariant in this operation for any 1jDk1\leq j\leq Dk.

Constraint (3)(3) holds because sum of bb^{*} variables remain the same. To prove constraint (4)(4) holds, it suffices to show that all variables stay in range $.Itisevidentfor. It is evident fora^{*}valuessincewearesettingthemtotheiraverage.Forvalues since we are setting them to their average. Forb^{*}values,wefirstprovethatvalues, we first prove thatb^{*}_{j_{1}}staysnonnegative.Wenotethatstays non-negative. We note thata^{*}_{j_{1}}+b^{*}_{j_{1}}\geq a^{*}_{j_{2}}+b^{*}_{j_{2}}usingconstraintusing constraint(6).Wealsohave. We also havea^{*}_{j_{1}}=a^{*}_{j_{2}},and, andb^{*}_{j_{2}}\geq 0.Therefore,. Therefore,b^{*}_{j_{1}}cannotbenegative.Toprovethatcannot be negative. To prove thatb^{*}_{j_{2}}is(still)atmostis (still) at most1,itsufficestonotethatthesumofall, it suffices to note that the sum of allb^{*}valuesis(still)atmostvalues is (still) at most1-\alpha\leq 1,and, andb^{*}$ variables are all non-negative. Therefore all constraints are still valid after this operation.

We prove that after a finite number of times (at most (DK2){DK\choose 2}) of applying this operation, we reach a feasible solution with monotone non-increasing sequence of aa^{*} values. If we start with j1=1j_{1}=1, and do this operation for any pair (j1,j2)(j_{1},j_{2}) with aj1<aj2a^{*}_{j_{1}}<a^{*}_{j_{2}}, after at most Dk1Dk-1 steps, we reach a solution in which a1aja^{*}_{1}\geq a^{*}_{j} for any 1<jDk1<j\leq Dk. We continue the same process by increasing j1j_{1} one by one, and after at most (DK2){DK\choose 2} updates, we reach a sorted sequence of aa^{*} values. By monotonicity of aa^{*} values, we can simplify the linear program even further:

Now we prove that we can assume that constraint (2)(2) is tight for all 1jDk1\leq j\leq Dk. At first, we prove by contradiction that the right hand side of constraint (2)(2) is non-negative. Let jj be the minimum index for which the right hand side of constraint (2)(2) is negative. We can set all aa^{*} and bb^{*} values to zero for any index greater than jj, and also reduce aj1a^{*}_{j-1} by some amount to make this right hand side zero. All constraints hold, and we will have a solution in which the right hand side of constraint (2)(2) is always non-negative. Now we make constraint (2)(2) always tight as follows. Let j1j_{1} be the maximum index in range [1,Dk][1,Dk], for which constraint (2)(2) is loose by some δ>0\delta>0. We update the variables as follows. If bj1b^{*}_{j_{1}} is positive, we reduce it to max{bj1δ,0}\max\{b^{*}_{j_{1}}-\delta,0\}, and do not change any other variable. We note that in this case, all constraints still hold, and constraint (2)(2) for all indices j1,j1+1,,Dkj_{1},j_{1}+1,\cdots,Dk is tight.

If bj1b^{*}_{j_{1}} is zero, we decrease aj1a^{*}_{j_{1}} by δ\delta, and for any j2>j1j_{2}>j_{1}, we increase aj2a^{*}_{j_{2}} by δk2(11k2)j2j11\frac{\delta}{k_{2}}(1-\frac{1}{k_{2}})^{j_{2}-j_{1}-1}. We prove that constraint (2)(2) for all indices j1,j1+1,,Dkj_{1},j_{1}+1,\cdots,Dk is tight, and all other constraints still hold after this update. By definition of δ\delta, constraint (2)(2) is tight for index j1j_{1} after this update. Constraint (2)(2) was tight before the update for j2>j1j_{2}>j_{1} because of the special choice of j1j_{1}. We prove that the right and left hand sides of constraint (2)(2) increased by the same amount for each j2>j1j_{2}>j_{1}. The left hand side increased by δk2(11k2)j2j11\frac{\delta}{k_{2}}(1-\frac{1}{k_{2}})^{j_{2}-j_{1}-1}. We also know that the right hand side changed by:

Therefore constraint (2)(2) is tight for all j2>j1j_{2}>j_{1} after the update. Now we prove feasibility of the new solution. For constraint (1)(1), we note that all increments of aa^{*} variables is less than δk2r=0(11k2)r=δ\frac{\delta}{k_{2}}\sum_{r=0}^{\infty}(1-\frac{1}{k_{2}})^{r}=\delta. We also note that aj1a^{*}_{j_{1}} is decreased by δ\delta. So the right hand side of constraint (1)(1) is decreased, and it remains feasible. We just showed that for j2j1j_{2}\geq j_{1}, constraint (2)(2) is tight and therefore valid, and it is intact for smaller indices. Constraint (3)(3) is also intact. To prove constraint (4)(4), we should show that the new aa^{*} values are in range $.Sinceconstraint. Since constraint(2)istight,andis tight, and\alphaisatmostis at most1,thesenew, these newa^{*}valuesareallatmostvalues are all at most1.Nonnegativityoftherighthandsideofconstraint. Non-negativity of the right hand side of constraint(2)impliesthatthesenewvaluesareallnonnegative.Constraintimplies that these new values are all non-negative. Constraint(5)holdssinceitsrighthandsideisonlydecreased.Toproveconstraintholds since its right hand side is only decreased. To prove constraint(6),wenotethattherighthandsidesofconstraint, we note that the right hand sides of constraint(2)isdecreasinginis decreasing inj,andtheyarealltightforindices, and they are all tight for indices\geq j_{1}.Thereforeconstraint. Therefore constraint(6)holdsforholds forj\geq j_{1}.For. Forj=j_{1}-1,itclearlyholdssincewearedecreasing, it clearly holds since we are decreasinga^{*}_{j_{1}},andconsequentlyitsrighthandside.Toproveconstraint, and consequently its right hand side. To prove constraint(7),wenotethattheincrementsin, we note that the increments ina^{*}valuesisdecreasingasvalues is decreasing asj_{2}>j_{1}increases.Soconstraintincreases. So constraint(7)remainsfeasibleforremains feasible forj>j_{1}.For. Forj=j_{1},wenotethat, we note thatb^{*}_{j_{1}}iszero.Sousingconstraintis zero. So using constraint(6),wecanproveconstraint, we can prove constraint(7)holdsforholds forj=j_{1}whichcompletesthefeasibilityproof.Eachtime,wemaketheseupdates,theindexwhich completes the feasibility proof. Each time, we make these updates, the indexj_{1}(themaximumindexforwhichconstraint(the maximum index for which constraint(2)isloose)reducesbyatleastis loose) reduces by at least(1).Thereforeafteratmost. Therefore after at mostDk$ operations, we have an optimum solution with all three properties of Claim 4.7. ∎

Using Claim 4.7, we can assume that the solution of LPk,k2LP^{k,k_{2}} is lower bounded by the next LPnew,k,k2LP^{new,k,k_{2}}. We focus on lower bounding the solution of LPnew,k,k2LP^{new,k,k_{2}} in the rest of the proof.

We note that we eliminated one of the lower bounds on β\beta. This only reduces the optimum solution of the linear program which is consistent with our approach. We also used Claim 4.7 to replace the inequality constraint (2)(2) with an equality constraint. We also removed the cc variables, and added the monotonicity constraint (6)(6). In the rest of the proof, we introduce some notation, and show some extra structure in the optimum solution of LPnew,k,k2LP^{new,k,k_{2}}. This will help us lower bound the optimum solution by analyzing a system of two equations explicitly.

We start with proving the extra structure. Let τ=defak2\tau\stackrel{{\scriptstyle\text{def}}}{{=}}a_{k_{2}}. We show that for any pair of indices 1j1<j2k21\leq j_{1}<j_{2}\leq k_{2}, either bj1b_{j_{1}} is zero or aj2a_{j_{2}} is equal to τ\tau. Let j1j_{1} be the minimum index with bj1>0b_{j_{1}}>0. If j1j_{1} is at least k2k_{2}, the claim holds clearly. So we consider j1<k2j_{1}<k_{2}. If aj1+1a_{j_{1}+1} is equal to τ\tau, by monotonicity of aa values, the claim is proved. So we define δ>0\delta>0 to be min{bj1,aj1+1τ}\min\{b_{j_{1}},a_{j_{1}+1}-\tau\}. We increase aj1a_{j_{1}}, and bj1+1b_{j_{1}+1} by δ\delta, and δ(11k2)\delta(1-\frac{1}{k_{2}}) respectively. We also decrease both of aj1+1a_{j_{1}+1}, and bj1b_{j_{1}} by δ\delta. After these changes, constraints (1)(1), (2)(2), (3)(3), (4)(4), and (5)(5) in LPnew,k,k2LP^{new,k,k_{2}} still hold. In particular, we made the changes in this special way to make sure that constraint (2)(2) still holds. Constraint (5)(5) also holds since the right hand side of constraint (2)(2) is decreasing in jj. But the monotonicity constraint (6)(6) may be violated for j=j1+1j=j_{1}+1. This happens if the new aj1+1a_{j_{1}+1} is less than aj1+2a_{j_{1}+2}. In this case, we swap the variables aj1+1a_{j_{1}+1} and aj1+2a_{j_{1}+2}. We also change bj1+1b_{j_{1}+1}, and bj1+2b_{j_{1}+2} in a way that constraint (2)(2) holds for both j=j1+1j=j_{1}+1, and j=j1+2j=j_{1}+2. Similarly, we have that all constraints (1),(2),,(5)(1),(2),\cdots,(5) hold. But constraint (6)(6) may be violated for j=j1+2j=j_{1}+2. We continue doing this swap operation until constraint (6)(6) holds as well. This will happen after at most k2k_{2} swap operations, since the new aa values are all at least τ\tau. Finally, we reach a feasible solution for LPnew,k,k2LP^{new,k,k_{2}} in which either one more aa variable is equal to τ\tau or one more bb variable is set to zero. Therefore after at most 2k22k_{2} updates, for any pair of indices 1j1<j2k21\leq j_{1}<j_{2}\leq k_{2}, we have that either bj1b_{j_{1}} is zero or aj2a_{j_{2}} is equal to τ\tau.

We also claim that for any j>k2j>k_{2}, we can assume either aj=τa_{j}=\tau, or bj=0b_{j}=0. Otherwise, suppose j1>k2j_{1}>k_{2} is the smallest index for which aj1<τa_{j_{1}}<\tau, and bj1>0b_{j_{1}}>0. We can increase aj1a_{j_{1}} by δ=min{τaj1,bj1}\delta=\min\{\tau-a_{j_{1}},b_{j_{1}}\}, and decrease bj1b_{j_{1}} by δ\delta. We note that since aj1+bj1a_{j_{1}}+b_{j_{1}} is invariant, the monotonicity constraints (5)(5) still holds. We need to prove constraint (6)(6) for j=j11j=j_{1}-1. It can be violated only if aj11a_{j_{1}-1} is less than τ\tau, and bj11b_{j_{1}-1} is non-negative which contradicts the choice of j1j_{1}. To prove feasibility, we only need to prove constraint (2)(2) for j>j1j>j_{1}. We make it hold by the following adjustments. We start by j=j1+1j=j_{1}+1, and increase it one by one. If constraint (2)(2) is loose by some ϵ\epsilon for index jj, we decrease bjb_{j} by min{bj,ϵ}\min\{b_{j},\epsilon\}. We also decrease aja_{j} by max{ϵbj,0}\max\{\epsilon-b_{j},0\}. The sum aj+bja_{j}+b_{j} is reduced by ϵ\epsilon, and constraint (2)(2) now holds for jj. It is also clear that sum of bb variables do not increase, and all other constraints still hold. With these adjustments for each j>j1j>j_{1} in the increasing order, we know the solution is feasible. We also have all the ingredients to characterize the optimum solution of LPnew,k,k2LP^{new,k,k_{2}}, and lower bound it.

We can formalize this optimum solution in terms of a few parameters including τ,β,k,\tau,\beta,k, and k2k_{2}. At this final stage of the proof, we conclude with two lower bounds (system of two equations) on β\beta and 1α1-\alpha in terms of these few parameters. Let tt be the smallest index in range 1tk21\leq t\leq k_{2} with bt0b_{t}\neq 0. If such an index does not exist, define tt to be k2k_{2}. Using Claim 4.7, we know that constraint (2)(2) is tight, and bj=0b_{j}=0 for any j<tj<t, we can inductively prove that aj=αk2(k21k2)j1a_{j}=\frac{\alpha}{k_{2}}(\frac{k_{2}-1}{k_{2}})^{j-1} for any j<tj<t. Consequently, we have that j=1t1aj\sum_{j=1}^{t-1}a_{j} is equal to α(1(k21k2)t1)α(1er)\alpha(1-(\frac{k_{2}-1}{k_{2}})^{t-1})\geq\alpha(1-e^{-r}) where rr is defined to be t1k2\frac{t-1}{k_{2}}. Therefore, constraint (1)(1) implies that: β1α+(1er)α+(1r)k2τ\beta\geq 1-\alpha+(1-e^{-r})\alpha+(1-r)k_{2}\tau. This lower bound on β\beta is the first inequality we wanted to prove. To achieve the second inequality (lower bound on 1α1-\alpha), we start by upper bounding the sum of aa variables.

We show that j=1tajα(1er+2k2)\sum_{j=1}^{t}a_{j}\leq\alpha\left(1-e^{-r}+\frac{2}{k_{2}}\right) as follows. Since aa variables are monotone and, constraint (2)(2) is tight for j=1j=1, we have ata1=αk2a_{t}\leq a_{1}=\frac{\alpha}{k_{2}}. We also have that j=1t1aj\sum_{j=1}^{t-1}a_{j} is equal to α(1(11k2)t1)\alpha\left(1-\left(1-\frac{1}{k_{2}}\right)^{t-1}\right). We can upper bound (11k2)k21\left(1-\frac{1}{k_{2}}\right)^{k_{2}-1} by e1e^{-1} as follows. We prove that (11k2)k21\left(1-\frac{1}{k_{2}}\right)^{k_{2}-1} is a monotone decreasing sequence for k2=1,2,k_{2}=1,2,\cdots.

We also know that limk2(11k2)k21=e1\lim_{k_{2}\to\infty}\left(1-\frac{1}{k_{2}}\right)^{k_{2}-1}=e^{-1}. Therefore each term (11k2)k21\left(1-\frac{1}{k_{2}}\right)^{k_{2}-1} is at least e1e^{-1}. Therefore, we have that

which yields the desired upper bound on j=1taj\sum_{j=1}^{t}a_{j}.

We conclude that if β\beta^{*} is the solution of linear program LPnew,k,k2LP^{new,k,k_{2}}, the following system of equations should have a solution with β=β\beta=\beta^{*}:

where λ\lambda is defined to be k2τk_{2}\tau. We note that α,λ,\alpha,\lambda, and rr are the variables of the above system of two equations, and they should be in range $.Wealsonotethat. We also note thatk_{2}isanothervariablewhichcanbeanypositiveinteger.Tosimplify,wesolvethefollowingsystemofequationstoeliminateis another variable which can be any positive integer. To simplify, we solve the following system of equations to eliminatek_{2}$:

It is easy to see that if system of equations 3 has a solution (β1,α1,λ1,r1,k2)(\beta_{1},\alpha_{1},\lambda_{1},r_{1},k_{2}), system of equations 4 has the following solution: (β=β1+2k2,α=α1,λ=λ1+2k2,r=r1)(\beta=\beta_{1}+\frac{2}{k_{2}},\alpha=\alpha_{1},\lambda=\lambda_{1}+\frac{2}{k_{2}},r=r_{1}). Therefore it suffices to lower bound β\beta in system of equations 4. Because the same lower bound plus the term 2k2=O(1k)\frac{2}{k_{2}}=O(\frac{1}{k}) holds for β\beta in system of equations 3.

By computing the partial derivatives, and considering boundary values, one can find the minimum β\beta for which the system of equations 4 has a valid solution. Its minimum occurs when rr is zero, and the second inequality is tight. Therefore we have α=λ(2λ)\alpha=\sqrt{\lambda(2-\lambda)}. We conclude that β\beta is the minimum of 1λ(2λ)+λ1-\sqrt{\lambda(2-\lambda)}+\lambda which is equal to 1(112)(1+12)+(112)=2212=220.58571-\sqrt{(1-\sqrt{\frac{1}{2}})(1+\sqrt{\frac{1}{2}})}+(1-\sqrt{\frac{1}{2}})=2-2\sqrt{\frac{1}{2}}=2-\sqrt{2}\approx 0.5857 and occurs at λ=1120.2928\lambda=1-\sqrt{\frac{1}{2}}\approx 0.2928.

So we have a slightly different set of two inequalities in this case to lower bound β\beta.

It is evident that both inequalities should be tight to minimize β\beta, and therefore α\alpha is equal to 1+λ1+Der\frac{1+\lambda}{1+D^{\prime}e^{-r}} where DD^{\prime} is D12\frac{D-1}{2}. So we can write β\beta as a function of just λ\lambda and rr:

To minimize β\beta, either λ\lambda should be at one of its boundary values {0,1}\{0,1\}, or the partial derivative βλ\frac{\partial\beta}{\partial\lambda} should be zero. For λ=1\lambda=1, β\beta cannot be less than 2rer11e>222-r-e^{-r}\geq 1-\frac{1}{e}>2-\sqrt{2}. For λ=0\lambda=0, we have 1αDαDα1-\alpha\geq D^{\prime}\alpha^{\prime}\geq D^{\prime}\alpha, so α\alpha is at most 11+D=21\frac{1}{1+D^{\prime}}=\sqrt{2}-1, and therefore β\beta is at least 1α221-\alpha\geq 2-\sqrt{2}. The only case to consider is when βλ=0\frac{\partial\beta}{\partial\lambda}=0 which means er1+Der\frac{e^{-r}}{1+D^{\prime}e^{-r}} should be equal to 1r1-r with a unique solution r=0.71±0.001r^{*}=0.71\pm 0.001. Therefore β\beta is equal to 1(1r)(1+λλ)=r>221-(1-r^{*})(1+\lambda-\lambda)=r^{*}>2-\sqrt{2}. We conclude that any feasible solution of linear program LPk,k2LP^{k,k_{2}} has β\beta at least 22O(1k)2-\sqrt{2}-O(\frac{1}{k}) which completes the proof. ∎

We show in the following Theorem that the (22)0.585(2-\sqrt{2})\approx 0.585 lower bound on the approximation ratio of the core-sets that Greedy\mathsf{Greedy} finds is tight even if we allow the core-sets to be significantly large.

For any ϵ>0\epsilon>0, and any core-set size kkk^{\prime}\geq k, there are instances of monotone submodular maximization problem with cardinality constraint kk for which Greedy\mathsf{Greedy} is at most a (22+O(ϵ))(2-\sqrt{2}+O(\epsilon))-approximate randomized composable core-set even if each item is sent to CϵmC\leq\sqrt{\epsilon m} machines.

We also note that for any 1ik21\leq i\leq k_{2}, with probability at least 1ϵ1-\epsilon, none of the at most CC copies of set RiR_{i} shares a machine with one copy of BB^{\prime}. There are at most CC copies of BB^{\prime}, and CC copies of RiR_{i}, and the probability that two sets are sent to the same machine is 1m\frac{1}{m}. So with probability at most C2mϵ\frac{C^{2}}{m}\leq\epsilon, one copy of RiR_{i} and one copy of BB^{\prime} are sent to the same machine.

where the last inequality holds for k=Ω(1ϵ)k=\Omega(\frac{1}{\epsilon}). On the other hand, the optimum solution consists of k2=k1k_{2}=k-1 row sets RiR_{i}s, and set BB^{\prime} with value at least:

We conclude that the expected value of maximum value size kk subset of selected sets is upper bounded by λ+1α+O(ϵ)=22+O(ϵ)\lambda+1-\alpha+O(\epsilon)=2-\sqrt{2}+O(\epsilon) times the optimum solution f(\textscOPT)f(\textsc{OPT}{}). ∎

We remind that in the first phase, each machine 1im1\leq i\leq m runs algorithm Greedy\mathsf{Greedy} on set TiT_{i} with k=Dkk^{\prime}=Dk (where DD is 22+12\sqrt{2}+1).By Theorem 4.1, there exists a size kk subset of selected items i=1mSi\cup_{i=1}^{m}S_{i} with expected value at least 0.585f(\textscOPT)0.585f(\textsc{OPT}{}), but we do not know how to find this set efficiently. If we apply algorithm Greedy\mathsf{Greedy} again on i=1mSi\cup_{i=1}^{m}S_{i} to select kk items in total, we achieve a distributed approximation factor of (11e)×0.5850.37(1-\frac{1}{e})\times 0.585\approx 0.37. In the following, we present a post-processing algorithm PseudoGreedy\mathsf{PseudoGreedy} that achieves an overall distributed approximation factor better than 1/21/2. In particular, after the first phase, we show how to find a size kk subset of the union of selected items i=1mSi\cup_{i=1}^{m}S_{i} with expected value at least (0.545o(1))f(\textscOPT)(0.545-o(1))f(\textsc{OPT}{}).

Algorithm PseudoGreedy\mathsf{PseudoGreedy} proceeds as follows: it first computes a family of candidate solutions of size k+O(1)k+O(1), and keeps the one candidate solution VV with the maximum value. It then lets SS to be a random size kk subset of VV, and returns SS as the solution. These candidate solutions, denoted by Sk2,IS_{k^{\prime}_{2},I} (for 1k2k1\leq k^{\prime}_{2}\leq k and 4I{1,,8}4I\subseteq\{1,\cdots,8\}) in Algorithm 1, are computed as follows: We first enumerate all kk possible values of 1k2k1\leq k^{\prime}_{2}\leq k (this notation is used to be consistent with the proof of Theorem 4.1). Then, by letting k1=kk2k^{\prime}_{1}=k-k^{\prime}_{2}, and k3=32k2128k_{3}=32\lceil\frac{k^{\prime}_{2}}{128}\rceil, we partition the first 8k38k_{3} items in set S1S_{1} We choose any machine instead of machine 1, and since the clustering is done at random, the analysis goes through. into eight subsets {Ai}i=18\{A_{i^{\prime}}\}_{i^{\prime}=1}^{8}, each of size k3k_{3}. The next step of the algorithm proceeds as follows: for any I{1,2,,8}I\subseteq\{1,2,\cdots,8\} with I4+k1k3\lvert I\rvert\leq 4+\frac{k^{\prime}_{1}}{k_{3}}, we initialize set Sk2,IS_{k^{\prime}_{2},I} with the union of all sets AiA_{i^{\prime}} where iIi^{\prime}\in I. Then for k1+(4I)k3k^{\prime}_{1}+(4-\lvert I\rvert)k_{3} iterations, we search all items in i=1mSi\cup_{i=1}^{m}S_{i}, and insert the item with the maximum marginal value to Sk2,IS_{k_{2},I}. Roughly speaking, by starting from all these subsets, we ensure that the selected set hits enough number of items in OPT. The upper bound we enforce on I\lvert I\rvert is to make sure that the number of iterations at this step is non-negative, i.e. k1+(4I)k30k^{\prime}_{1}+(4-\lvert I\rvert)k_{3}\geq 0. Finally we define VV to be the set Sk2,IS_{k^{\prime}_{2},I} with the maximum ff value, and return a random subset of size kk of VV as the output set SS.

1 Input: A collection of mm subsets {S1,,Sm}\{S_{1},\ldots,S_{m}\}. 2[0.6ex] Output: Set Si=1mSiS\subset\cup_{i=1}^{m}S_{i} with Sk|S|\leq k. 3[0.6ex] VV\leftarrow\emptyset; 4 forall the 1k2k1\leq k^{\prime}_{2}\leq k do 5 k332k2128k_{3}\leftarrow 32\lceil\frac{k^{\prime}_{2}}{128}\rceil; 6 k1kk2k^{\prime}_{1}\leftarrow k-k^{\prime}_{2}; 7 Partition the first 8k38k_{3} items of S1S_{1} into 88 sets {Ai}i=18\{A_{i^{\prime}}\}_{i^{\prime}=1}^{8} each of size k3k_{3}; 8 forall the I{1,2,,8}I\subseteq\{1,2,\cdots,8\} with I4+k1k3|I|\leq 4+\frac{k^{\prime}_{1}}{k_{3}} do 9 Sk2,IiIAiS_{k^{\prime}_{2},I}\leftarrow\cup_{i^{\prime}\in I}A_{i^{\prime}}; 10 for k1+(4I)k3k^{\prime}_{1}+(4-|I|)k_{3} times do 11 Find arg maxxi=1mSiΔ(x,Sk2,I)\operatorname*{arg\,max}_{x\in\cup_{i=1}^{m}S_{i}}\Delta(x,S_{k^{\prime}_{2},I}), and insert it to Sk2,IS_{k^{\prime}_{2},I}; 12 13 end for 14 if f(Sk2,I)>f(V)f(S_{k_{2},I})>f(V) then VSk2,IV\leftarrow S_{k_{2},I}; 15 16 end forall 17 18 end forall 19 SS\leftarrow a random size kk subset of VV; Algorithm 1 Algorithm PseudoGreedy\mathsf{PseudoGreedy}

Algorithm PseudoGreedy\mathsf{PseudoGreedy} returns a subset SS with size at most kk, and expected value at least (0.545O(1k+ln(C)C))f(\textscOPT)(0.545-O(\frac{1}{k}+\frac{\ln(C)}{C}))f(\textsc{OPT}{}).

Since algorithm PseudoGreedy\mathsf{PseudoGreedy} enumerates all kk possible values of k2k^{\prime}_{2}, in one of these trials, k2k^{\prime}_{2} is equal to k2=\textscOPT2k_{2}=|\textsc{OPT}{}_{2}|. From now on, we focus on this specific value of k2k^{\prime}_{2}. We remind that k3=32k2128k_{3}=32\lceil\frac{k^{\prime}_{2}}{128}\rceil. Let II^{\prime} be the set of indices that maximizes Sk2,IS_{k_{2},I}, i.e., I=defarg maxI{1,2,,8}&I4+k1k3f(Sk2,I)I^{\prime}\stackrel{{\scriptstyle\text{def}}}{{=}}\operatorname*{arg\,max}_{I\subseteq\{1,2,\cdots,8\}\&|I|\leq 4+\frac{k_{1}}{k_{3}}}f(S_{k_{2},I}). Define VV^{\prime} be the set Sk2,IS_{k_{2},I^{\prime}}. By definition, we have f(V)f(V)f(V^{\prime})\leq f(V). So it suffices to show that f(V)f(V^{\prime}) is at least f(\textscOPT)f(\textsc{OPT}{}^{\prime}) times the solution of LPrLP^{r} for some 0r10\leq r\leq 1 as follows. We define rr to be 4k34k3+k1\frac{4k_{3}}{4k_{3}+k_{1}}.

At first, we remind the feasible solution of LPk,k2LP^{k,k_{2}} constructed in the proof of Lemma 4.4, and then construct a feasible solution for LPrLP^{r} with β\beta equal to f(V)f(\textscOPT)\frac{f(V^{\prime})}{f(\textsc{OPT}{}^{\prime})} as follows. Fix a permutation π\pi on the items of \textscOPT\textsc{OPT}{}^{\prime} such that every item of \textscOPT1\textsc{OPT}{}_{1}^{\prime} appears before every item of \textscOPT2\textsc{OPT}{}_{2} in π\pi. In other words, π\pi is an arbitrary permutation on items of \textscOPT1\textsc{OPT}{}_{1}^{\prime} followed by an arbitrary permutation on items of \textscOPT2\textsc{OPT}{}_{2}. For any item xx in \textscOPT\textsc{OPT}{}^{\prime}, define πx\pi^{x} to be the set of items in \textscOPT\textsc{OPT}{}^{\prime} that appear prior to xx in permutation π\pi. We set α\alpha to be x\textscOPT2Δ(x,πx)f(\textscOPT)\frac{\sum_{x\in\textsc{OPT}{}_{2}}\Delta(x,\pi^{x})}{f(\textsc{OPT}{}^{\prime})}. For any 1j8k31\leq j\leq 8k_{3}, we define set SjS^{j} to be the first jj items of S1S_{1}. Now we are ready to define variables {aj,bj,cj}j=18k3\{a_{j},b_{j},c_{j}\}_{j=1}^{8k_{3}} as follows.

We let the variable cjc_{j} be the marginal gain of item yjy_{j} divided by f(\textscOPT)f(\textsc{OPT}{}^{\prime}) minus aj+bja_{j}+b_{j} where yjy_{j} is the jjth item selected in S1S_{1}, i.e., cj:=Δ(yj,Sj1)f(\textscOPT)ajbjc_{j}:=\frac{\Delta(y_{j},S^{j-1})}{f(\textsc{OPT}{}^{\prime})}-a_{j}-b_{j}.

We note that the last inequality holds assuming the numerator is non-negative. In case, the numerator is negative, the constraint (2)(2) holds using non-negativity of a,b,a^{\prime},b^{\prime}, and cc^{\prime} variables. So we just need to prove constraint (1)(1) (which is the most important constraint) of LPrLP^{r}. We remind that β\beta is defined to be f(V)f(\textscOPT)=maxI{1,2,,8}&I4+k1k3f(Sk2,I)f(\textscOPT)\frac{f(V^{\prime})}{f(\textsc{OPT}{}^{\prime})}=\max_{I\subseteq\{1,2,\cdots,8\}\&|I|\leq 4+\frac{k_{1}}{k_{3}}}\frac{f(S_{k_{2},I})}{f(\textsc{OPT}{}^{\prime})}. So for every I{1,2,,8}I\subseteq\{1,2,\cdots,8\} with size at most 4+k1k34+\frac{k_{1}}{k_{3}}, it suffices to prove that:

We note that Sk2,IS_{k_{2},I} consists of two sets of items:

Set S(I)S(I) with Ik3|I|k_{3} items corresponding to sets {Bi}iI\{B_{i^{\prime}}\}_{i^{\prime}\in I} which are added in the first phase. In other words, S(I)S(I) is iIBi=iIj=(i1)k3+1ik3{yj}\cup_{i^{\prime}\in I}B_{i^{\prime}}=\cup_{i^{\prime}\in I}\cup_{j=(i-1)k_{3}+1}^{ik_{3}}\{y_{j}\} where yjy_{j} is the jjth item of S1S_{1}.

k1+(4I)k3k_{1}+(4-|I|)k_{3} items added greedily in the second phase.

Similar to proof of Claim 4.5, we define S(I)S^{\prime}(I) to be \textscOPT1S(I)\textsc{OPT}{}_{1}^{\prime}\cup S(I). To be consistent in notation, we define JJ to be the indices of items in S(I)S(I), i.e. J=def{jyjS(I)}J\stackrel{{\scriptstyle\text{def}}}{{=}}\{j|y_{j}\in S(I)\}. Using the same argument in proof of Claim 4.5, we know that:

This means that there are \textscOPT1k1|\textsc{OPT}{}_{1}^{\prime}|\leq k_{1} items that can be added to S(I)S(I) in the second phase, and increase its value to f(S(I))f(S^{\prime}(I)). Since we are adding k1+(4I)k3k_{1}+(4-|I|)k_{3} items greedily in the second phase, our final value f(Sk2,I)f(S_{k_{2},I}) is at least:

where the equality holds by definition of rr. We can also lower bound f(S(I))f(S(I)) as follows.

where the inequality holds by submodularity of ff, and equalities hold by definition. By combining inequalities 6, 7, and 8, we conclude that:

Small-size Core-sets for Submodular Maximization

We start by presenting the hardness results for non-randomized core-sets.

For any 1kk1\leq k^{\prime}\leq k, the output of any algorithm is at most an O(kk)O(\frac{k^{\prime}}{k})-approximate composable non-randomized core-set for the submodular maximization problem.

Now we construct the following family of instances to achieve hardness results for randomized core-sets in the submodular maximization problem.

We define instance Ik,kI^{k,k^{\prime}} of the submodular maximization problem as follows. Define Γ\Gamma to be kk\lfloor\sqrt{k^{\prime}k}\rfloor. For each 1ikΓ1\leq i\leq k-\Gamma, we add an item that represents the set {i}\{i\}. For each kΓ<ikk-\Gamma<i\leq k, we add kk\lfloor\frac{k}{k^{\prime}}\rfloor identical items all representing the same set {i}\{i\}. The value of a subset SS of items, f(S)f(S), is equal to the cardinality of the union of all sets the items in SS represent. This is a coverage valuation function and subsequently monotone submodular.

For any 1kk1\leq k^{\prime}\leq k, with m=Θ(kk)m=\Theta(\frac{k}{k^{\prime}}) machines, the output of any algorithm is at most an O(kk)O(\sqrt{\frac{k^{\prime}}{k}})-approximate randomized composable core-set for the submodular maximization problem.

In Theorem 5.3, we proved that it is not possible to achieve better than O(kk)O(\sqrt{\frac{k^{\prime}}{k}})-approximate randomized composable core-sets for the submodular maximization problem. Here we show that this bound is tight. We prove this by applying a randomized algorithm (which is different from algorithm Greedy\mathsf{Greedy}):

For any 1kk1\leq k^{\prime}\leq k, and mkkm\geq\frac{k}{k^{\prime}} machines, the union of output sets by the above algorithm form an Ω(kk)\Omega(\sqrt{\frac{k^{\prime}}{k}})-approximate randomized core-set for the submodular maximization problem.

Similar to the proofs of Lemmas 2.3, and 2.4, we also have that:

Conclusion

The concept of composable core-sets has been introduced recently in the context of distributed and streaming algorithms and have been applied to several problems . In this paper, we introduced the concept of randomized composable core-sets and showed its effectiveness in maximizing submodular functions in a distributed manner. While we mainly discuss the cardinality constraint in this paper, we expect that the ideas and the proof techniques be applicable to more general packing constraints such as matroid constraints. There are several research problems that are interesting to explore in this line of research.

For the submodular maximization problem, it remains to find a randomized composable core-set of approximation factor 11e1-{1\over e}, or rule out the possibility of constructing such a core-set.

We discussed how the size and multiplicity of the composable core-set can help improve the approximation factor of the algorithms. It would be nice to get tight bounds on the approximation factor for each range of the size of composable core-sets. Moreover, it would be interesting to study the impact of increasing the multiplicity of the core-set on the achievable approximation factors.

While we provided a tight result for the small-size composable core-set problem, the achievable approximation factor is not satisfactory. A natural way to improve this factor is to apply the composable core-set idea iteratively, and achieve an improve approximation factor. Even for composable core-sets of size kk and above, it might be possible to improve the approximation factor by applying such a core-set multiple times. This approach leads to several interesting follow-up questions.

While randomized composable core-sets are applicable to random-order streaming models, applying the proof techniques in this paper may result in stronger approximation factors in pure random-order streaming models (compared to the ones presented here). We leave this problem to future research.

Finally, it would be nice to explore applicability of these ideas on other optimization and graph theoretic problems.

References