Distribution-free testing algorithms for monomials with a sublinear number of queries

Elya Dolev & Dana Ron, Tel-Aviv University

Property testing of (Boolean) functions (“standard/uniform” version)

f : {0,1}^n → {0,1} - the tested function

F - family of functions (e.g. linear functions)

Given a distance parameter ε and query access to f

f(x)

• If f ∈ F, then accept w.p. ≥ 2/3

• If dist(f,F) > ε then reject w.p. ≥ 2/3, where dist(f,F) = min_{g∈F}{dist(f,g)} and dist(f,g) = Pr_{x∼U}[f(x) ≠ g(x)]

Property testing of (Boolean) functions (distribution-free version)

f : {0,1}^n → {0,1} - the tested function

F - family of functions (e.g. linear functions)

D - (unknown) underlying distribution

Given a distance parameter ε, access to examples distributed by D and query access to f

f(x)

• If f ∈ F, then accept w.p. ≥ 2/3

• If dist_D(f,F) > ε then reject w.p. ≥ 2/3, where dist_D(f,F) = min_{g∈F}{dist_D(f,g)} and dist_D(f,g) = Pr_{x∼D}[f(x) ≠ g(x)]

Inspired by the dist-free PAC learning model [Valiant]
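To make the difference between the two distance notions concrete, here is a minimal sketch that computes dist(f,g) by brute-force enumeration, once under the uniform distribution and once under an explicit distribution D. The function name `dist`, the dictionary representation of D, and the particular D are illustrative assumptions, not part of the talk.

```python
from itertools import product

def dist(f, g, n, D=None):
    """Pr_{x~D}[f(x) != g(x)]. D maps points (bit tuples) to probabilities;
    D=None means the uniform distribution over {0,1}^n."""
    if D is None:
        points = list(product((0, 1), repeat=n))
        D = {x: 1.0 / len(points) for x in points}
    return sum(p for x, p in D.items() if f(x) != g(x))

n = 3
f = lambda x: x[0] & x[1]          # monomial x1 AND x2
g = lambda x: x[0] & x[1] & x[2]   # monomial x1 AND x2 AND x3

# f and g differ only on the point 110.
uniform_dist = dist(f, g, n)                 # 1/8 under U
D = {(1, 1, 0): 0.9, (0, 0, 0): 0.1}         # hypothetical D concentrated on 110
skewed_dist = dist(f, g, n, D)               # 0.9 under D
```

The same pair of functions is close under U and far under D, which is why a tester that works for the uniform distribution does not automatically work in the distribution-free setting.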

(Dist-free) Testing and Learning

Dist-free testing was initially considered in [Goldreich,Goldwasser,R]. Observed that testing is no harder than (proper) learning (in particular, dist-free+queries).

Q1: When is standard/dist-free testing easier than learning?

Q2: What is relation btwn complexity of standard and dist-free testing?

Testing and Learning

Quite a few classes for which standard testing is easier than learning (under the unif. dist. + queries):

• Linear functions [Blum,Luby,Rubinfeld]

• Low-degree polynomials [Rubinfeld&Sudan]

• Singletons, monomials, small monotone DNF [Parnas,R,Samorodnitsky]

• Monotone functions [Ergun,Kannan,Kumar,Rubinfeld,Viswanathan] [Dodis,Goldreich,Lehman,Raskhodnikova,R,Samorodnitsky]

• Small juntas [Fischer,Kindler,R,Safra,Samorodnitsky]

• Small decision lists, decision trees, DNF (general) [Diakonikolas,Lee,Matulef,Onak,Rubinfeld,Servedio,Wan]

• Linear thresh. functions [Matulef,O’Donnell,Rubinfeld,Servedio]

• . . .

Fewer positive results for dist-free testing [Halevy,Kushilevitz]x2. Tends to be more challenging.

Background on distribution-free testing

One of the main positive (and general) results: if class has standard tester and can be self-corrected, then have dist-free tester [Halevy&Kushilevitz].

In particular gives dist-free testers for linear functions and low-degree polynomials.

What about other classes of interest (e.g., from learning point of view) which don’t have self-correctors?

Background on distribution-free testing

What about other classes of interest?

[Glasner&Servedio] considered question for monomials (monotone/general), decision lists, linear thresh. func.

Prove that every dist-free tester must perform Ω((n/log(n))^{1/5}) queries (for const. ε), in contrast to standard testing of these classes, where there is no dependence on n (and poly dependence on 1/ε).

Shows that strong dependence on n is unavoidable, but can we get some sublinear dependence on n? (Dist-free learning + queries requires linear dependence [Turan])

Our Results

We give a positive answer to the question for monomials, both monotone and general.

The complexity of our dist-free testing algorithms is O(n^{1/2} log(n)/ε).

Standard vs. dist-free testing of monomials

When the underlying distribution is uniform (standard testing), if f is a k-monomial, then Pr[f(x)=1] = 2^{-k}, and so can effectively consider only monomials where k = O(log(1/ε)).

This is not generally true in dist-free case. Specifically, lower bound of [GS] constructs functions that depend on many variables and underlying dist. D helps to “hide non-monomiality”.
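As a quick sanity check of the 2^{-k} claim, the sketch below counts satisfying points of a monotone k-monomial by enumeration. The helper name `accept_fraction` and the choice n = 6 are illustrative assumptions.

```python
from itertools import product

def accept_fraction(k, n):
    """Fraction of x in {0,1}^n on which the monotone k-monomial
    x1 AND ... AND xk evaluates to 1."""
    ones = sum(1 for x in product((0, 1), repeat=n) if all(x[:k]))
    return ones / 2 ** n

# Under the uniform distribution the fraction halves with every added variable.
fractions = {k: accept_fraction(k, 6) for k in range(1, 5)}
```

Under a general distribution D nothing forces Pr_{x∼D}[f(x)=1] to shrink with k, so this shortcut is unavailable to a distribution-free tester.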

Note: dist-free testing for (monotone) k-monomials when k is fixed can be done using exp(k) samples+queries (combine [PRS] and [HK])

Dist-free testing of monotone monomials

Let MM denote the class of monotone monomials (over n variables).

Def of the violation hypergraph H_f of a function f:

- Its vertex set is {0,1}^n;

- Each (hyper)edge is a subset e={y^0,y^1,…,y^t} where f(y^0)=0 and f(y^j)=1 for every j>0, such that there is no g in MM consistent with f on e.

Example: y^0=010, y^1=011, y^2=110 (f(y^0)=0, f(y^1)=f(y^2)=1)

- f(y^0)=0: x1 or x3 must be in monomial

- f(y^1)=1: x1 cannot be in monomial

- f(y^2)=1: x3 cannot be in monomial

(Notation: Z(y) = {i : y_i = 0})
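The edge condition can be checked mechanically. A monotone monomial that is 1 on every y^j (j>0) cannot contain any index in Z(y^j), and to be 0 on y^0 it must contain some index in Z(y^0); so no consistent monomial exists exactly when Z(y^0) is contained in the union of the Z(y^j). The sketch below encodes this; the function names and the bit-string representation of points are assumptions made for illustration.

```python
def zeros(y):
    """Z(y): the 1-based indices i with y_i = 0; y is a bit string like '010'."""
    return {i + 1 for i, bit in enumerate(y) if bit == "0"}

def is_violation_edge(y0, one_points):
    """True iff no monotone monomial is 0 on y0 and 1 on every point in
    one_points, i.e. iff Z(y0) is a subset of the union of the Z(y^j)."""
    forbidden = set().union(*(zeros(y) for y in one_points))
    return zeros(y0) <= forbidden

# The example from the slide: x1 and x3 are both ruled out, yet one is required.
edge = is_violation_edge("010", ["011", "110"])
# With only y^1 = 011 the monomial x3 is still consistent, so this is no edge.
no_edge = is_violation_edge("010", ["011"])
```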

Dist-free testing of monotone monomials

Def of the violation hypergraph H_f of a function f:

- Its vertex set is {0,1}^n;

- Each (hyper)edge is a subset e={y^0,y^1,…,y^t} where f(y^0)=0 and f(y^j)=1 for every j>0, s.t. there is no g in MM consistent with f on e (equivalently, Z(y^0) ⊆ ∪_{j>0} Z(y^j)).

By def, if f is in MM then no edges in H_f.

Lemma: If dist_D(f,MM) > ε, then D(C) > ε for every vertex cover C of H_f.

Testing algorithm tries to find an edge in H_f.

Dist-free testing of monotone monomials

Testing algorithm tries to find an edge in H_f.

Notation: for Z ⊆ [n], y(Z) has all coordinates in Z equal 0, and others 1 (e.g., y({1,3}) = 0101)

Basic building block: procedure that given y ∈ f^{-1}(0) searches for index j s.t. y_j=0 and f(y({j}))=0 (i.e. x_j must be in monomial if f in MM).

Procedure performs binary search.

- Starts with Z = Z(y).

- In each iteration partitions Z to two equal parts Z1, Z2, and queries y(Z1) and y(Z2).

- Continues with Zi s.t. f(y(Zi))=0 (if f(y(Z1))=f(y(Z2))=1 then {y(Z),y(Z1),y(Z2)} is an edge so can reject).

- Stops when |Z|=1.
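The binary search can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: points are bit tuples, indices are 1-based, the names `point` and `search` are invented here, and the second half is queried only when the first half evaluates to 1 (the slide queries both halves, which gives the same outcome). Treating an all-ones y with f(y)=0 as a failure is also an assumption of this sketch.

```python
def point(Z, n):
    """y(Z): 0 on the (1-based) coordinates in Z, 1 elsewhere."""
    return tuple(0 if i in Z else 1 for i in range(1, n + 1))

def search(f, y):
    """Given y with f(y) = 0, return an index j with y_j = 0 and
    f(y({j})) = 0, or None if the search fails (an edge
    {y(Z), y(Z1), y(Z2)} was found and the tester may reject)."""
    n = len(y)
    Z = [i for i in range(1, n + 1) if y[i - 1] == 0]   # Z = Z(y)
    if not Z:
        return None      # assumption: y = 1^n with f(y) = 0 is itself a violation
    while len(Z) > 1:
        half = len(Z) // 2
        Z1, Z2 = Z[:half], Z[half:]
        if f(point(set(Z1), n)) == 0:
            Z = Z1
        elif f(point(set(Z2), n)) == 0:
            Z = Z2
        else:
            return None  # f(y(Z1)) = f(y(Z2)) = 1 while f(y(Z)) = 0
    return Z[0]

monomial = lambda x: x[0] & x[2]      # x1 AND x3, a monotone monomial
disjunction = lambda x: x[0] | x[1]   # x1 OR x2, not a monomial

j = search(monomial, (0, 1, 0, 0))        # finds a variable of the monomial
failed = search(disjunction, (0, 0, 1, 1))  # both halves evaluate to 1
```

Each call makes at most about 2·log2(n) queries, since Z at least halves in every iteration.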

Dist-free testing of monotone monomials

Testing algorithm for MM

- Obtain sample T of Θ(n^{1/2}/ε) points dist. ∼ D.

- For each y in T s.t. f(y)=0 run search proc. on y.

- If search failed for some y then reject (and halt): found edge {y(Z),y(Z1),y(Z2)}. Otherwise, let J be union of all indices returned.

- Obtain sample T’ of Θ(n^{1/2}/ε) points dist. ∼ D.

- If exists y’ in T’ s.t. f(y’)=1 and Z(y’) ∩ J ≠ ∅ then reject (found edge {y({j}),y’}), o.w. accept.
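Putting the steps together, a minimal end-to-end sketch of the tester might look like this. The constant `c` in the sample size, the `sample` callback that draws one point from D, and all function names are hypothetical choices for illustration; the talk only specifies sample sizes up to Θ(·).

```python
import math
import random

def point(Z, n):
    """y(Z): 0 on the (1-based) coordinates in Z, 1 elsewhere."""
    return tuple(0 if i in Z else 1 for i in range(1, n + 1))

def search(f, y):
    """Binary search from a 0-point y; returns an index j or None on failure."""
    n = len(y)
    Z = [i for i in range(1, n + 1) if y[i - 1] == 0]
    if not Z:
        return None
    while len(Z) > 1:
        half = len(Z) // 2
        Z1, Z2 = Z[:half], Z[half:]
        if f(point(set(Z1), n)) == 0:
            Z = Z1
        elif f(point(set(Z2), n)) == 0:
            Z = Z2
        else:
            return None
    return Z[0]

def mm_tester(f, n, eps, sample, c=4):
    """Return True (accept) or False (reject). `sample()` draws a point
    from the unknown D; c is a hypothetical constant."""
    m = math.ceil(c * math.sqrt(n) / eps)
    J = set()
    for _ in range(m):                      # first sample T
        y = sample()
        if f(y) == 0:
            j = search(f, y)
            if j is None:
                return False                # found edge {y(Z), y(Z1), y(Z2)}
            J.add(j)
    for _ in range(m):                      # second sample T'
        y2 = sample()
        if f(y2) == 1 and any(y2[j - 1] == 0 for j in J):
            return False                    # found edge {y({j}), y'}
    return True

rng = random.Random(0)
n = 4
uniform = lambda: tuple(rng.randint(0, 1) for _ in range(n))
accepts_monomial = mm_tester(lambda x: x[0] & x[2], n, 0.1, uniform)
```

For any monotone monomial the sketch accepts regardless of the sample, matching the one-sided error of the algorithm: the search always lands on a variable of the monomial, and every 1-point has all of those variables set to 1.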

Dist-free testing of monotone monomials

Testing algorithm for MM

- Obtain sample T of Θ(n^{1/2}/ε) points dist. ∼ D.

- For each y in T s.t. f(y)=0 run search proc. on y.

- If search failed for some y then reject (and halt). Otherwise, let J be union of all indices returned.

- Obtain sample T’ of Θ(n^{1/2}/ε) points dist. ∼ D.

- If exists y’ in T’ s.t. f(y’)=1 and Z(y’) ∩ J ≠ ∅ then reject, o.w. accept.

Query complexity of alg: |T|·log(n) + |T’| = O(n^{1/2} log(n)/ε)

If f in MM, alg always accepts.

If dist_D(f,MM) > ε then prove that rejects w.p. ≥ 2/3. Prove contrapositive: If f is accepted w.p. > 1/3 then can construct vertex cover C of H_f s.t. D(C) ≤ ε, implying that dist_D(f,MM) ≤ ε.

Dist-free testing of general monomials

First, modify notion of violation hypergraph H_f: each edge {y^0,y^1,…,y^t} still satisfies f(y^0)=0, f(y^j)=1, j>0, but now ∩_{j>0} Z(y^j) ⊆ Z(y^0) and ∩_{j>0} O(y^j) ⊆ O(y^0) (where O(y) = {i : y_i = 1}).

Next, binary search is performed on y in f^{-1}(0) but “w.r.t.” w in f^{-1}(1). Search finds index j s.t. f(w’)=0 for w’ that differs from w only on j’th coordinate. (In monotone case, implicitly w = 1^n.)

After performing search on O(n^{1/2}/ε) sample points in f^{-1}(0) (w.r.t. same w) and obtaining set J of “relevant indices”, take additional sample and see if contains y in f^{-1}(1) s.t. y_j ≠ w_j for some j in J.
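One way to read the search “w.r.t.” w is as the same binary search run over the coordinates where y and w differ, flipping coordinates of w instead of zeroing coordinates of 1^n. The sketch below follows that reading; it is an interpretation of the slide, and the names `flip` and `search_relative` as well as the failure rule (both halves evaluate to 1) are assumptions carried over from the monotone case.

```python
def flip(w, Z):
    """Copy of w with the (1-based) coordinates in Z flipped."""
    return tuple(1 - b if i in Z else b for i, b in enumerate(w, start=1))

def search_relative(f, y, w):
    """Given f(y) = 0 and f(w) = 1, return an index j with y_j != w_j such
    that flipping coordinate j of w yields a 0-point of f, or None if the
    search fails."""
    Z = [i for i in range(1, len(y) + 1) if y[i - 1] != w[i - 1]]
    if not Z:
        return None            # cannot happen when f(y) != f(w)
    while len(Z) > 1:
        half = len(Z) // 2
        Z1, Z2 = Z[:half], Z[half:]
        if f(flip(w, set(Z1))) == 0:
            Z = Z1
        elif f(flip(w, set(Z2))) == 0:
            Z = Z2
        else:
            return None
    return Z[0]

f = lambda x: x[0] & (1 - x[2])     # general monomial: x1 AND (NOT x3)
w = (1, 0, 0, 1)                    # a 1-point of f
j1 = search_relative(f, (0, 1, 1, 1), w)   # lands on x1
j3 = search_relative(f, (1, 1, 1, 1), w)   # lands on x3
```

With w = 1^n and a monotone f, flipping the coordinates in Z of w gives exactly y(Z), so this reduces to the monotone search.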

Summary and Open problems

Give sublinear (Õ(n^{1/2})) algorithms for dist-free testing of monotone/general monomials. (Alg for general monomials extends alg for monotone monomials.)

Two natural questions:

• What is exact complexity of dist-free testing of monomials? (Lower bound of [GS] is Ω̃(n^{1/5}))

• What about other classes studied by [GS]? (Decision lists and linear threshold functions.)

Thanks

Dist-free testing of monotone monomials

If f is accepted w.p. > 1/3 then can construct vertex cover C of H_f s.t. D(C) ≤ ε, implying that dist_D(f,MM) ≤ ε.

First put in C all (very few) points y ∈ f^{-1}(0) for which binary search would fail.

For each other y ∈ f^{-1}(0) let j(y) be index found by binary search (which is a det. proc.). For set J, let Y(J) = {y’ ∈ f^{-1}(1) : Z(y’) ∩ J ≠ ∅}.