Representation, Approximation and Learning of Submodular Functions Using Low-rank Decision Trees

author: Pravesh Kothari, Department of Computer Science, University of Texas at Austin
published: May 15, 2014,   recorded: June 2013,   views: 3236


Related Open Educational Resources

Related content

Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Lecture popularity: You need to login to cast your vote.


We study the complexity of approximate representation and learning of submodular functions over the uniform distribution on the Boolean hypercube {0,1}n. Our main result is the following structural theorem: any submodular function is ϵ-close in ℓ2 to a real-valued decision tree (DT) of depth O(1/ϵ2). This immediately implies that any submodular function is ϵ-close to a function of at most 2O(1/ϵ2) variables and has a spectral ℓ1 norm of 2O(1/ϵ2). It also implies the closest previous result that states that submodular functions can be approximated by polynomials of degree O(1/ϵ2) (Cheraghchi et al., 2012). Our result is proved by constructing an approximation of a submodular function by a DT of rank 4/ϵ2 and a proof that any rank-r DT can be ϵ-approximated by a DT of depth 52(r+log(1/ϵ)).

We show that these structural results can be exploited to give an attribute-efficient PAC learning algorithm for submodular functions running in time O(n2)⋅2O(1/ϵ4). The best previous algorithm for the problem requires nO(1/ϵ2) time and examples (Cheraghchi et al., 2012) but works also in the agnostic setting. In addition, we give improved learning algorithms for a number of related settings.

We also prove that our PAC and agnostic learning algorithms are essentially optimal via two lower bounds: (1) an information-theoretic lower bound of 2Ω(1/ϵ2/3) on the complexity of learning monotone submodular functions in any reasonable model (including learning with value queries); (2) computational lower bound of nΩ(1/ϵ2/3) based on a reduction to learning of sparse parities with noise, widely-believed to be intractable. These are the first lower bounds for learning of submodular functions over the uniform distribution.

See Also:

Download slides icon Download slides: colt2013_kothari_trees_01.pdf (4.1 MB)

Help icon Streaming Video Help

Link this page

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

Write your own review or comment:

make sure you have javascript enabled or clear this field: