Differentiable Sparse Coding
Description
Prior work has shown that sparse coding with a sparsity-promoting prior, such as a Laplacian (L1) prior, yields features that appear biologically plausible and are empirically useful. We show how smoother priors can preserve the benefits of these sparse priors while adding stability to the maximum a posteriori (MAP) estimate, which makes the estimate more useful for prediction problems. Additionally, we show how to calculate the derivative of the MAP estimate efficiently with implicit differentiation. One prior that can be differentiated this way is KL-regularization. We demonstrate its effectiveness on a wide variety of applications and find that online optimization of the parameters of the KL-regularized model can significantly improve prediction performance.
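To make the implicit-differentiation idea concrete, here is a minimal NumPy sketch, not the lecture's implementation. It assumes a squared reconstruction loss with an unnormalized KL penalty on a positive code w, which is one reading of the "Squared Loss, KL prior" example in the slide outline. The names `B` (basis), `lam` (regularization weight), `p` (prior vector), and the damped Newton solver are all illustrative assumptions. Because the objective is smooth for w > 0, the gradient vanishes at the MAP estimate, and differentiating that stationarity condition with respect to the input x gives the Jacobian of the estimate as a single linear solve.

```python
import numpy as np

def objective(w, B, x, lam, p):
    """Squared reconstruction loss plus unnormalized KL(w || p) penalty (assumed form)."""
    r = B @ w - x
    return 0.5 * (r @ r) + lam * np.sum(w * np.log(w / p) - w + p)

def gradient(w, B, x, lam, p):
    return B.T @ (B @ w - x) + lam * np.log(w / p)

def hessian(w, B, lam):
    return B.T @ B + lam * np.diag(1.0 / w)

def map_estimate(B, x, lam, p, max_iter=100, tol=1e-12):
    """Damped Newton solver (an illustrative choice) for the MAP code w* > 0."""
    w = p.copy()
    for _ in range(max_iter):
        g = gradient(w, B, x, lam, p)
        if np.linalg.norm(g) < tol:
            break
        d = -np.linalg.solve(hessian(w, B, lam), g)
        f0, slope, t = objective(w, B, x, lam, p), g @ d, 1.0
        # Backtrack until the step keeps w positive and passes an Armijo
        # decrease test (with a tiny slack to absorb round-off).
        while t > 1e-12:
            w_new = w + t * d
            if np.all(w_new > 0) and (
                objective(w_new, B, x, lam, p) <= f0 + 1e-4 * t * slope + 1e-12
            ):
                w = w_new
                break
            t *= 0.5
        else:
            break  # line search failed; return the current iterate
    return w

def map_jacobian(B, w_star, lam):
    """dw*/dx obtained by implicit differentiation of the stationarity condition."""
    # Stationarity: B^T (B w* - x) + lam * log(w* / p) = 0.
    # Differentiating in x: H (dw*/dx) - B^T = 0, so dw*/dx = H^{-1} B^T.
    return np.linalg.solve(hessian(w_star, B, lam), B.T)

# Hypothetical toy problem: 5-dimensional input, 3 basis vectors.
rng = np.random.default_rng(0)
B = rng.normal(size=(5, 3))
x = rng.normal(size=5)
p = np.full(3, 0.1)
lam = 0.5

w_star = map_estimate(B, x, lam, p)
J = map_jacobian(B, w_star, lam)  # shape (3, 5): one row per code coefficient
```

The Jacobian costs one linear solve at the optimum and does not require differentiating through the solver's iterations. An L1 penalty would not fit this sketch directly, since its gradient is undefined wherever a coefficient is exactly zero.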
Slides
| Time | Slide |
| 0:00 | Differentiable Sparse Coding |
| 0:25 | 100,000 ft View |
| 1:18 | 10,000 ft view |
| 2:31 | Sparse Coding |
| 2:43 | As a combination of factors |
| 2:57 | Sparse coding uses optimization |
| 3:40 | Sparse vectors |
| 4:01 | Example: X=Handwritten Digits |
| 4:50 | Optimization vs. Projection part1 |
| 5:30 | Optimization vs. Projection part2 |
| 5:35 | Generative Model |
| 6:31 | Sparse Approximation part1 |
| 6:46 | Sparse Approximation part2 |
| 7:11 | Example: Squared Loss + L1 |
| 8:04 | L1 Sparse Coding |
| 8:39 | Differentiable Sparse Coding |
| 8:55 | L1 Regularization is Not Differentiable |
| 9:04 | Why is this unsatisfying? |
| 9:24 | Problem #1: Instability |
| 10:17 | Problem #2: No closed‐form Equation |
| 10:36 | Solution: Implicit Differentiation |
| 10:56 | Example: Squared Loss, KL prior |
| 11:02 | Handwritten Digit Recognition part1 |
| 11:19 | Handwritten Digit Recognition part2 |
| 11:28 | Handwritten Digit Recognition part3 |
| 11:44 | Handwritten Digit Recognition part4 |
| 11:47 | KL Maintains Sparsity |
| 12:26 | KL adds Stability |
| 13:27 | Performance vs. Prior |
| 13:32 | KL adds Stability |
| 14:40 | Performance vs. Prior |
| 15:09 | Classifier Comparison |
| 15:22 | Comparison to other algorithms |
| 15:34 | Transfer to English Characters part1 |
| 15:48 | Comparison to other algorithms |
| 16:10 | Transfer to English Characters part1 |
| 16:18 | Transfer to English Characters part2 |
| 16:39 | Transfer to English Characters part3 |
| 16:41 | Transfer to English Characters part4 |
| 16:52 | Text Application part1 |
| 17:24 | Text Application part2 |
| 17:38 | Text Application part3 |
| 17:40 | Movie Review Sentiment |
| 18:05 | Future Work |
| 18:32 | Future Work: Convex Sparse Coding |
| 19:03 | Questions |
| 19:43 | Questions |