## Online PCA with Spectral Bounds

author: Edo Liberty, Yahoo! Research
published: Aug. 20, 2015,   recorded: July 2015,   views: 2241
Categories

# Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.

# Description

This paper revisits the online PCA problem. Given a stream of $n$ vectors $x_t \in \R^d$ (columns of $X$) the algorithm must output $y_t \in \R^\ell$ (columns of $Y$) before receiving $x_{t+1}$. The goal of online PCA is to simultaneously minimize the target dimension $\ell$ and the error $\|X - (XY^\pinv)Y\|^2$. We describe two simple and deterministic algorithms. The first, receives a parameter $\Delta$ and guaranties that $\|X - (XY^\pinv)Y\|^2$ is not significantly larger than $\Delta$. It requires a target dimension of $\ell = O(k/\eps)$ for any $k,\eps$ such that $\Delta \ge \eps\sigma_1^2 + \sigma_{k+1}^2$. The second receives $k$ and $\eps$ and guaranties that $\|X - (XY^\pinv)Y\|^2 \le \eps\sigma_1^2 + \sigma_{k+1}^2$. It requires a target dimension of $O( k\log n/\eps^2)$. Different models and algorithms for Online PCA were considered in the past. This is the first that achieves a bound on the spectral norm of the residual matrix.