Alpha-Beta Divergences Discover Micro and Macro Structures in Data
published: Sept. 27, 2015, recorded: July 2015, views: 1790
Although recent work in non-linear dimensionality reduction investigates multiple choices of divergence measure during optimization, little work discusses the direct effects that divergence measures have on visualization. We study this relationship both theoretically and through an empirical analysis over 10 datasets. Our work shows how the α and β parameters of the generalized alpha-beta divergence can be chosen to discover hidden macro-structures (categories, e.g. birds) or micro-structures (fine-grained classes, e.g. toucans). Our method, which generalizes t-SNE, discovers such structure without extensive grid searches over (α,β): our theoretical analysis shows that the structure becomes apparent at particular choices of (α,β) that generalize across datasets. We also discuss efficient parallel CPU and GPU schemes, which are non-trivial because of the tree structures employed in optimization and because the large datasets do not fully fit into GPU memory. Our method runs 20x faster than the fastest published code. We conclude with detailed case studies on the following very large datasets: ILSVRC 2012, a standard computer vision dataset with 1.2M images; SUSY, a particle physics dataset with 5M instances; and HIGGS, another particle physics dataset with 11M instances. This represents the largest published visualization attained by SNE methods. We have open-sourced our visualization code: http://rll.berkeley.edu/absne/.
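The abstract does not spell out the divergence itself. As a rough illustration, the sketch below assumes the standard alpha-beta (AB) divergence from the information-geometry literature, defined for α, β and α+β all non-zero as D(P‖Q) = −1/(αβ) · Σ (pᵢ^α qᵢ^β − α/(α+β) · pᵢ^(α+β) − β/(α+β) · qᵢ^(α+β)). The function names are hypothetical and are not taken from the open-sourced code; the limiting cases (α or β equal to zero) are deliberately left out.

```python
import math


def ab_divergence(p, q, alpha, beta):
    """AB-divergence between two positive vectors (general case only).

    Assumes the standard definition, valid when alpha, beta and
    alpha + beta are all non-zero. The limiting cases need separate
    formulas and are not handled in this sketch.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    if alpha == 0 or beta == 0 or alpha + beta == 0:
        raise ValueError("alpha, beta and alpha + beta must be non-zero")
    s = alpha + beta
    total = 0.0
    for pi, qi in zip(p, q):
        total += (pi ** alpha) * (qi ** beta)
        total -= (alpha / s) * (pi ** s)
        total -= (beta / s) * (qi ** s)
    return -total / (alpha * beta)


def kl_divergence(p, q):
    """Kullback-Leibler divergence, the objective used by plain t-SNE."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

Under this definition, (α, β) = (1, 1) gives half the squared Euclidean distance, and for normalized distributions α = 1 with β close to 0 approaches the KL divergence, which is the sense in which a method built on the AB-divergence can generalize t-SNE.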