About
While the machine learning community has primarily focused on analysing the output of a single data source, there have been relatively few attempts to develop a general framework, or heuristics, for analysing several data sources in terms of a shared dependency structure. Learning from multiple data sources (or alternatively, the data fusion problem) is a timely research area. Due to the increasing availability and sophistication of data recording techniques and advances in data analysis algorithms, there exist many scenarios in which it is necessary to model multiple, related data sources, in fields such as bioinformatics, multi-modal signal processing, information retrieval and sensor networks.
The open question is how to analyse data that consist of more than one set of observations (or view) of the same phenomenon. In general, existing methods use a discriminative approach, where a set of features for each data set is found in order to explicitly optimise some dependency criterion. However, a discriminative approach may result in an ad hoc algorithm, may require regularisation to ensure that erroneous shared features are not discovered, and makes it difficult to incorporate prior knowledge about the shared information. A possible way to overcome these problems is a generative probabilistic approach, which models each data stream as a sum of a shared component and a private component that models the within-set variation.
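As a rough illustration of the shared-plus-private idea described above, the following minimal sketch simulates two one-dimensional views that share a single latent variable. Everything in it is an assumption made for illustration and is not taken from any of the workshop talks: the function names `sample_two_views` and `covariance`, the weights `w1` and `w2`, the Gaussian noise, and the scalar setting. Under these assumptions the covariance between the views comes only from the shared component, while the private components contribute only to each view's own variance.

```python
import random


def sample_two_views(n, w1=1.0, w2=0.5, private_scale=1.0, seed=0):
    """Draw n paired samples from a toy shared-plus-private model.

    Each view is the sum of a shared component (weight * z) and a
    private component that is independent across the two views.
    All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    view1, view2 = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)             # shared latent variable
        e1 = rng.gauss(0.0, private_scale)  # private variation, view 1
        e2 = rng.gauss(0.0, private_scale)  # private variation, view 2
        view1.append(w1 * z + e1)
        view2.append(w2 * z + e2)
    return view1, view2


def covariance(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


x1, x2 = sample_two_views(20000)
cross_cov = covariance(x1, x2)  # model value: w1 * w2
var1 = covariance(x1, x1)       # model value: w1**2 + private_scale**2
print(f"cross-covariance ~ {cross_cov:.2f}, variance of view 1 ~ {var1:.2f}")
```

With the default weights, the model's cross-covariance is `w1 * w2 = 0.5` and the variance of the first view is `w1**2 + private_scale**2 = 2.0`; the sample estimates should land close to these values. A realistic model would use vector-valued views and learn the weights from data rather than fixing them.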
In practice, related data sources may exhibit complex co-variation (for instance, audio and visual streams related to the same video), and it is therefore necessary to develop models that impose structured variation within and between data sources, rather than assuming a so-called 'flat' data structure. Additional methodological challenges include determining which information is 'useful' to extract from the multiple data sources, and building models for predicting one data source given the others. Finally, as well as learning from multiple data sources in an unsupervised manner, there is the closely related problem of multitask learning, or transfer learning, where a task is learned from other related tasks.
More information about the workshop: http://web.mac.com/davidrh/LMSworkshop08/
Uploaded videos:
Multiview Clustering via Canonical Correlation Analysis · Dec 20, 2008 · 7323 Views
Multi-View Dimensionality Reduction via Canonical Correlation Analysis · Dec 20, 2008 · 5758 Views
The Double-Barrelled LASSO (Sparse Canonical Correlation Analysis) · Dec 20, 2008 · 4839 Views
Learning Shared and Separate Features of Two Related Data Sets using GPLVMs · Dec 20, 2008 · 4768 Views
Multiview Fisher Discriminant Analysis · Dec 20, 2008 · 6398 Views
Selective Multitask Learning by Coupling Common and Private Representations · Dec 20, 2008 · 3229 Views
Regression Canonical Correlation Analysis · Dec 20, 2008 · 5946 Views
Multiple kernel learning for multiple sources · Dec 20, 2008 · 9342 Views
GP-LVM for Data Consolidation · Dec 20, 2008 · 5304 Views
Two-level infinite mixture for multi-domain data · Dec 20, 2008 · 3030 Views
Probabilistic Models for Data Combination in Recommender Systems · Dec 20, 2008 · 9327 Views
Discussion & Future Directions · Dec 20, 2008 · 3146 Views