Online Clustering of High-Dimensional Trajectories under Concept Drift

produced by: Data & Web Mining Lab
author: Georg Krempl, Institut für Statistik und Operations Research, University of Graz
published: Nov. 30, 2011,   recorded: September 2011,   views: 3154
Categories

Slides

Related Open Educational Resources

Related content

Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Lecture popularity: You need to login to cast your vote.
  Bibliography

Description

Historical transaction data are collected in many applications, e.g., patient histories recorded by physicians and customer transactions collected by companies. An important question is the learning of models upon the primary objects (patients, customers) rather than the transactions, especially when these models are subjected to drift. We address this problem by combining advances of online clustering on multivariate data with the trajectory mining paradigm. We model the measurements of each individual primary object (e.g. its transactions), taken at irregular time intervals, as a trajectory in a high-dimensional feature space. Then, we cluster individuals with similar trajectories to identify sub-populations that evolve similarly, e.g. groups of customers that evolve similarly or groups of employees that have similar careers. We assume that the multivariate trajectories are generated by drifting Gaussian Mixture Models. We study (i) an EM-based approach that clusters these trajectories incrementally as a reference method that has access to all the data for learning, and propose (ii) an online algorithm based on a Kalman filter that efficiently tracks the trajectories of Gaussian clusters. We show that while both methods approximate the reference well, the algorithm based on a Kalman filter is faster by one order of magnitude compared to the EM-based approach.

See Also:

Download slides icon Download slides: ecmlpkdd2011_krempl_trajectories_01.pdf (790.0 KB)


Help icon Streaming Video Help

Link this page

Would you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !

Write your own review or comment:

make sure you have javascript enabled or clear this field: