Clustering with Prior Information
published: Jan. 19, 2010, recorded: December 2009, views: 3845
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
In summary, we have demonstrated analytically that any small (but finite) amount of semi– supervision suppresses the phase transition in cluster detectability for the planted–bisection model, by shifting the detection threshold to its lowest possible value. For graphs where the links within and across the clusters have different weights, we found that semi–supervision leads to a detection threshold that depends on ρ. Furthermore, if J < K, then for ρ → 0+, the detection threshold converges to a value lower (better) from the one obtained via balancing within–cluster and inter–cluster weights. This suggests that for weighted graphs a small [but generic] semi-supervising can be employed for defining the very clustering structure. This definition is non-trivial, since it performs better than the weight-balancing definition. Note also that for weighted graphs the very notion of the detection threshold is not clear a priori, in contrast to unweighted networks, where the only possible definition goes via the connectivity balance α = γ. To illustrate this unclarity, consider a node connected to one cluster via few heavy links, and to another cluster via many light links. To which cluster this node should belong in principle? Our (speculative) answer is that the proper cluster assignment in this case can be defined via semi-supervising.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !