Correlation Search in Graph Databases
published: Aug. 14, 2007, recorded: August 2007, views: 11297
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Correlation mining has gained great success in many application domains for its ability to capture the underlying dependency between objects. However, the research of correlation mining from graph databases is still lacking despite the fact that graph data, especially in various scientific domains, proliferate in recent years. In this paper, we propose a new problem of correlation mining from graph databases, called Correlated Graph Search (CGS). CGS adopts Pearson’s correlation coefficient as a correlation measure to take into consideration the occurrence distributions of graphs. However, the problem poses significant challenges, since every subgraph of a graph in the database is a candidate but the number of subgraphs is exponential. We derive two necessary conditions which set bounds on the occurrence probability of a candidate in the database. With this result, we design an efficient algorithm that operates on a much smaller projected database and thus we are able to obtain a significantly smaller set of candidates. To further improve the efficiency, we develop three heuristic rules and apply them on the candidate set to further reduce the search space. Our extensive experiments demonstrate the effectiveness of our method on candidate reduction. The results also justify the efficiency of our algorithm in mining correlations from large real and synthetic datasets.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !