## General Graph Refinement with Polynomial Delay

author: Jan Ramon, KU Leuven
published: Sept. 5, 2007,   recorded: August 2007,   views: 3257

# Report a problem or upload files

If you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.

# Description

Of many graph mining algorithms an essential component is its procedure for enumerating graphs such that no two enumerated graphs are isomorphic. All frequent subgraph miners require such a component [14, 5, 1, 6], but also other data mining algorithms, such as for instance [7] require such a procedure, which is often called a “refinement operator” or a “join operator” depending on the enumeration (or can- didate generation) strategy. Consequently, in the data min- ing literature a huge amount of algorithms have been pre- sented for enumerating graphs without isomorphisms. None of these algorithms, however, have been shown to run with polynomial delay, i.e. when enumerating the graphs, even without accessing any data, in the worst case it may take exponential time to list the next graph. From a theoreti- cal perspective, this is a disappointing result, as Goldberg [3, 4] showed already in the early nineties that enumerating all connected graphs without isomorphs can be done with O(n6) polynomial delay. In this paper, we address this prob- lem. We show that there is a graph enumeration algorithm that can be used in graph mining algorithms and has O(n5) polynomial delay. Thus, we not only propose an algorithm to enumerate (the interesting) parts of the class of connected graphs, but also improve the earlier bound on graph enumer- ation that was shown by Goldberg. At the same time, our algorithm is general enough to be also applicable for other types of graphs than connected graphs. For instance, it can also be used for enumerating unconnected graphs, planar graphs, outerplanar graphs, and many other types of graphs, with polynomial delay. The datastructure that is produced by the algorithm allows membership queries to be executed in polynomial time: for any given graph, independent of its representation, we can determine in polynomial time if it is part of a set of enumerated subgraphs. In particular, in our algorithm it is not necessary to compute a canonical form. However, if this is desirable, we can use our algorithm to enumerate graphs in several canonical forms, for instance, the DFS and BFS canonical forms used in gSpan [14] and MoFA [1].