Constrained Logistic Regression for Discriminative Pattern Mining
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Analyzing differences in multivariate datasets is a challenging problem. This topic was earlier studied by finding changes in the distribution differences either in the form of patterns representing conjunction of attribute value pairs or univariate statistical analysis for each attribute in order to highlight the differences. All such methods focus only on change in attributes in some form and do not implicitly consider the class labels associated with the data. In this paper, we pose the difference in distribution in a supervised scenario where the change in the data distribution is measured in terms of the change in the corresponding classification boundary. We propose a new constrained logistic regression model to measure such a difference between multivariate data distributions based on the predictive models induced on them. Using our constrained models, we measure the difference in the data distributions using the changes in the classification boundary of these models. We demonstrate the advantages of the proposed work over other methods available in the literature using both synthetic and real-world datasets.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !