Mining for the Most Certain Predictions from Dyadic Data
published: Sept. 14, 2009, recorded: June 2009, views: 3551
In several applications involving regression or classification, it is important not only to make predictions but also to assess how accurate or reliable each individual prediction is. This matters most when, because of finite resources or domain requirements, one wants to base decisions only on the most reliable predictions rather than on the entire set. This paper introduces novel and effective ways of ranking predictions by their accuracy for problems involving large-scale, heterogeneous data with a dyadic structure, i.e., where the independent variables can be naturally decomposed into three groups associated with two sets of elements and their combination. These approaches model the data with a collection of localized models that are learnt while simultaneously partitioning (co-clustering) the data. For regression this leads to the concept of "certainty lift". We also develop a robust predictive modeling technique that identifies and models only the most coherent regions of the data, giving high predictive accuracy on the selected subset of response values. Extensive experimentation on real-life datasets highlights the utility of our proposed approaches.
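As a rough illustration of the idea of ranking predictions by reliability with localized models over co-clusters, the sketch below fits one constant model (the block mean) per co-cluster and ranks every cell by the training error of its block. It is a simplified sketch, not the method from the lecture: the row and column cluster labels are assumed to be given rather than learnt jointly with the models, the block-constant model stands in for whatever local models the authors use, and all names (`block_stats`, `most_certain_cells`, the toy matrix `Z`) are hypothetical.

```python
import numpy as np

def block_stats(Z, row_labels, col_labels, k, l):
    """Fit a constant model per co-cluster; return block means and block MSEs."""
    means = np.zeros((k, l))
    mses = np.zeros((k, l))
    for g in range(k):
        for h in range(l):
            # Select the sub-matrix belonging to row cluster g and column cluster h.
            block = Z[np.ix_(row_labels == g, col_labels == h)]
            means[g, h] = block.mean()
            mses[g, h] = ((block - means[g, h]) ** 2).mean()
    return means, mses

def most_certain_cells(mses, row_labels, col_labels, fraction):
    """Return (rows, cols) of the cells whose co-cluster has the lowest error."""
    # Expand the per-block error to a per-cell error matrix.
    cell_mse = mses[row_labels][:, col_labels]
    order = np.argsort(cell_mse, axis=None, kind="stable")
    keep = order[: int(fraction * order.size)]
    return np.unravel_index(keep, cell_mse.shape)

# Toy dyadic data (assumed for illustration): 6 row entities x 4 column entities.
# Blocks (0, 0) and (1, 1) are coherent; blocks (0, 1) and (1, 0) are noisy.
row_labels = np.array([0, 0, 0, 1, 1, 1])
col_labels = np.array([0, 0, 1, 1])
Z = np.array([
    [5.0, 5.1, 1.0, 9.0],
    [4.9, 5.0, 8.0, 2.0],
    [5.1, 4.9, 3.0, 7.0],
    [2.0, 8.0, 3.0, 3.1],
    [9.0, 1.0, 2.9, 3.0],
    [4.0, 6.0, 3.1, 2.9],
])

means, mses = block_stats(Z, row_labels, col_labels, k=2, l=2)
rows, cols = most_certain_cells(mses, row_labels, col_labels, fraction=0.5)
print(np.round(mses, 3))
print(sorted(zip(rows.tolist(), cols.tolist())))
```

On this toy matrix the half of the cells ranked as most certain all come from the two coherent blocks, which is the behaviour one would hope for when acting only on the most reliable predictions.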