On the Stratification of Multi-Label Data

author: Grigorios Tsoumakas, Department of Informatics, Aristotle University of Thessaloniki
published: Oct. 3, 2011,   recorded: September 2011,   views: 4117




Stratified sampling is a sampling method that takes into account the existence of disjoint groups within a population and produces samples where the proportion of these groups is maintained. In single-label classification tasks, groups are differentiated based on the value of the target variable. In multi-label learning tasks, however, where there are multiple target variables, it is not clear how stratified sampling could or should be performed. This paper investigates stratification in the multi-label data context. It considers two stratification methods for multi-label data and empirically compares them, along with random sampling, on a number of datasets and based on a number of evaluation criteria. The results reveal some interesting conclusions with respect to the utility of each method for particular types of multi-label datasets.
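
To make the idea of "maintaining group proportions" concrete, here is a minimal Python sketch of stratified fold assignment. It is an illustration only, not necessarily either of the two methods the paper evaluates. The function names `stratified_folds` and `labelset_key` are hypothetical. For the multi-label case the sketch assumes one simple choice of group: each distinct combination of labels is treated as its own group. That assumption tends to break down when many label combinations occur only once or twice.

```python
import random
from collections import defaultdict


def stratified_folds(groups, k, seed=0):
    """Split example indices into k folds so that every group is spread
    as evenly as possible across the folds.

    groups: one hashable group identifier per example.
    Returns a list of k lists of example indices.
    """
    rng = random.Random(seed)

    # Collect the indices belonging to each group (first-seen order).
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)

    folds = [[] for _ in range(k)]
    next_fold = 0
    for members in by_group.values():
        rng.shuffle(members)  # random choice of which member goes where
        for idx in members:
            # Deal members out round-robin; the counter carries over between
            # groups so that overall fold sizes also stay balanced.
            folds[next_fold].append(idx)
            next_fold = (next_fold + 1) % k
    return folds


def labelset_key(labels):
    """Assumption for this sketch: the group of a multi-label example is
    its full set of labels, stored as a sorted tuple so it is hashable."""
    return tuple(sorted(labels))


if __name__ == "__main__":
    # Single-label case: the group is simply the class value.
    single = ["a"] * 6 + ["b"] * 3
    print(stratified_folds(single, k=3))

    # Multi-label case under the label-combination assumption.
    multi = [{"x"}, {"x", "y"}, {"x"}, {"y"}, {"x", "y"}, {"y"}]
    print(stratified_folds([labelset_key(s) for s in multi], k=2))
```

With six examples of class "a" and three of class "b" split into three folds, each fold receives two "a" examples and one "b" example, which mirrors the 2:1 ratio of the full dataset. A purely random split gives no such guarantee.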

See Also:

Download slides: ecmlpkdd2011_tsoumakas_stratification_01.pdf (667.6 KB)
