Sunita Sarawagi
homepage: https://www.cse.iitb.ac.in/~sunita/
Description
My topics of interest span several fields, including databases, data mining, machine learning and statistics. My publications give a good idea of my research interests. Some specific problems and projects on which I have worked are listed below.
- World Wide Tables: The goal of this project is to answer table queries by tapping partially structured sources like tables and lists on the web.
- Information Extraction and data integration: Recently, I have been interested in graphical models and their use for various extraction and integration problems. As part of this effort, I have developed a package for Conditional Random Fields (CRF) that can be downloaded from SourceForge.
- ALIAS: This is a prototype of an interesting and fairly compelling application of machine learning techniques like Active Learning to ease the duplicate elimination task that arises in data cleaning.
- DATAMOLD: This is a tool for Information Extraction (more like text segmentation) using learning based on Hidden Markov Models. This software has been licensed by a data cleaning consulting company to solve real-life address cleaning tasks.
- ICube: This is a project on which I worked actively between 1999 and 2001. It is about enhanced mining of multidimensional OLAP products. A web demo of ICube is available.
- New data mining operations: I have worked on temporal data mining. I am currently interested in various multi-class, multi-label and multi-taxonomy learning problems.
- Database mining integration: I have worked on two different aspects of this problem: first, on algorithmic and architectural issues related to expressing association rule mining algorithms in a relational engine; second, on deploying learnt models within a relational engine so as to allow close integration with SQL querying and optimization.
- Some past projects (pre-1996): In the past I have worked on various problems related to multidimensional OLAP indexing and aggregation computation. My PhD thesis was on query optimization and scheduling for tertiary memory databases.
- Ancient projects (pre-1991): I got my first glimpse of research in computer science theory through search problems arising in rectangle cutting and packing.
Lectures:
lecture as author at Research Sessions, 1768 views
invited talk as author at 1st Workshop on Automated Knowledge Based Construction (AKBC), Grenoble 2010, 4482 views
lecture as author at 25th International Conference on Machine Learning (ICML), Helsinki 2008, 5014 views