Andrew Carlson


I am a Ph.D. student at Carnegie Mellon University in the Machine Learning Department, within the School of Computer Science. I am advised by Tom Mitchell. My research focuses on mining the Web for facts. I am particularly interested in building large-scale, minimally supervised information extraction systems that can populate large repositories of facts from web text. This work is part of the Read the Web project. I proposed a thesis entitled "Coupled Semi-Supervised Learning" in May 2009, and aim to graduate in the summer of 2010. The latest published results from this work are reported in the paper "Coupled Semi-Supervised Learning for Information Extraction," which will be presented at WSDM 2010 in February.

I have also contributed to Professor Mitchell's work on machine learning and fMRI brain imaging data. The results of that work can be found in the Science paper "Predicting Human Brain Activity Associated with the Meanings of Nouns".

I am thankful for support from Yahoo! for my research in 2007-2009 through the PhD Student Fellowship Program. I was an intern with Yahoo!'s Data Mining and Research group, working with Scott Gaffney and Flavian Vasile, in the summer of 2008. Yahoo! has also been extremely helpful by providing CMU with access to its M45 computing cluster. The cluster has enabled web-scale research outside of corporations. Without it, I'd be unable to pursue my current line of research. I wrote a blog post with Justin Betteridge on Yahoo!'s Hadoop blog describing one way in which we've used M45 and Hadoop in our research.

I spent the summer of 2007 on an internship at Google in Pittsburgh working with Charles Schafer. A publication from that work was presented at ECML/PKDD 2008.

In the summer of 2006, I coordinated a reading group on semi-supervised natural language learning research.

Previously, I attended the University of Illinois at Urbana-Champaign, where I received a B.S. in Computer Science. I researched machine learning applied to natural language under Professor Dan Roth, as part of the Cognitive Computation Group. I also released and maintained the SNoW software package.


Coupled Semi-Supervised Learning for Information Extraction
as author at Third ACM International Conference on Web Search and Data Mining - WSDM 2010
Bootstrapping Information Extraction from Semi-structured Web Pages
as author at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Antwerp 2008
together with: Charles Schafer