Employing EM and Pool-Based Active Learning for Text Classification (1998)
Andrew Kachites McCallum, Kamal Nigam

Abstract: This paper shows how a text classifier's need for labeled training documents can be reduced by taking advantage of a large pool of unlabeled documents. We modify the Query-by-Committee (QBC) method of active learning to use the unlabeled pool for explicitly estimating document density when selecting examples for labeling. Active learning is then combined with Expectation-Maximization (EM) to "fill in" the class labels of the documents that remain unlabeled. Experimental results show that...
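
The density-weighted selection step described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: committee disagreement is measured here as the mean KL divergence of each member's class distribution from the committee average, and document density is taken as a precomputed per-document weight. The function names (`kl_divergence`, `disagreement`, `select_query`) are hypothetical, and the EM step is not covered.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def disagreement(member_predictions):
    """Mean KL divergence of each committee member from the committee average.

    member_predictions: one class distribution per committee member.
    (Assumed disagreement measure; not taken from the abstract.)
    """
    k = len(member_predictions)
    n_classes = len(member_predictions[0])
    mean = [sum(m[c] for m in member_predictions) / k for c in range(n_classes)]
    return sum(kl_divergence(m, mean) for m in member_predictions) / k

def select_query(pool_predictions, density):
    """Index of the pool document with the highest density-weighted disagreement.

    pool_predictions[i]: committee predictions for unlabeled document i.
    density[i]: assumed precomputed density estimate for document i.
    """
    scores = [density[i] * disagreement(preds)
              for i, preds in enumerate(pool_predictions)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy pool: two committee members, two classes, three unlabeled documents.
pool_predictions = [
    [[0.9, 0.1], [0.8, 0.2]],  # members mostly agree
    [[0.9, 0.1], [0.1, 0.9]],  # strong disagreement, but an outlier
    [[0.7, 0.3], [0.3, 0.7]],  # moderate disagreement in a dense region
]
density = [1.0, 0.1, 1.0]
chosen = select_query(pool_predictions, density)
```

With these toy numbers, the third document is chosen over the second even though the committee disagrees more on the second, because the density weight discounts the outlier. With uniform density the selection reduces to plain disagreement-based QBC.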

BibTeX entry:

Andrew McCallum and Kamal Nigam. Employing EM and pool-based active learning for text classification. In ICML-98, 1998. http://citeseer.comp.nus.edu.sg/68306.html

@inproceedings{ mccallum98employing,
  author = "A. McCallum and K. Nigam",
  title = "Employing {EM} and pool-based active learning for text classification",
  booktitle = "ICML-98",
  year = "1998",
  url = "http://citeseer.comp.nus.edu.sg/68306.html" }


