Abstract: This paper shows how a text classifier's need for labeled training documents can be reduced by taking advantage of a large pool of unlabeled documents. We modify the Query-by-Committee (QBC) method of active learning to use the unlabeled pool for explicitly estimating document density when selecting examples for labeling. Active learning is then combined with Expectation-Maximization (EM) in order to "fill in" the class labels of those documents that remain unlabeled. Experimental results show that...
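The density-weighted selection step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how disagreement or density is computed, so the sketch assumes that committee disagreement is the average KL divergence of each member's class posterior from the committee mean, that density is the average cosine similarity of a document to the rest of the pool, and that the two are combined by multiplication. All function names (`kl_to_mean`, `cosine`, `density`, `select_query`) are hypothetical.

```python
import math

def kl_to_mean(member_posteriors):
    """Assumed disagreement measure: average KL divergence of each
    committee member's class posterior from the committee mean."""
    k = len(member_posteriors)
    n_classes = len(member_posteriors[0])
    mean = [sum(p[c] for p in member_posteriors) / k for c in range(n_classes)]
    total = 0.0
    for p in member_posteriors:
        # Terms with p[c] == 0 contribute nothing; mean[c] > 0 whenever p[c] > 0.
        total += sum(p[c] * math.log(p[c] / mean[c])
                     for c in range(n_classes) if p[c] > 0)
    return total / k

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def density(i, pool):
    """Assumed density estimate: mean similarity of document i to the
    other documents in the unlabeled pool."""
    sims = [cosine(pool[i], pool[j]) for j in range(len(pool)) if j != i]
    return sum(sims) / len(sims) if sims else 0.0

def select_query(pool, committee_posteriors):
    """Return the index of the pool document with the highest
    disagreement-times-density score (the combination rule is an assumption)."""
    scores = [kl_to_mean(committee_posteriors[i]) * density(i, pool)
              for i in range(len(pool))]
    return max(range(len(pool)), key=scores.__getitem__)
```

Here `pool` is a list of term-count vectors and `committee_posteriors[i]` holds one class distribution per committee member for document `i`. Under these assumptions, a document on which the committee disagrees but which lies in a sparse region of the pool scores lower than an equally contested document in a dense region.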
Cited by:
Text Classification for Intelligent - Portfolio Management Young-Woo (2002)
Investigating Semantic Knowledge for Text Learning - Anupriya Ankolekar Forbes (2003)
Proceedings of the 9th Conference on Computational Natural.. - Pages Ann Arbor
Similar documents (at the sentence level):
50.8%: Pool-Based Active Learning for Text Classification - Nigam, McCallum (1998)
17.2%: Employing EM in Pool-Based Active Learning for Text.. - McCallum, Nigam (1998)
12.2%: Using Unlabeled Data to Improve Text Classification - Nigam (2001)
Active bibliography (related documents):
0.6: Improving Text Classification by Shrinkage in a.. - McCallum, Rosenfeld, .. (1998)
0.1: Using EM to Classify Text from Labeled and Unlabeled Documents - Nigam (1998)
0.1: Neural Networks - Jordan, Bishop (1996)
Similar documents based on text:
0.6: Text Classification by Bootstrapping with Keywords, EM and.. - McCallum, Nigam (1999)
0.5: A Comparison of Event Models for Naive Bayes Text Classification - McCallum, Nigam (1998)
0.5: A Parallel Learning Algorithm for Text Classification - Kruengkrai, Jaruskulchai (2002)
Related documents from co-citation:
8: Text categorization with Support Vector Machines: Learning with many relevant fe.. - Joachims - 1998
7: Active learning with committees for text categorization - Liere, Tadepalli - 1997
6: A sequential algorithm for training text classifiers: Corrigendum and additional.. - Lewis - 1995
BibTeX entry:
Andrew McCallum and Kamal Nigam. Employing EM and pool-based active learning for text classification. In ICML-98, 1998. http://citeseer.comp.nus.edu.sg/68306.html
@misc{ mccallum98employing,
author = "A. McCallum and K. Nigam",
title = "Employing EM and pool-based active learning for text classification",
text = "Andrew McCallum and Kamal Nigam. Employing EM and pool-based active learning
for text classification. In ICML-98, 1998.",
year = "1998",
url = "citeseer.comp.nus.edu.sg/68306.html" }
Citations (may not include all citations):
376: Text categorization with Support Vector Machines: Learning w.. - Joachims - 1998
168: Distributional clustering of English words - Pereira, Tishby et al. - 1993
149: Learning to extract symbolic knowledge from the World Wide W.. - Craven, DiPasquo et al. - 1998
140: A comparison of event models for naive Bayes text classifica.. - McCallum, Nigam - 1998
135: A sequential algorithm for training text classifiers: Corrig.. - Lewis - 1995
130: A probabilistic analysis of the Rocchio algorithm with TFIDF.. - Joachims - 1997
111: Active learning with statistical models - Cohn, Ghahramani et al. - 1996
97: A comparison of two learning algorithms for text categorizat.. - Lewis, Ringuette - 1994
80: Learning to classify text from labeled and unlabeled documen.. - Nigam, McCallum et al. - 1998
77: Neural network exploration using optimal experiment design - Cohn - 1994
76: Supervised learning from incomplete data via an EM approach - Ghahramani, Jordan - 1994
38: Active learning with committees for text categorization - Liere, Tadepalli - 1997
34: A mixture of experts classifier with learning based on both .. - Miller, Uyar - 1997
25: The effect of unlabeled samples in reducing the small sample.. - Shahshahani, Landgrebe - 1994
9: and Rubin - Dempster, Laird - 1977
2: the optimality of the simple Bayesian classifier under zero-.. - from, via et al. - 1997
2: A sequential algorithm for training text classifiers - ECML-, Gale - 1994
1: Committee-based sampling for training probabilistic classifi.. - AAAI-, Engelson - 1995
1: and Tishby - Learning, Freund et al. - 1997
1: Data Mining and Knowledge Discovery - using, by et al. - 1997
Documents on the same site (http://www.cs.cmu.edu/People/knigam/):
Learning to Extract Symbolic Knowledge from the World.. - Craven, DiPasquo.. (1998)
Building Domain-Specific Search Engines with Machine .. - McCallum, Nigam.. (1999)