An Intelligent Topic-Specific Crawler Using Degree of Relevance
Sanguk Noh, Youngsoo Choi, Haesung Seo, Kyunghee Choi, Gihyun...

View or download:
songsim.catholic.ac.kr...idealTSC04.pdf

From:  faure.isti.cnr....CP2(Google)pdf

Abstract: Users surfing the Internet need web pages classified into a given topic as accurately as possible. Toward this end, this paper presents a topic-specific crawler that computes the degree of relevance and refines the preliminary set of related web pages using term frequency/document frequency, entropy, and compiled rules. In the experiments, we test our topic-specific crawler in terms of the accuracy of its classification, the crawling efficiency, and the...

Active bibliography (related documents):
0.3:   Using Clustering Methods to Improve Ontology-Based Query.. - De Luca, Nürnberger (2006)
0.3:   Modeling User Interests By Conceptual Clustering - Godoy, Amandi (2005)
0.2:   Ranking Function Optimization For Effective Web Search By Genetic.. - Fox (2004)

Similar documents based on text:
0.2:   Bayesian Update of Recursive Agent Models - Gmytrasiewicz, Noh, Kellogg
0.1:   Rational Communicative Behavior in Anti-Air Defense - Noh (1998)
0.1:   Towards Flexible Multi-Agent Decision-Making Under Time.. - Noh, Gmytrasiewicz (1999)

BibTeX entry:

@misc{ noh-intelligent,
  author = "Sanguk Noh and Youngsoo Choi and Haesung Seo and Kyunghee Choi and others",
  title = "An Intelligent Topic-Specific Crawler Using Degree of Relevance",
  url = "citeseer.comp.nus.edu.sg/707277.html" }
Citations (may not include all citations):
2177   Programs for Machine Learning - Quinlan - 1993
1447   A mathematical theory of communication - Shannon - 1948
463   Term weighting approaches in automatic text retrieval - Salton, Buckley - 1988
233   The CN2 Induction algorithm - Clark, Niblett - 1989
149   Focused crawling: a new approach to topic-specific Web resou.. - Chakrabarti - 1999
139   Machine learning in automated text categorization - Sebastiani - 2002
57   Focused crawling using context graphs - Diligenti - 2000
6   Topic-driven crawlers: Machine learning issues - Menczer - 2002
4   Web classification using support vector machine - Sun, Lim et al. - 2002
3   A large benchmark dataset for web document clustering - Sinka, Corne - 2002
1   Classifying Web Pages Using Adaptive Ontology - Noh - 2003
1   Backprop Package - Tveter - 1996

Documents on the same site (http://faure.isti.cnr.it/~fabrizio/CP2(Google)-pdf.html):
Deliverable Identification Sheet - Project Ref No
How Weak Text Categorizers Can Strengthen Performance.. - Uren, Addis (2001)
Low level information extraction: a Bayesian network based.. - Bouckaert


CiteSeer.IST at NUS - Copyright Penn State and NEC. Hosted by the School of Computing, National University of Singapore.