Conference paper
Using neighborhood information for automated categorization of web pages
Full-text URI
Document type
Text/Conference Paper
Additional information
Date
2003
Authors
Journal title
Journal ISSN
Volume title
Source
Publisher
Gesellschaft für Informatik e.V.
Abstract
In this paper we discuss several issues related to how expanding a Web document's representation influences the quality of topical categorization of Web pages. We consider expanding a Web page with the text content of its linking pages. We show that naive expansion can pull in too much noise and substantially harm categorization results. We present an approach to automated pruning of linking Web pages. We report that using our approach to form a Web page representation always leads to better results than traditional single-page categorization.
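
The abstract does not say how linking pages are pruned, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a bag-of-words representation and a hypothetical cosine-similarity threshold (`min_similarity`): a linking page contributes its terms to the target page's representation only if it is similar enough to the target. All function and parameter names are invented for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def expand_representation(page_text, linking_texts, min_similarity=0.2):
    """Add term counts of linking pages that pass the similarity filter.

    min_similarity is a hypothetical threshold chosen for illustration;
    setting it to 0.0 keeps every linking page (naive expansion).
    """
    target = bag_of_words(page_text)
    expanded = Counter(target)
    for text in linking_texts:
        neighbor = bag_of_words(text)
        # Assumed pruning rule: drop linking pages unrelated to the target.
        if cosine(target, neighbor) >= min_similarity:
            expanded.update(neighbor)
    return expanded
```

For instance, given a target page about "python web page categorization" and two linking pages, one about categorizing Web page text and one about flights and hotels, the default threshold keeps the first and drops the second, while `min_similarity=0.0` reproduces naive expansion and admits the unrelated terms.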