Conference paper

Expected utility of content blocks in web content extraction

Document type

Text/Conference Paper

Date

2006

Publisher

Gesellschaft für Informatik e.V.

Abstract

This paper discusses the possible application of three new concepts in web content extraction: utility assessment, utility annealing, and dynamic aggregated document generation. After an analysis of the state of the art in web content extraction, the results of a survey among Polish managers are presented. The discussion covers a web content extraction system and possible extensions that may help tackle the information overload problem; these extensions go beyond the current state of the art. Utility assessment takes an economic view of the value of information, while utility annealing removes content blocks that cover information already acquired from other content blocks. Combining existing content block extraction technology with the new concepts proposed in the paper makes it possible to generate aggregated documents dynamically.
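
To make the idea of utility annealing and aggregated document generation more concrete, here is a minimal sketch in Python. It is not the method from the paper: the abstract does not say how utility is computed, so the sketch assumes a simple word-overlap novelty score. The names `ContentBlock`, `novelty`, `anneal`, `aggregate`, and the `min_novelty` threshold are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ContentBlock:
    """A block of text extracted from a web page (hypothetical structure)."""
    source: str
    text: str


def terms(text: str) -> set[str]:
    """Lower-cased words with surrounding punctuation stripped."""
    cleaned = (word.strip(".,;:!?").lower() for word in text.split())
    return {word for word in cleaned if word}


def novelty(block: ContentBlock, seen: set[str]) -> float:
    """Share of the block's terms that the reader has not yet seen.

    Assumption: novelty stands in for utility; the paper may define it differently.
    """
    block_terms = terms(block.text)
    if not block_terms:
        return 0.0
    return len(block_terms - seen) / len(block_terms)


def anneal(blocks: list[ContentBlock], min_novelty: float = 0.5) -> list[ContentBlock]:
    """Keep blocks in order, dropping those that mostly repeat earlier blocks."""
    seen: set[str] = set()
    kept: list[ContentBlock] = []
    for block in blocks:
        if novelty(block, seen) >= min_novelty:
            kept.append(block)
            seen |= terms(block.text)
    return kept


def aggregate(blocks: list[ContentBlock]) -> str:
    """Join the surviving blocks into one aggregated document."""
    return "\n\n".join(f"[{block.source}] {block.text}" for block in blocks)
```

With three blocks where the second only repeats terms from the first, `anneal` would keep the first and third, and `aggregate` would join them into one document. A real system would likely need a richer notion of "information already acquired" than word overlap.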

Description

Kowalkiewicz, Marek (2006): Expected utility of content blocks in web content extraction. Business Information Systems – 9th International Conference on Business Information Systems (BIS 2006). Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 2-88579-179-X. pp. 342-352. Regular Research Papers. Klagenfurt, Austria. May 31-June 2, 2006
