
Web data extraction for business intelligence: the Lixto approach


Knowledge of market developments and competitor activities is increasingly a critical success factor for enterprises. The World Wide Web provides public domain information that can be retrieved, for example, from Web sites or online shops. Extraction from these semi-structured information sources is mostly done manually and is therefore very time-consuming. This paper describes how public information can be extracted automatically from Web sites, transformed into structured data formats, and used for data analysis in Business Intelligence systems.


Baumgartner, Robert; Fröhlich, Oliver; Gottlob, Georg; Harz, Patrick; Herzog, Marcus; Lehmann, Peter (2005): Web data extraction for business intelligence: the Lixto approach. Datenbanksysteme in Business, Technologie und Web, 11. Fachtagung des GI-Fachbereichs “Datenbanken und Informationssysteme” (DBIS). Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 3-88579-394-6. pp. 48-65. Regular Research Papers. Karlsruhe. March 2-4, 2005.