SEWISE: An ontology-based web information search engine
Gesellschaft für Informatik e.V.
Since the beginning of the 1990s, the World Wide Web (WWW) has rapidly turned the world into a remarkable electronic village, where anyone can publish anything in electronic form and find almost any information they need. The volume of available information is increasing exponentially and comes in different formats, 80% of it being text. Yet it remains hard to find relevant information directly from Web sources. SEWISE is an ontology-based Web information system that supports Web information description and retrieval. Guided by a domain ontology, SEWISE maps text information from various Web sources into one uniform XML structure and makes the semantics hidden in text accessible to programs. The textual information of interest is automatically extracted from various Web sources by Web wrappers, and text mining techniques such as categorization and summarization are then applied to the retrieved text. Finally, text descriptions are built in XML format so that they can be queried directly. In this way SEWISE supports topic-centric Web information search. A SEWISE prototype has been implemented and tested on French financial Web news from several popular sites.
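The pipeline described above (categorize extracted text against a domain ontology, summarize it, store the result as a uniform XML description, then query by topic) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the SEWISE prototype: the toy `ONTOLOGY`, the element names `collection`, `document`, `topic` and `summary`, the keyword-matching categorizer, the leading-sentence summarizer, and the sample news items. The wrapper step is omitted, so the sketch starts from text that is assumed to have been extracted already.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy ontology: concept name -> trigger keywords (illustrative only).
ONTOLOGY = {
    "StockMarket": ["bourse", "cac 40", "action"],
    "Banking": ["banque", "crédit", "taux"],
}


def categorize(text):
    """Return the ontology concepts whose keywords occur in the text.

    Naive case-insensitive substring matching, standing in for a real
    text categorization technique.
    """
    lowered = text.lower()
    return [concept for concept, keywords in ONTOLOGY.items()
            if any(keyword in lowered for keyword in keywords)]


def summarize(text, max_sentences=1):
    """Naive stand-in for summarization: keep the leading sentences."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return ". ".join(sentences[:max_sentences]) + "."


def describe(source, text):
    """Build one XML description (topics plus summary) for an extracted text."""
    doc = ET.Element("document", {"source": source})
    for concept in categorize(text):
        ET.SubElement(doc, "topic").text = concept
    ET.SubElement(doc, "summary").text = summarize(text)
    return doc


def build_collection(items):
    """Map (source, text) pairs from different sites into one XML structure."""
    root = ET.Element("collection")
    for source, text in items:
        root.append(describe(source, text))
    return root


def search(collection, concept):
    """Topic-centric query: sources of all documents tagged with the concept."""
    return [doc.get("source")
            for doc in collection.findall("document")
            if any(topic.text == concept for topic in doc.findall("topic"))]


# Made-up sample items, assumed to come from two different news sites.
news = [
    ("site-a", "La Bourse de Paris recule. Le CAC 40 perd 1%."),
    ("site-b", "La banque centrale maintient son taux. Les marchés attendent."),
]
collection = build_collection(news)
print(ET.tostring(collection, encoding="unicode"))
print(search(collection, "StockMarket"))
```

Because every source ends up in the same XML shape, the query in `search` does not need to know which site a document came from; this is the benefit of a uniform description that the abstract points to. A real system would replace the keyword matcher and the summarizer with proper text mining components.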