
Zur Methode automatisierter Kategorisierungsverfahren der Internetsuchmaschine des Bayern-Portals

dc.contributor.author: Weihs, Erich
dc.contributor.editor: Knetsch, Gerlinde
dc.contributor.editor: Jessen, Karin
dc.contributor.editor: Beckmann, Ines
dc.date.accessioned: 2019-09-20T12:10:09Z
dc.date.available: 2019-09-20T12:10:09Z
dc.date.issued: 2008
dc.description.abstract: Summary. The Bavaria portal, www.bayern.de, uses a "find machine" that covers the data space of public administration and closely related institutions such as chambers of commerce, universities and employment offices. What makes the find machine special is that it divides its result lists by life event. The life events are based on an agreement reached by a working group of the federal and state ministers of the interior. They follow the gender mainstreaming approach, which categorizes by need rather than by a specific organization or a specific field. In total, 650 life events were used for classification, divided into citizen life events and corporate life events. The search engine of the Bavaria portal uses new technological methods to index and process public web offerings (government, authorities, municipalities, universities, etc.) according to the life events of citizens. In addition to semantic classification, the find machine can categorize automatically using the mathematical-statistical support vector machine (SVM) method. The advantage of the SVM method is that the results remain more stable when the data space changes over time. Because the approach evaluates the distributions of the word space, it does not depend on constructing a semantic definition, only on the probability with which terms occur. Another method, which is increasingly being employed, is a meta tag for life event and place that enables direct classification. The search space of the Bavarian search engine is limited to official offerings, municipalities, institutions of higher learning, employment offices, chambers of commerce, etc. In certain cases, additional institutions such as churches are included, for instance to take account of the regional supply of kindergartens. In total, approx. 6,000 URLs are tracked, with approx. 2.5 million indexed pages.
The data material is extremely varied. Along with HTML pages, the engine also reads and completely indexes PDF and DOC formats, among others. This means that documents such as municipal bulletins and other reports running to several hundred pages are collected and indexed in full. While reports can sometimes be thematically delimited section by section, this is often not the case with bulletins such as community magazines. These frequently contain a chronological or editorial mixture of news, reports and statements whose content cannot be assigned in detail to the life event classifications.
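
The abstract mentions a meta tag for life event and place that allows direct classification. A minimal sketch of how such a tag might be read is given here. The tag names `life-event` and `place`, the function `classify_by_meta_tag` and the sample page are hypothetical; the abstract does not state the names or the format the portal actually uses. A page without the tag yields `None`, which a caller could treat as a signal to fall back to statistical classification.

```python
from html.parser import HTMLParser

# Hypothetical meta tag names; the abstract does not give the real ones.
LIFE_EVENT_TAG = "life-event"
PLACE_TAG = "place"


class MetaTagReader(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = attributes.get("content")
        if name and content:
            # Keep the first value seen for each meta name.
            self.meta.setdefault(name, content.strip())


def classify_by_meta_tag(html):
    """Return the declared life event and place, or None if no life event is declared."""
    reader = MetaTagReader()
    reader.feed(html)
    reader.close()
    life_event = reader.meta.get(LIFE_EVENT_TAG)
    if life_event is None:
        return None
    return {"life_event": life_event, "place": reader.meta.get(PLACE_TAG)}


# Made-up sample page for illustration only.
page = """<html><head>
<meta name="life-event" content="Kindergarten">
<meta name="place" content="Augsburg">
</head><body>...</body></html>"""

print(classify_by_meta_tag(page))
```

Reading a declared tag avoids any statistical estimate for pages that carry one, which may be why the abstract describes this route as a direct classification.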
dc.description.uri: http://www.ak-uis.de/download/Abschlussberichte/2008-Dessau-3694.pdf#page=17
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/27813
dc.publisher: Umweltbundesamt, http://www.umweltbundesamt.de
dc.relation.ispartof: Workshop des Arbeitskreises „Umweltdatenbanken / Umweltinformationssysteme“ der Fachgruppe „Informatik im Umweltschutz“
dc.relation.ispartofseries: Workshops "AK Umweltinformationssysteme"
dc.title: Zur Methode automatisierter Kategorisierungsverfahren der Internetsuchmaschine des Bayern-Portals
dc.type: Text/Conference Paper
gi.citation.publisherPlace: Dessau-Roßlau
gi.conference.date: 2008
gi.conference.location: Dessau
gi.conference.sessiontitle: Umweltinformationssysteme: Suchmaschinen und Wissensmanagement - Methoden und Instrumente
