Stemming strategies for European languages
10th International Conference on Innovative Internet Community Systems (I2CS) – Jubilee Edition 2010 –
Regular Research Papers
Gesellschaft für Informatik e.V.
In this paper, we describe and evaluate different general stemming approaches for the French, Portuguese (Brazilian), German and Hungarian languages. Based on the CLEF test-collections, we demonstrate that light stemming approaches are quite effective for the French, Portuguese and Hungarian languages, and perform reasonably well for the German language. Variations in mean average precision among the different stemmers are also evaluated and are sometimes found to be statistically significant.
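For readers unfamiliar with the evaluation measure named in the abstract, the following is a minimal sketch of how mean average precision (MAP) is commonly computed over a set of queries. It is not the paper's evaluation code. The function names, the document and query identifiers, and the convention of dividing by the total number of relevant documents are assumptions made for illustration.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision for one query.

    `ranked` is the system's result list, best first; `relevant` is the set
    of documents judged relevant. Both are hypothetical inputs.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only at relevant documents.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(runs.get(query_id, []), relevant)
        for query_id, relevant in qrels.items()
    )
    return total / len(qrels)


# Toy example with made-up identifiers (not CLEF data).
example_runs = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["d5", "d6"]}
example_qrels = {"q1": {"d1", "d3"}, "q2": {"d6"}}
example_map = mean_average_precision(example_runs, example_qrels)
```

Comparing two stemmers in this setting amounts to computing MAP for each stemmer's runs against the same relevance judgments, then testing whether the per-query differences are statistically significant.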