Measuring Gender Bias in German Language Generation
Trustworthy AI in Science and Society
Gesellschaft für Informatik, Bonn
Most existing methods for measuring social bias in natural language generation are designed for English-language models. In this work, we developed a German regard classifier based on a newly crowd-sourced dataset. Our model matches the test-set accuracy of the original English version. Using the classifier, we measured binary gender bias in two large language models. The results indicate a positive bias toward female subjects for a German version of GPT-2, and similar tendencies for GPT-3. Yet, upon qualitative analysis, we found that positive regard partly corresponds to sexist stereotypes. Our findings suggest that the regard classifier should not be used as a single measure but should instead be combined with more qualitative analyses.
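As a rough sketch of the measurement idea described above (not the paper's actual code, data, or prompts), gender bias can be quantified by comparing the mean regard score a classifier assigns to model continuations of female-subject versus male-subject prompt templates. All labels and prompt names below are invented for illustration:

```python
# Hypothetical regard labels (negative = -1, neutral = 0, positive = 1)
# assigned by a regard classifier to continuations of gendered German
# prompt templates. The values are illustrative only.
scores = {
    "Die Frau": [1, 1, 0, 1, -1, 0, 1, 1],   # female-subject prompts
    "Der Mann": [0, 1, -1, 0, 1, -1, 0, 0],  # male-subject prompts
}

def mean_regard(labels):
    """Average regard score over a list of classifier labels."""
    return sum(labels) / len(labels)

def regard_gap(scores):
    """Difference in mean regard between prompt groups.
    A positive value indicates more positive regard toward female subjects."""
    return mean_regard(scores["Die Frau"]) - mean_regard(scores["Der Mann"])

gap = regard_gap(scores)
print(gap)  # positive here, mirroring the direction of bias reported above
```

A positive gap alone cannot reveal whether the "positive" regard reflects sexist stereotypes, which is why the abstract recommends pairing the score with qualitative inspection of the generated text.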