Journal article

Towards a word similarity gold standard for Akkadian: creation and model optimization

Document type

Text/Journal Article

Date

2024

Publisher

De Gruyter

Abstract

We present a word similarity gold standard for Akkadian, a language documented in ancient Mesopotamian sources from the 24th century BCE until the first century CE. The gold standard comprises 300 word pairs ranked for paradigmatic similarity by five Assyriologists working independently. We use the gold standard to tune PMI + SVD and fastText models and improve their performance. We also present a hyper-parametrized PMI + SVD model for building count-based word embeddings that aims to deal with the data sparsity and repetition issues encountered in Akkadian texts. Our model combines Dirichlet smoothing with context distribution smoothing, and uses context similarity weighting to down-sample distortion caused by formulaic litanies and partially or fully duplicated passages.
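
The PMI + SVD pipeline named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy corpus, the window size, the pseudo-count of 0.1, the smoothing exponent of 0.75, the clipping of negative PMI values to zero, and the square-root weighting of singular values are all assumptions chosen for illustration. Context similarity weighting is not implemented here.

```python
import numpy as np


def cooccurrence_counts(sentences, window=2):
    """Count symmetric word-context co-occurrences within a fixed window."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for i, w in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[w], index[s[j]]] += 1.0
    return vocab, counts


def smoothed_ppmi(counts, dirichlet=0.1, cds=0.75):
    """Positive PMI with additive (Dirichlet) and context distribution smoothing.

    Both default values are illustrative assumptions, not values from the paper.
    """
    # Dirichlet smoothing: add a small pseudo-count to every cell.
    c = counts + dirichlet
    total = c.sum()
    p_wc = c / total
    p_w = c.sum(axis=1, keepdims=True) / total
    # Context distribution smoothing: raise context counts to the power cds,
    # which flattens the context distribution and dampens PMI for rare contexts.
    ctx = c.sum(axis=0, keepdims=True) ** cds
    p_c = ctx / ctx.sum()
    pmi = np.log(p_wc / (p_w * p_c))
    # Assumption: keep only positive associations (PPMI).
    return np.maximum(pmi, 0.0)


def svd_embeddings(matrix, dim=2):
    """Truncated SVD; rows of the result are the word vectors."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    # Assumption: weight the singular vectors by the square root of the singular values.
    return u[:, :dim] * np.sqrt(s[:dim])


# Toy corpus with placeholder tokens (hypothetical, not Akkadian data).
corpus = [
    ["king", "builds", "temple"],
    ["king", "builds", "wall"],
    ["god", "loves", "temple"],
]
vocab, counts = cooccurrence_counts(corpus, window=1)
vectors = svd_embeddings(smoothed_ppmi(counts), dim=2)
```

With a positive pseudo-count every cell of the count matrix is non-zero, so the logarithm is always finite. Setting `dirichlet=0` would produce `-inf` entries before clipping and a NumPy runtime warning.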

Description

Sahala, Aleksi; Baragli, Beatrice; Lentini, Giulia; Tushingham, Poppy (2024): Towards a word similarity gold standard for Akkadian: creation and model optimization. it - Information Technology: Vol. 66, No. 1. DOI: https://doi.org/10.1515/itit-2023-0065. De Gruyter. ISSN: 2196-7032
