it - Information Technology 66(1) - November 2024

Recent publications

1 - 5 of 5
  • Journal article
    Frontmatter
    (it - Information Technology: Vol. 66, No. 1, 2024) Frontmatter
  • Journal article
    Stylistic classification of cuneiform signs using convolutional neural networks
    (it - Information Technology: Vol. 66, No. 1, 2024) Yugay, Vasiliy; Paliwal, Kartik; Cobanoglu, Yunus; Sáenz, Luis; Gogokhia, Ekaterine; Gordin, Shai; Jiménez, Enrique
    The classification of cuneiform signs according to stylistic criteria is a difficult task on which experts in the field often disagree. This study introduces a new publicly available dataset of cuneiform signs classified according to style, together with Convolutional Neural Network (CNN) approaches that differentiate between cuneiform signs of the two main styles of the first millennium BCE, Neo-Assyrian and Neo-Babylonian. The CNN model reaches an accuracy of 83 % in style classification. This tool has potential implications for the recognition of individual scribes and the dating of undated cuneiform tablets.
  • Journal article
    Towards a word similarity gold standard for Akkadian: creation and model optimization
    (it - Information Technology: Vol. 66, No. 1, 2024) Sahala, Aleksi; Baragli, Beatrice; Lentini, Giulia; Tushingham, Poppy
    We present a word similarity gold standard for Akkadian, a language documented in ancient Mesopotamian sources from the 24th century BCE until the first century CE. The gold standard comprises 300 word pairs ranked for paradigmatic similarity by five Assyriologists working independently. We use the gold standard to tune PMI + SVD and fastText models to improve their performance. We also present a hyper-parametrized PMI + SVD model for building count-based word embeddings that aims to deal with the data sparsity and repetition issues encountered in Akkadian texts. Our model combines Dirichlet smoothing with context distribution smoothing, and uses context similarity weighting to down-sample distortion caused by formulaic litanies and partially or fully duplicated passages.
  • Journal article
    Insights into digital ancient near Eastern studies: introduction
    (it - Information Technology: Vol. 66, No. 1, 2024) Gordin, Shai; Béranger, Marine; Homburg, Timo; Mara, Hubert
  • Journal article
    Sign detection for cuneiform tablets
    (it - Information Technology: Vol. 66, No. 1, 2024) Cobanoglu, Yunus; Sáenz, Luis; Khait, Ilya; Jiménez, Enrique
    Among the many excavated cuneiform tablets, only a small portion has been analyzed by Assyriologists. Learning how to read cuneiform is a lengthy and challenging process that can take years to complete. This work aims to improve the automatic detection of cuneiform signs from 2D images of cuneiform tablets. The results can later be used for NLP tasks such as semantic annotation, word alignment and machine translation to assist Assyriologists in their research. We introduce the largest publicly available annotated dataset of cuneiform signs to date. It comprises 52,102 signs from 315 fully annotated tablets, equating to 512 distinct images. In addition, we have preprocessed and refined four existing datasets, resulting in a comprehensive collection of 88,536 signs. Since some signs are not localized on fully annotated tablets, the total dataset encompasses 593 fully annotated cuneiform tablets, resulting in 654 images. Our efforts to expand this dataset are ongoing. Furthermore, we evaluate two state-of-the-art methods to establish benchmarks in the field. The first is a two-stage supervised sign detection approach that involves (1) the identification of bounding boxes and (2) the classification of each sign within these boxes. The second method employs an object detection model. Given the numerous classes and their varied distribution, the task of cuneiform sign detection poses a significant challenge in machine learning. This paper aims to lay the groundwork for future research, offering both a substantial dataset and initial methodologies for sign detection on cuneiform tablets.
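
Note on the PMI + SVD approach named in the Akkadian word-similarity abstract above: the following is a minimal, generic sketch of count-based embeddings using positive PMI with context distribution smoothing, followed by a truncated SVD. It is not the authors' model; it omits the Dirichlet smoothing and context similarity weighting the abstract mentions. The function names, the toy co-occurrence counts, and the exponent `alpha=0.75` (a commonly used value) are all assumptions made for illustration.

```python
import numpy as np

def ppmi_svd_embeddings(counts, dim=2, alpha=0.75):
    """Hypothetical helper: word vectors from a (words x contexts) count matrix."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    p_wc = counts / total                                # joint P(w, c)
    p_w = counts.sum(axis=1, keepdims=True) / total      # marginal P(w)
    # Context distribution smoothing: raise context counts to the power
    # alpha before normalizing, which dampens PMI's bias toward rare contexts.
    ctx = counts.sum(axis=0) ** alpha
    p_c = (ctx / ctx.sum())[np.newaxis, :]
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))
        # Positive PMI: negative, infinite, and undefined cells become zero.
        ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)
    # Truncated SVD; scaling by sqrt(singular values) is one common choice.
    u, s, _ = np.linalg.svd(ppmi, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy counts (assumed, not from the paper): words 0 and 1 share contexts,
# while words 2 and 3 occur in a different set of contexts.
toy_counts = [
    [4, 4, 0, 0],
    [4, 4, 0, 0],
    [0, 0, 3, 5],
    [0, 1, 5, 3],
]
embeddings = ppmi_svd_embeddings(toy_counts, dim=2)
sim_same = cosine(embeddings[0], embeddings[1])
sim_diff = cosine(embeddings[0], embeddings[2])
```

In this toy setting, words with identical context distributions should receive near-identical vectors, so `sim_same` is expected to exceed `sim_diff`. A gold standard of ranked word pairs, such as the one described in the abstract, could then be used to compare such similarity scores against human judgments when tuning `alpha` and `dim`.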