Browsing by keyword "Similarity measures"
- Text document: The Best of Both Worlds: Combining Hand-Tuned and Word-Embedding-Based Similarity Measures for Entity Resolution (BTW 2019, 2019). Chen, Xiao; Campero Durand, Gabriel; Zoun, Roman; Broneske, David; Li, Yang; Saake, Gunter.
  Recently, word embedding has become a beneficial technique for diverse natural language processing tasks, especially after the successful introduction of several popular neural word embedding models, such as word2vec, GloVe, and FastText. Entity resolution (ER), i.e., the task of identifying digital records that refer to the same real-world entity, has also been shown to benefit from word embedding. However, the use of word embeddings does not lead to a one-size-fits-all solution, because it cannot provide an accurate result for values without any semantic meaning, such as numerical values. In this paper, we propose to combine general word embedding with traditional hand-picked similarity measures for solving ER tasks, with the aim of selecting the most suitable similarity measure for each attribute based on its properties. We provide some guidelines on how to choose suitable similarity measures for different types of attributes and evaluate our proposed hybrid method on both synthetic and real datasets. Experiments show that a hybrid method reliant on correctly selecting the required similarity measures can outperform methods that adopt purely traditional or purely word-embedding-based similarity measures.
- Journal article: Quantitative Methods for Similarity in Description Logics (KI - Künstliche Intelligenz: Vol. 31, No. 1, 2017). Ecke, Andreas.
  Description logics (DLs) are a family of logic-based knowledge representation languages used to describe the knowledge of an application domain and reason about it in a formally well-defined way. However, all classical DLs have in common that they can only express exact knowledge, and correspondingly only allow exact inferences. In practice, though, knowledge is rarely exact. Many definitions have exceptions or are vaguely formulated in the first place, and people might be interested not only in exact answers, but also in alternatives that are “close enough”. We are interested in how to express that something is “close enough”, and how to integrate this notion into the formalism of DLs. To this end, we employ the notion of similarity and dissimilarity measures. We look at how useful measures can be defined in the context of DLs and at two particular applications. Relaxed instance queries use a similarity measure in order to give not just the exact answers to a query, but all answers that are reasonably similar. Prototypical definitions, on the other hand, use a measure of dissimilarity or distance between concepts in order to allow the definition of, and reasoning with, concepts that capture not just those individuals that satisfy exactly the stated properties, but also those that are “close enough”.
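
The idea of a relaxed instance query mentioned in the second abstract can be illustrated with a minimal Python sketch. It is not the method from the article: individuals are reduced to flat sets of atomic concept names, and Jaccard similarity stands in for a proper concept similarity measure for DLs. The names `jaccard`, `relaxed_instances`, and the sample data are hypothetical and exist only for illustration.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def relaxed_instances(
    query: FrozenSet[str],
    individuals: Dict[str, FrozenSet[str]],
    threshold: float,
) -> List[str]:
    """Return, in sorted order, every individual whose similarity to the
    query is at least `threshold`. A threshold of 1.0 keeps only exact
    matches (under this simplified set-based model)."""
    return sorted(
        name
        for name, features in individuals.items()
        if jaccard(query, features) >= threshold
    )


# Hypothetical toy data: each individual is described by atomic concept names.
individuals = {
    "tweety": frozenset({"Bird", "Flies", "LaysEggs"}),
    "pingu": frozenset({"Bird", "LaysEggs"}),
    "rex": frozenset({"Mammal"}),
}
query = frozenset({"Bird", "Flies", "LaysEggs"})

exact = relaxed_instances(query, individuals, threshold=1.0)
relaxed = relaxed_instances(query, individuals, threshold=0.6)
```

With a threshold of 1.0 only the exactly matching individual is returned; lowering the threshold also admits individuals that are "close enough" under the chosen measure. A real DL setting would compare structured concept descriptions rather than flat sets.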