Holz, Florian; Witschel, Hans-Friedrich; Heinrich, Gregor; Heyer, Gerhard; Teresniak, Sven; Eichler, Gerald; Kropf, Peter; Lechner, Ulrike; Meesad, Phayung; Unger, Herwig
2019-01-11
2019-01-11
2010
978-3-88579-259-8
https://dl.gi.de/handle/20.500.12116/19022

We address the problem of evaluating peer-to-peer information retrieval (P2PIR) systems with a semantic overlay structure. The P2PIR community lacks a commonly accepted testbed comparable to what TREC provides for the classic IR community. Classic test collections are problematic in a P2P scenario because they provide no realistic distribution of documents and queries over peers, yet such a distribution is crucial for realistically simulating and evaluating semantic overlays. Conversely, document collections that can be easily distributed (e.g. by exploiting categories or author information) lack both queries and relevance judgments. We therefore propose an evaluation framework that provides a strategy for constructing a P2PIR testbed. It consists of a prescription for content distribution, query generation, and measuring effectiveness without the need for human relevance judgments. The framework can be used with any document collection that contains author information and document relatedness (e.g. references in scientific literature). Author information is used for assigning documents to peers, and relatedness is used for generating queries from related documents. The ranking produced by the P2PIR system is evaluated by comparing it to the ranking of a centralised IR system, using a new evaluation measure related to mean average precision. The combination of realistic content distribution, realistic and automated query generation and distribution, and a meaningful and flexible evaluation measure for rankings offers an improvement over existing P2PIR evaluation approaches.

en
An evaluation framework for semantic search in P2P networks
Text/Conference Paper
1617-5468
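
The abstract names two mechanical steps that a short sketch may make more concrete: assigning documents to peers by author, and scoring a P2P ranking against the ranking of a centralised system. The Python below is only an illustration under stated assumptions. The function names, the one-peer-per-author mapping, and the cutoff `k` are hypothetical, and the scoring function is plain average precision with the centralised top-k treated as relevant, which stands in for the paper's own measure (the abstract says only that it is "related to mean average precision"). Query generation from related documents is left out because the abstract does not specify the procedure.

```python
from collections import defaultdict


def assign_documents_to_peers(doc_authors):
    """Group documents by author, treating each author as one peer (assumption)."""
    peers = defaultdict(set)
    for doc_id, authors in doc_authors.items():
        for author in authors:
            # A co-authored document is replicated on every author's peer.
            peers[author].add(doc_id)
    return dict(peers)


def average_precision_against_reference(p2p_ranking, reference_ranking, k=10):
    """Average precision of p2p_ranking, with the centralised top-k as pseudo-relevant.

    This is standard average precision, not the paper's measure.
    """
    relevant = set(reference_ranking[:k])
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(p2p_ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant)


# Toy data, invented purely for illustration.
doc_authors = {"d1": ["alice"], "d2": ["alice", "bob"], "d3": ["bob"]}
peers = assign_documents_to_peers(doc_authors)

centralised = ["d1", "d2", "d3", "d4"]  # ranking from the reference IR system
distributed = ["d2", "d5", "d1"]        # ranking returned by the P2P system
score = average_precision_against_reference(distributed, centralised, k=2)
```

Averaging this score over all generated queries would give a MAP-style figure. A P2P system that reproduces the centralised top-k in the same order scores 1.0, and one that retrieves none of those documents scores 0.0.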