Towards standardized vectorial resource descriptors on the web
INFORMATIK 2010. Service Science – Neue Perspektiven für die Informatik. Band 2
Regular Research Papers
Gesellschaft für Informatik e.V.
Resources with quantitative properties, such as measurable resources or sources for feature extraction (e.g. fingerprints), play an important role, particularly in scientific areas such as the life sciences, medicine, and the natural sciences. In this paper we propose a similarity-based representation of resources on the Web using so-called Vectorial Resource Descriptors (VRDs). VRDs are standardized data structures that form the basis of Vectorial Web Search. Every VRD contains a feature vector, a Vector Space Identifier (VSI), and further data. In contrast to conventional keyword search, which requires matching of free text, Vectorial Web Search is a well-defined similarity search over numeric data. Users provide a VRD, or only the numeric data to search for (i.e. the feature vector, as a sequence of numbers) together with the VSI. The VSI is an HTTP URI that identifies the vector space of the feature vector and points to a standardized Vector Space Descriptor (VSD). The system therefore knows the valid distance function and the meaning of every dimension (number) of the feature vector. Similarity is quantified using the distance function of the chosen vector space, as specified in the VSD. The smaller the distance, the greater the similarity of a VRD, and the higher its rank in the search result.
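To make the search procedure described above more concrete, the following is a minimal Python sketch, not the standardized format proposed in the paper. The class names, field names, the example VSI URIs under example.org, the in-memory registry, and the choice of Euclidean distance are all assumptions made for illustration. In a real deployment the VSD would presumably be retrieved by dereferencing the VSI over HTTP rather than looked up in a local dictionary.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; an assumed example of a VSD distance function."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass(frozen=True)
class VectorSpaceDescriptor:
    """Hypothetical VSD: names every dimension and fixes the distance function."""
    vsi: str
    dimensions: Tuple[str, ...]
    distance: Callable[[Sequence[float], Sequence[float]], float]


@dataclass(frozen=True)
class VectorialResourceDescriptor:
    """Hypothetical VRD: feature vector, VSI, and further data (here only a URI)."""
    vsi: str
    vector: Tuple[float, ...]
    resource_uri: str


def search(
    query_vector: Sequence[float],
    vsi: str,
    registry: Dict[str, VectorSpaceDescriptor],
    index: List[VectorialResourceDescriptor],
) -> List[Tuple[float, VectorialResourceDescriptor]]:
    """Rank all VRDs of the given vector space by ascending distance to the query."""
    vsd = registry[vsi]  # stands in for dereferencing the VSI (assumption)
    if len(query_vector) != len(vsd.dimensions):
        raise ValueError("query vector does not match the dimensions of the VSD")
    # Only VRDs from the same vector space are comparable.
    candidates = [vrd for vrd in index if vrd.vsi == vsi]
    scored = [(vsd.distance(query_vector, vrd.vector), vrd) for vrd in candidates]
    # Smaller distance means greater similarity, hence a higher rank.
    scored.sort(key=lambda pair: pair[0])
    return scored


# Illustrative data; every URI below is made up.
RGB_VSI = "http://example.org/vs/rgb"
REGISTRY = {
    RGB_VSI: VectorSpaceDescriptor(RGB_VSI, ("red", "green", "blue"), euclidean),
}
INDEX = [
    VectorialResourceDescriptor(RGB_VSI, (0.0, 255.0, 0.0), "http://example.org/r/green"),
    VectorialResourceDescriptor(RGB_VSI, (200.0, 30.0, 30.0), "http://example.org/r/brick"),
    VectorialResourceDescriptor(RGB_VSI, (255.0, 0.0, 0.0), "http://example.org/r/red"),
    VectorialResourceDescriptor("http://example.org/vs/other", (1.0,), "http://example.org/r/x"),
]

if __name__ == "__main__":
    for dist, vrd in search((250.0, 10.0, 10.0), RGB_VSI, REGISTRY, INDEX):
        print(f"{dist:8.2f}  {vrd.resource_uri}")
```

In this sketch the query carries only a feature vector and a VSI, mirroring the second query form described above. VRDs that belong to a different vector space are excluded before ranking, because their distance to the query would not be well defined.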