P029 - Natural language processing and information systems 2003
Browsing P029 - Natural language processing and information systems 2003 by publication date
1 - 10 of 22
- Conference paper: Speech input and output technology - State of the art and selected applications (Natural language processing and information systems, 2003). Fellbaum, Klaus-Rüdiger. This paper deals with aspects of speech communication, with an emphasis on speech dialogues in which one communication partner is a computer. After some basic definitions, we discuss the speech components of a man-machine dialogue, namely speech recognition and speech synthesis. It will be shown that a computer is still far from matching the human ability to recognise or understand speech.
- Conference paper: Assessing the effectiveness of the DAML ontologies for the semantic web (Natural language processing and information systems, 2003). Burton-Jones, Andrew; Storey, Veda C.; Sugumaran, Vijayan; Ahluwalia, Punit. The continued growth of the World Wide Web makes the retrieval of relevant information for a user's query increasingly difficult. Current search engines provide the user with many web pages, but with varying levels of relevance. In response, the Semantic Web has been proposed to retrieve and use more semantic information from the web. Our prior research has developed an intelligent agent to automate the processing of a user's query while taking into account the query's context. The intelligent agent uses WordNet and the DARPA Agent Markup Language (DAML) ontologies to act as surrogates for understanding the context of terms in a user's query. This research develops a set of syntactic, semantic, and pragmatic constructs to assess the effectiveness of the DAML ontologies so that the intelligent agent can select the most useful ontologies. These constructs have been implemented in a tool called the "Ontology Auditor" for use by the intelligent agent.
- Conference paper: From scenarios to KCPM dynamic schemas: Aspects of automatic mapping (Natural language processing and information systems, 2003). Fliedl, Günther; Kop, Christian; Mayr, Heinrich C. Scenarios are a very popular means for describing and analyzing behavioral aspects on the level of natural language. In information systems design, they form the basis for a subsequent step of conceptual dynamic modeling. To enhance this step, linguistic instruments prove applicable for transforming scenarios into conceptual schemas of various models. This transformation usually consists of three steps: linguistic analysis, component mapping and schema construction. In this paper we investigate to what extent these steps may be performed automatically in the framework of KCPM, a conceptual predesign model which is used as an interlingua between natural language and arbitrary conceptual models.
- Conference paper: Improving the efficacy of approximate searching by personal-name (Natural language processing and information systems, 2003). Camps, Rafael; Daudé, Jordi. We discuss the design and evaluation of a method to find information about a person, using his/her name as a search key, even if the name contains deformations. We present a similarity function that is an edit distance function with costs based on the probabilities of the edit operations, which in turn depend on the letters involved and their position. The distance threshold varies with the length of the searched name. The evaluation of the efficacy of approximate matching methods is usually done by subjective relevance judgements. An objective comparison of five methods reveals that the proposed function greatly improves the efficacy: for a recall of 94%, a fallout of 0.2% is obtained.
- Conference paper: Ontology-driven discourse analysis in GenIE (Natural language processing and information systems, 2003). Cimiano, Philip
- Conference paper: Approaches to feature selection for document categorization (Natural language processing and information systems, 2003). Kou, Huaizhong; Gardarin, Georges; Zeitouni, Karina. One of the problems faced by document categorization is that the terms present in the collection of example documents are numerous. From the point of view of coherence between the models used in document categorization, we analyze the frameworks of both the k-NN and NB categorization models and the feature selection problem. Two feature selection algorithms, CBA and IBA, are proposed. Empirical results obtained with k-NN and NB classifiers show that coherence between the models in the categorization system can improve performance.
- Conference paper: Keyword extraction for text characterization (Natural language processing and information systems, 2003). Renz, Ingrid; Ficzay, Andrea; Hitzler, Holger. Keywords are a valuable means for characterizing texts. In order to extract keywords we propose an efficient and robust, language- and domain-independent approach which is based on small word parts (quadgrams). The basic algorithm can be improved by re-examining and re-ranking keywords using edit distance (i.e. Levenshtein distance) and an algorithm based on the relativistic addition of velocities (here: weights). For the purpose of evaluation, we compare our approach to frequency-based keyword extraction (exemplary text collection: 45000 intranet documents in German and English).
- Conference paper: A model of USENET newsgroups dynamics: Implementation and results (Natural language processing and information systems, 2003). Martinovic, M.; Sampath, G.; Wagner, R.; Briening, S. We present an implementation of the multilevel text processing model of discussions in USENET groups proposed earlier (NLDB '02 Proceedings). In the statistical processing phase, a discussion thread is SGML tagged to include the relevant information about parent-child relationships among the postings as well as other metadata of postings and threads. This tagged output is then processed by a generic information retrieval system. Various relevant metrics that measure properties of discussions (such as thread focus, relevance of posting, discussion density, etc.) are defined and computed. The subsequent semantic component (utilizing tools like electronic lexicons, POS taggers and parsers) has been implemented to work in a modular fashion to allow inclusion or exclusion of some of its subcomponents. The user may tune this module to its minimal level to process the semantics of individual words only, or up to its maximal level to include words with their full contexts. We also present evaluation data assessing the performance of the system with or without some of its modules.
- Conference paper: Accessing financial news using dialogues (Natural language processing and information systems, 2003). Lan, K. C.; Ho, S.
- Conference paper: On detection of malapropisms by multistage collocation testing (Natural language processing and information systems, 2003). Bolshakov, Igor A.; Gelbukh, Alexander. A malapropism is a (real-word) error in a text consisting of the unintended replacement of one content word by another existing content word that is similar in sound but semantically incompatible with the context, thus destroying text cohesion, e.g.: "they travel around the word". We present an algorithm for malapropism detection and correction based on evaluating cohesion. As a measure of the semantic compatibility of words we consider their ability to form syntactically linked and semantically admissible word combinations (collocations), e.g.: "travel (around the) world". With this, text cohesion at a content word is measured as the number of collocations it forms with the words in its immediate context. We detect malapropisms as words forming no collocations in the context. To test whether two words can form a collocation, we consider two types of resources: a collocation DB and an Internet search engine, e.g., Google. We illustrate the proposed method by classifying, tracing, and evaluating several English malapropisms.