P029 - Natural language processing and information systems 2003
Listing of P029 - Natural language processing and information systems 2003 by title
1 - 10 of 22
- Conference paper: Accessing financial news using dialogues (Natural language processing and information systems, 2003) Lan, K. C.; Ho, S.
- Conference paper: Approaches to feature selection for document categorization (Natural language processing and information systems, 2003) Kou, Huaizhong; Gardarin, Georges; Zeitouni, Karina
  One of the problems faced by document categorization is that the terms present in the collection of example documents are numerous. From the point of view of coherence between the models used in document categorization, we analyze the frameworks of both the k-NN and NB categorization models and the feature selection problem. Two feature selection algorithms, CBA and IBA, are proposed. Empirical results obtained with k-NN and NB classifiers show that coherence between the models in the categorization system can benefit performance.
- Conference paper: Assessing the effectiveness of the DAML ontologies for the semantic web (Natural language processing and information systems, 2003) Burton-Jones, Andrew; Storey, Veda C.; Sugumaran, Vijayan; Ahluwalia, Punit
  The continued growth of the World Wide Web makes the retrieval of relevant information for a user's query increasingly difficult. Current search engines provide the user with many web pages, but with varying levels of relevance. In response, the Semantic Web has been proposed to retrieve and use more semantic information from the web. Our prior research has developed an intelligent agent to automate the processing of a user's query while taking into account the query's context. The intelligent agent uses WordNet and the DARPA Agent Markup Language (DAML) ontologies to act as surrogates for understanding the context of terms in a user's query. This research develops a set of syntactic, semantic, and pragmatic constructs to assess the effectiveness of the DAML ontologies so that the intelligent agent can select the most useful ontologies. These constructs have been implemented in a tool called the "Ontology Auditor" for use by the intelligent agent.
- Conference paper: Extracting unstructured information from the WWW to support merchant existence in eCommerce (Natural language processing and information systems, 2003) Meziane, Farid; Kasiran, Mohd Khairudin
- Conference paper: From scenarios to KCPM dynamic schemas: Aspects of automatic mapping (Natural language processing and information systems, 2003) Fliedl, Günther; Kop, Christian; Mayr, Heinrich C.
  Scenarios are a very popular means of describing and analyzing behavioral aspects at the level of natural language. In information systems design, they form the basis for a subsequent step of conceptual dynamic modeling. To enhance this step, linguistic instruments prove applicable for transforming scenarios into conceptual schemas of various models. This transformation usually consists of three steps: linguistic analysis, component mapping and schema construction. In this paper we investigate to what extent these steps may be performed automatically in the framework of KCPM, a conceptual predesign model which is used as an interlingua between natural language and arbitrary conceptual models.
- Conference paper: A general proposal to multilingual information access based on syntactic-semantic patterns (Natural language processing and information systems, 2003) Navarro, Borja; Palomar, Manuel; Martínez-Barco, Patricio
  In the last few years, research on textual information systems has developed considerably. The goal is to improve these systems so that information stored in digital form (digital databases, document databases, and so on) can be easily located, processed and accessed. Many applications focus on information access (for example, web search systems such as Google or Altavista). However, these applications have problems when they must access cross-language information, or when they need to present information in a language different from that of the query. This paper explores the use of syntactic-semantic patterns as a method for accessing multilingual information and reviews, for the case of Information Retrieval, where it is possible and useful to employ patterns with respect to the multilingual and interactive aspects. On the one hand, the multilingual aspects studied are those related to accessing documents in languages different from that of the query, as well as the automatic translation of the document, i.e. a pattern-based machine translation system. On the other hand, the paper examines in depth the interactive aspects related to reformulating a query based on the syntactic-semantic pattern of the request.
- Conference paper: Improving the efficacy of approximate searching by personal-name (Natural language processing and information systems, 2003) Camps, Rafael; Daudé, Jordi
  We discuss the design and evaluation of a method for finding information about a person using his/her name as a search key, even if the name is deformed. We present a similarity function that is an edit distance function whose costs are based on the probabilities of the edit operations and depend on the letters involved and their positions. The distance threshold varies with the length of the searched name. The evaluation of the efficacy of approximate matching methods is usually done by subjective relevance judgements. An objective comparison of five methods reveals that the proposed function greatly improves the efficacy: for a recall of 94%, a fallout of 0.2% is obtained.
- Conference paper: Information extraction for reorganizing specifications (Natural language processing and information systems, 2003) Thirunarayan, Krishnaprasad; Berkovich, Aaron; Grace, Steve; Sokol, Dan
- Conference paper: Keyword extraction for text characterization (Natural language processing and information systems, 2003) Renz, Ingrid; Ficzay, Andrea; Hitzler, Holger
  Keywords are a valuable means of characterizing texts. To extract keywords, we propose an efficient and robust, language- and domain-independent approach based on small word parts (quadgrams). The basic algorithm can be improved by re-examining and re-ranking keywords using edit distance (i.e. Levenshtein distance) and an algorithm based on the relativistic addition of velocities (here: weights). For the purpose of evaluation, we compare our approach to frequency-based keyword extraction (example text collection: 45000 intranet documents in German and English).
- Conference paper: Linguistic resource for NLP: Ask for "Die drei Musketiere" and meet "Les trois mousquetaires" (Natural language processing and information systems, 2003) Piton, Odile; Grass, Thierry; Maurel, Denis
  Our work concerns proper names or "named entities" in NLP, in a multilingual context and in the spirit of the "Technolangue" action launched in France in 2002 by the Ministry for Research, the Ministry for Culture and Communication and the Ministry for the Economy, Finances and Industry, with the aim of producing, validating and disseminating linguistic resources. Our objective is to create a multilingual electronic dictionary intended to record such terms in several languages as well as the links between them. We know that toponyms follow strict rules imposed by international or national organizations (UNGEGN) intended to standardize them. This is not true for the other categories of proper names. In order to create an operational tool that is able to help translation, we must study proper names, their grammar and the semantic relations between them. We propose an ontology based on a taxonomy inspired by the works of Bauer and others. We took as a starting point the work of Mel'cuk on lexical functions as well as the work of G. Miller on the WordNet system. The basic model includes semantic or lexical relations. These relations make it possible to locate a proper name in a lexical network and to provide responses on request.
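The keyword extraction abstract above (Renz, Ficzay and Hitzler) mentions re-ranking keywords with Levenshtein distance and with weights combined by the relativistic addition of velocities. The abstract does not say how the two are wired together, so the following is only a minimal sketch under stated assumptions: weights are taken to be normalized to [0, 1], the speed of light is set to 1 in the addition formula, and `merge_keywords` with its `max_distance` parameter is a hypothetical helper, not the authors' algorithm.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic unit-cost edit distance, computed row by row."""
    if len(a) < len(b):
        a, b = b, a  # unit costs are symmetric, so swapping is safe
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def add_weights(w1: float, w2: float) -> float:
    """Relativistic velocity addition with c = 1.

    Assumption: both weights lie in [0, 1], so the result stays in [0, 1].
    """
    return (w1 + w2) / (1.0 + w1 * w2)


def merge_keywords(weighted: dict, max_distance: int = 1) -> list:
    """Hypothetical helper: greedily fold near-duplicate keywords together.

    Keywords are visited in order of decreasing weight; a keyword within
    max_distance edits of an already kept keyword adds its weight to it.
    """
    merged = []  # entries are [keyword, weight]
    for word, weight in sorted(weighted.items(), key=lambda kv: -kv[1]):
        for entry in merged:
            if levenshtein(word, entry[0]) <= max_distance:
                entry[1] = add_weights(entry[1], weight)
                break
        else:
            merged.append([word, weight])
    return sorted(((w, s) for w, s in merged), key=lambda kv: -kv[1])
```

One reason such an addition rule might be attractive is that the combined weight can never exceed 1, however many variants of a keyword are folded together, whereas a plain sum would grow without bound.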