Listing by author "Starlinger, Johannes"
1 - 3 of 3
- Conference paper: Experiences from developing the domain-specific entity search engine GeneView (Datenbanksysteme für Business, Technologie und Web (BTW) 2027, 2013). Thomas, Philippe; Starlinger, Johannes; Leser, Ulf. GeneView is a semantic search engine for the Life Sciences. Unlike traditional search engines, GeneView analyzes texts upon import to recognize and properly handle biomedical entities, relationships between those entities, and the structure of documents. This enables a number of advanced features required to work effectively with scientific texts, such as entity disambiguation, ranking of documents by entity content, linking to structured knowledge about entities, and user-friendly highlighting of entities. As of now, GeneView indexes approximately 21.4M abstracts and 358K full texts with more than 200M entities of 11 different types and more than 100K relationships. In this paper, we describe the architecture underlying the system with a focus on the complex pipeline of advanced NLP and information extraction tools necessary for achieving the above functionality. We also discuss open challenges in developing and maintaining a semantic search engine over a large (though not web-scale) corpus.
- Journal article: How to improve information extraction from German medical records (it - Information Technology: Vol. 59, No. 5, 2017). Starlinger, Johannes; Kittner, Madeleine; Blankenstein, Oliver; Leser, Ulf. Vast amounts of medical information are still recorded as unstructured text. The knowledge contained in this textual data has great potential to improve routine clinical care, to support clinical research, and to advance the personalization of medicine. To access this knowledge, the underlying data has to be semantically integrated, and information extraction from clinical documents is an essential prerequisite for that integration.
- Conference paper: PiPa: custom integration of protein interactions and pathways (INFORMATIK 2011 – Informatik schafft Communities, 2011). Arzt, Sebastian; Starlinger, Johannes; Arnold, Oliver; Kröger, Stefan; Jaeger, Samira; Leser, Ulf. Information about proteins and their relationships to each other is a common source of input for many areas of Systems Biology, such as protein function prediction, relevance-ranking of disease genes, and simulation of biological networks. While there are numerous databases that focus on collecting such data from, for instance, literature curation, expert knowledge, or experimental studies, their individual coverage is often low, making the building of an integrated protein-protein interaction database a pressing need. Accordingly, a number of such systems have emerged. In most cases, however, their content is only accessible over the web on a per-protein basis, which renders them useless for automatic analysis of sets of proteins. Even when the databases are available for download, certain data sources are often missing (e.g. because redistribution is forbidden by license), and update intervals are sporadic. We present PiPa, a system for the integration of protein-protein interactions (PPI) and pathway data. PiPa is a stand-alone tool for loading and updating a large number of common PPI and pathway databases into a homogeneously structured relational database. PiPa features a graphical administration tool for monitoring its state, triggering updates, and computing statistics on the content. Due to its modular architecture, adding new data sources is easy. The software is freely available from the authors.