Digital Library of the Gesellschaft für Informatik e.V.
  • Lecture Notes in Informatics
  • Proceedings
  • German Conference on Bioinformatics
  • P115 - GCB 2007 - German Conference on Bioinformatics 2007

A general paradigm for fast, adaptive clustering of biological sequences

Author(s):
Reinert, Knut;
Bauer, Markus;
Döring, Andreas;
Klau, Gunnar W.;
Halpern, Aaron L.
Abstract
There are numerous methods that compute clusterings of biological sequences based on pairwise distances. This necessitates the computation of O(n^2) sequence comparisons. Users usually want to apply the most sensitive distance measure, which is normally also the most expensive in terms of runtime. This poses a problem if the number of sequences is large or the computation of the measure is slow. In this paper we present a general heuristic that speeds up distance-based clustering methods considerably while compromising little on the accuracy of the results. The speedup comes from using fast comparison methods to perform an initial 'top-down' split into relatively homogeneous clusters, while the slower measures are used only for the smaller groups. Profiles are then computed for the final groups, and these profiles are used in a 'bottom-up' phase to compute the final clustering. The algorithm is general in the sense that any sequence comparison method can be employed (e.g. for DNA, RNA or amino acids). We test our algorithm using a prototypical implementation for agglomerative RNA clustering and show its effectiveness.
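The two-phase idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the k-mer Jaccard distance (fast measure), the Levenshtein distance (slow measure), the seed-based bisection, the use of a medoid as a stand-in for a profile, and all function names and parameters (`max_size`, `n_clusters`) are assumptions chosen only to make the example self-contained.

```python
from itertools import combinations

def kmer_distance(a, b, k=3):
    """Fast, coarse measure (assumed): Jaccard distance between k-mer sets."""
    ka = {a[i:i + k] for i in range(len(a) - k + 1)}
    kb = {b[i:i + k] for i in range(len(b) - k + 1)}
    union = ka | kb
    return 1.0 - len(ka & kb) / len(union) if union else 0.0

def edit_distance(a, b):
    """Slow, sensitive measure (assumed): Levenshtein distance via DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def split_top_down(seqs, max_size):
    """Top-down phase: bisect with the fast measure until groups are small."""
    if len(seqs) <= max_size:
        return [seqs]
    seed_a = seqs[0]
    seed_b = max(seqs, key=lambda s: kmer_distance(seed_a, s))
    left, right = [], []
    for s in seqs:
        closer_to_a = kmer_distance(s, seed_a) <= kmer_distance(s, seed_b)
        (left if closer_to_a else right).append(s)
    if not right:  # the fast measure cannot separate this group any further
        return [seqs]
    return split_top_down(left, max_size) + split_top_down(right, max_size)

def medoid(group):
    """Stand-in for a profile: the member with the smallest total slow distance."""
    return min(group, key=lambda s: sum(edit_distance(s, t) for t in group))

def merge_bottom_up(groups, n_clusters):
    """Bottom-up phase: merge the two groups whose representatives are closest."""
    clusters = [(medoid(g), list(g)) for g in groups]
    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: edit_distance(clusters[p[0]][0], clusters[p[1]][0]))
        merged = clusters[i][1] + clusters[j][1]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append((medoid(merged), merged))
    return [members for _, members in clusters]

def cluster(seqs, max_size=4, n_clusters=2):
    """Run both phases; n_clusters is assumed to be at least 1."""
    return merge_bottom_up(split_top_down(list(seqs), max_size), n_clusters)
```

In this sketch the slow measure is evaluated only within small groups and between group representatives, rather than for all O(n^2) pairs, which is where a scheme of this kind would save time. A call such as `cluster(sequences, max_size=2, n_clusters=2)` returns a list of clusters, each a list of the input strings.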
  • Full reference
  • BibTeX
Reinert, K., Bauer, M., Döring, A., Klau, G. W. & Halpern, A. L. (2007). A general paradigm for fast, adaptive clustering of biological sequences. In: Falter, C., Schliep, A., Selbig, J., Vingron, M. & Walther, D. (eds.), German conference on bioinformatics – GCB 2007. Bonn: Gesellschaft für Informatik e. V. (pp. 15-29).
@inproceedings{mci/Reinert2007,
  author    = {Reinert, Knut and Bauer, Markus and Döring, Andreas and Klau, Gunnar W. and Halpern, Aaron L.},
  title     = {A general paradigm for fast, adaptive clustering of biological sequences},
  booktitle = {German conference on bioinformatics – GCB 2007},
  year      = {2007},
  editor    = {Falter, Claudia and Schliep, Alexander and Selbig, Joachim and Vingron, Martin and Walther, Dirk},
  pages     = {15--29},
  publisher = {Gesellschaft für Informatik e. V.},
  address   = {Bonn}
}
Files     Size      Format
15.pdf    615.4 KB  PDF


More information

ISBN: 978-3-88579-209-3
ISSN: 1617-5468
Date: 2007
Language: en
Type: Text/Conference Paper
Collections
  • P115 - GCB 2007 - German Conference on Bioinformatics 2007 [20]


Gesellschaft für Informatik e.V. (GI), contact: GI head office
This digital library is based on DSpace.

 

 

