Conference paper
MPI-ClustDB: A fast string matching strategy utilizing parallel computing
Document type
Text/Conference Paper
Date
2006
Publisher
Gesellschaft für Informatik e.V.
Abstract
ClustDB is a tool for identifying perfect matches in large sets of sequences. It is faster than VMATCH, the most memory-efficient exact-matching program currently available, and can handle at least 8 times more data. Still, ClustDB needs about four hours to compare all human ESTs. We therefore present a distributed and parallel implementation of ClustDB that reduces the execution time. It uses the Message Passing Interface (MPI) and runs on distributed workstation clusters with significant runtime savings. MPI-ClustDB is written in ANSI C and is freely available on request from the authors.
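
The abstract does not describe how MPI-ClustDB divides its work, so the following is only a minimal sketch of one way an all-against-all search for perfect matches could be split across workers. It is not the authors' algorithm. The function names (longest_exact_match, count_matching_pairs, count_all_pairs), the brute-force comparison, and the round-robin assignment of sequence pairs are all assumptions made for illustration. The workers are simulated in a plain loop so the code compiles without an MPI installation; in an actual MPI program the rank would typically come from MPI_Comm_rank and the per-worker counts would be combined with a call such as MPI_Reduce.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the longest exact (perfect) match shared by a and b.
   Brute force and illustrative only; not the ClustDB algorithm. */
size_t longest_exact_match(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b), best = 0;
    size_t i, j, len;

    for (i = 0; i < la; i++) {
        for (j = 0; j < lb; j++) {
            /* Extend the match starting at a[i] and b[j] as far as it goes. */
            len = 0;
            while (i + len < la && j + len < lb && a[i + len] == b[j + len])
                len++;
            if (len > best)
                best = len;
        }
    }
    return best;
}

/* Work of a single worker (hypothetical scheme): among the sequence pairs
   (i, j) with i < j that are dealt round-robin to `rank`, count those that
   share an exact match of at least min_len characters. */
int count_matching_pairs(const char **seqs, int nseqs, size_t min_len,
                         int rank, int nworkers)
{
    int i, j, found = 0;
    long pair = 0;

    assert(nworkers > 0 && rank >= 0 && rank < nworkers);
    for (i = 0; i < nseqs; i++) {
        for (j = i + 1; j < nseqs; j++, pair++) {
            if (pair % nworkers != rank)
                continue;               /* pair belongs to another worker */
            if (longest_exact_match(seqs[i], seqs[j]) >= min_len)
                found++;
        }
    }
    return found;
}

/* Sequential stand-in for the reduction step: sum the per-worker counts. */
int count_all_pairs(const char **seqs, int nseqs, size_t min_len, int nworkers)
{
    int rank, total = 0;

    for (rank = 0; rank < nworkers; rank++)
        total += count_matching_pairs(seqs, nseqs, min_len, rank, nworkers);
    return total;
}
```

Because every pair is assigned to exactly one worker, count_all_pairs returns the same total for any number of workers; only the share of work per worker changes. A production tool would replace the quadratic pairwise scan with an index-based method, which this sketch does not attempt.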