Konferenzbeitrag
Statistical detection of co-operative transcription factors with similarity adjustment
Lade...
Volltext URI
Dokumententyp
Text/Conference Paper
Dateien
Zusatzinformation
Datum
2008
Autor:innen
Zeitschriftentitel
ISSN der Zeitschrift
Bandtitel
Verlag
Gesellschaft für Informatik e. V.
Zusammenfassung
Statistical assessment of cis-regulatory modules (CRMs) is a crucial task in computational biology. Usually, one concludes from exceptional co-occurrences of DNA motifs that the corresponding transcription factors are co-operative. However, similar DNA motifs tend to co-occur in random sequences due to high probability of overlapping occurrences. Therefore, it is important to consider similarity of DNA motifs in the statistical assessment. Based on previous work, we propose to adjust the window size for co-occurrence detection. Using the derived approximation, one obtains different window sizes for different sets of DNA motifs depending on their similarities. This ensures that the probability of co-occurrences in random sequences are equal. Applying the approach to selected similar and dissimilar DNA motifs from human transcription factors shows the necessity of adjustment and confirms the accu- racy of the approximation.
Our previously published statistics can only deal with non-overlapping windows. Therefore, we extend the approach and derive Chen-Stein error bounds for the approxi- mation. Comparing the error bounds for similar and dissimilar DNA motifs shows that the approximation for similar DNA motifs yields large bounds. Hence, one has to be careful using overlapping windows. Based on the error bounds, one can pre-compute the approximation errors and select an appropriate overlap-scheme before running the analysis. Software and source code are available at http://mosta.molgen.mpg.de.