Superhelically destabilized sites in the E. coli genome: Implications for promoter prediction in prokaryotes
German Conference on Bioinformatics 2004, GCB 2004
Regular Research Papers
Gesellschaft für Informatik e.V.
Stress-induced DNA duplex destabilization (SIDD) analysis exploits the known structural and energetic properties of DNA to predict which sites become susceptible to strand separation under superhelical stress. Experimental results show that this analysis is quantitatively accurate in predicting destabilized sites that occur in transcriptional regulatory regions, matrix/scaffold attachment sites, and replication origins. Here we report the results of a SIDD analysis of the complete E. coli genome, performed using a new algorithm designed for long genomic DNA sequences. Our results demonstrate that less than 7% of the E. coli genome has the propensity to become highly destabilized at physiological superhelical densities. Those SIDD sites with high destabilization potential show a statistically significant association with divergent and tandem intergenic regions, but not with convergent intergenic regions or coding regions. More than 80% of the intergenic regions containing experimentally characterized promoters are found to overlap these SIDD sites. Strong SIDD sites are highly enriched in the 5' upstream regions of genes regulating stress responses in E. coli, suggesting a possible functional role in their regulation. We discuss the possibility of using SIDD properties for promoter prediction, and potential roles of predicted SIDD sites in transcriptional regulation.
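The distinction between divergent, tandem, and convergent intergenic regions depends only on the orientation of the two flanking genes, and the overlap test against predicted sites is a simple interval comparison. The following Python fragment is a minimal sketch of that bookkeeping, not the authors' algorithm: the `Gene` record, the function names, and the coordinate convention (1-based, inclusive) are all assumptions made for illustration, and the SIDD sites are taken as given intervals rather than computed.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    # Hypothetical record; coordinates are assumed 1-based and inclusive.
    start: int
    end: int
    strand: str  # '+' or '-'


def classify_intergenic(left: Gene, right: Gene) -> str:
    """Classify the region between two adjacent genes by their orientation."""
    if left.strand == '-' and right.strand == '+':
        return "divergent"   # both genes are transcribed away from the region
    if left.strand == '+' and right.strand == '-':
        return "convergent"  # both genes are transcribed toward the region
    return "tandem"          # both genes lie on the same strand


def intergenic_regions(genes: List[Gene]) -> List[Tuple[int, int, str]]:
    """Return (start, end, kind) for each gap between consecutive genes.

    Simplification: genes are paired in order of start coordinate, and
    pairs that touch or overlap produce no region.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    regions = []
    for left, right in zip(ordered, ordered[1:]):
        if right.start > left.end + 1:
            kind = classify_intergenic(left, right)
            regions.append((left.end + 1, right.start - 1, kind))
    return regions


def overlaps_any(start: int, end: int, sites: List[Tuple[int, int]]) -> bool:
    """True if the interval [start, end] overlaps any (site_start, site_end)."""
    return any(s <= end and e >= start for s, e in sites)
```

Given a gene list and a list of predicted site intervals, calling `intergenic_regions` and then `overlaps_any` on each region yields the per-class overlap counts that an association test of the kind described above would start from.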