BLiC
This site supports a paper published in PLoS Computational Biology on 29 Feb 2008.
Introduction
Identifying DNA binding sites of transcription factors:
Applying motif discovery algorithms to a group of related DNA sequences leads to the identification of putative transcription factor DNA binding sites. These algorithms output a set of DNA motifs, which are frequently redundant. To infer the correct transcription regulation map from the discovered motif set, it is crucial to reduce this redundancy and to relate the newly discovered motifs to known ones.
Reducing redundancy by clustering and merging motifs:
A redundant set of DNA motifs can be reduced by clustering the motifs into groups of related ones and then merging the motifs within each cluster into a single representative motif. In this example, a redundant set of 16 DNA motifs (a partial output of several motif search algorithms) is clustered and merged into a final set of three DNA motifs.
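The cluster-and-merge idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of the paper: motifs are represented as count matrices of equal width that are already aligned, a Jensen-Shannon-based column similarity stands in for the actual similarity score, and the function names and the 0.8 threshold are hypothetical.

```python
import math
from itertools import combinations

def normalize(column, pseudo=0.5):
    """Turn a column of A/C/G/T counts into probabilities (with pseudocounts)."""
    total = sum(column) + 4 * pseudo
    return [(count + pseudo) / total for count in column]

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions, in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def similarity(m1, m2):
    """Mean per-column similarity; assumes equal-width, pre-aligned motifs."""
    scores = [1 - js_divergence(normalize(a), normalize(b)) for a, b in zip(m1, m2)]
    return sum(scores) / len(scores)

def merge(m1, m2):
    """Merge two motifs by summing their counts column by column."""
    return [[a + b for a, b in zip(c1, c2)] for c1, c2 in zip(m1, m2)]

def cluster_and_merge(motifs, threshold=0.8):
    """Greedily merge the most similar pair until no pair reaches the threshold."""
    motifs = [[list(col) for col in m] for m in motifs]
    while len(motifs) > 1:
        pairs = combinations(range(len(motifs)), 2)
        score, i, j = max((similarity(motifs[i], motifs[j]), i, j) for i, j in pairs)
        if score < threshold:  # hypothetical cutoff, chosen for illustration
            break
        merged = merge(motifs[i], motifs[j])
        motifs = [m for k, m in enumerate(motifs) if k not in (i, j)] + [merged]
    return motifs

# Toy input: two near-identical motifs and one unrelated motif (counts for A, C, G, T).
toy_motifs = [
    [[10, 0, 0, 0], [0, 10, 0, 0], [0, 0, 10, 0]],
    [[9, 1, 0, 0], [0, 9, 1, 0], [0, 0, 10, 0]],
    [[0, 0, 0, 10], [0, 0, 10, 0], [10, 0, 0, 0]],
]
reduced = cluster_and_merge(toy_motifs)
print(len(reduced), "motifs remain after merging")
```

On the toy input the first two motifs are merged and the third is left alone. Real motif sets differ in width and orientation, so a practical version would also search over alignment offsets and reverse complements.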
Identifying the binding factors of DNA motifs:
The transcription factors that bind the newly discovered DNA motifs can be revealed based on similarities to previously defined motifs. In this example, comparison of a newly discovered motif to four known motifs reveals high similarity to the known Gcn4 binding motif. From this comparison the transcription factor that binds the motif is identified with high probability.
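Matching a new motif against known ones amounts to scoring it against every entry in a library and reporting the best hit. The sketch below illustrates this with toy data only. It assumes equal-width, pre-aligned count matrices and uses a Jensen-Shannon-based similarity as a stand-in for the actual score; the helper names are hypothetical, the "Gcn4" entry is a toy matrix built from the commonly cited consensus TGACTCA, and the other three entries are arbitrary placeholders rather than real factors.

```python
import math

BASES = "ACGT"

def from_consensus(seq, weight=10):
    """Build a toy count matrix that puts all weight on the consensus base."""
    return [[weight if base == b else 0 for b in BASES] for base in seq]

def column_similarity(c1, c2, pseudo=0.5):
    """1 minus the Jensen-Shannon divergence (base 2) of two count columns."""
    p = [(x + pseudo) / (sum(c1) + 4 * pseudo) for x in c1]
    q = [(x + pseudo) / (sum(c2) + 4 * pseudo) for x in c2]
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y))
    return 1 - 0.5 * (kl(p, m) + kl(q, m))

def motif_similarity(m1, m2):
    """Mean column similarity; assumes equal-width, pre-aligned motifs."""
    return sum(column_similarity(a, b) for a, b in zip(m1, m2)) / len(m1)

def best_match(query, known):
    """Score the query against every known motif and return the top hit."""
    scores = {name: motif_similarity(query, motif) for name, motif in known.items()}
    return max(scores, key=scores.get), scores

# Toy library: one Gcn4-like entry plus three arbitrary placeholder motifs.
known_motifs = {
    "Gcn4": from_consensus("TGACTCA"),
    "placeholder_1": from_consensus("CACGTGA"),
    "placeholder_2": from_consensus("GATAAGC"),
    "placeholder_3": from_consensus("TTACGTA"),
}

# Hypothetical newly discovered motif (counts for A, C, G, T per position).
query_motif = [
    [1, 0, 0, 9], [0, 0, 10, 0], [10, 0, 0, 0], [0, 7, 3, 0],
    [0, 0, 0, 10], [0, 10, 0, 0], [9, 1, 0, 0],
]

name, scores = best_match(query_motif, known_motifs)
print("best match:", name)
for known_name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"  {known_name}: {score:.3f}")
```

A raw best score says nothing about confidence on its own. Turning it into a statement such as "with high probability" would need a statistical model or an empirical background distribution of scores, which this sketch does not attempt.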