Fatma A. Hashim, Mai S. Mabrouk* and Walid A.L. Atabany. Pages 4-26 (23)
Background: Bioinformatics is an interdisciplinary field that combines biology and information technology to study and manage biological data. DNA motif discovery is a major challenge in genome biology, and its importance grows as sequencing technologies produce ever larger amounts of data. A DNA motif is a recurring portion of DNA sequences that is of major biological interest and has important structural and functional features. Motif discovery plays a vital role in antibody-biomarker identification, which is useful for disease diagnosis, and in identifying Transcription Factor Binding Sites (TFBSs), which help in learning the mechanisms that regulate gene expression. Recently, scientists discovered that TFBSs have a mutation rate five times higher than their flanking sequences, so motif discovery also has a crucial role in cancer discovery.
Methods: Over the past decades, many attempts have used different algorithms to design fast and accurate motif discovery tools. These algorithms are generally classified into consensus and probabilistic approaches.
Results: Many DNA motif discovery algorithms are time-consuming and easily become trapped in local optima.
Conclusion: Nature-inspired algorithms and many combinatorial algorithms have recently been proposed to overcome the problems of the consensus and probabilistic approaches. This paper presents a general classification of motif discovery algorithms with new sub-categories, along with a summary comparison between them.
Keywords: Bioinformatics, motif, enumerative approach, probabilistic approach, nature-inspired, metaheuristic.
Department of Biomedical Engineering, Helwan University, Helwan; Department of Biomedical Engineering, Misr University for Science and Technology (MUST), Cairo; Department of Biomedical Engineering, Helwan University, Helwan