A Fast-Optimal DNA Sequences Similarity Search

Mohd Saman, Md Yazid and Abd Rahman, Mohd Nordin and Ahmad, Aziz and M Tap, A Osman (2006) A Fast-Optimal DNA Sequences Similarity Search. WSEAS Transactions on Computer, 7 (5). pp. 1525-1532. ISSN 1109-2750


A routine task for a biologist is to query a newly discovered DNA sequence against a collection of sequence databases to find a list of similar sequences. The results are used to infer the function of the query sequence. DNA databases are growing exponentially every year. Consequently, algorithms that find optimal, sensitive sequence similarity results can be time-consuming. Dynamic programming algorithms with quadratic running time are frequently used to produce an optimal local sequence alignment. However, such algorithms are cost-prohibitive for long DNA sequences. Using local alignment, this paper presents a framework for searching large-scale DNA databases for a set of similar sequences with optimal output and minimum cost. The Knuth-Morris-Pratt (KMP) algorithm is adapted to act as a filtering mechanism before exhaustive dynamic programming is applied. The KMP algorithm scans the sequences in the databases for patterns generated from the query sequence. This filtering process generates scores that are used for ranking. The Smith-Waterman algorithm is then applied to each sequence, starting from the top of the constructed ranking. The paper also discusses the optimal pattern length for the database scanning process. Experimental results show that this filtering mechanism discards irrelevant sequences so that they are not processed by the Smith-Waterman algorithm. Therefore, the time for searching and retrieving the set of sequences similar to the query from the databases is minimized.
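The filter-then-align idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how patterns are generated or how filter scores are computed, so the overlapping fixed-length patterns, the "number of distinct patterns found" score, the `min_score` cutoff, the Smith-Waterman scoring values, and all function names below are assumptions made for illustration only.

```python
def failure_table(pattern):
    """KMP failure function: fail[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_count(text, pattern):
    """Count occurrences (overlaps included) of pattern in text using KMP."""
    if not pattern:
        return 0
    fail = failure_table(pattern)
    count = 0
    k = 0
    for ch in text:
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            count += 1
            k = fail[k - 1]
    return count


def filter_score(query, subject, k):
    """Assumed filter score: how many distinct length-k patterns taken
    from the query occur at least once in the subject sequence."""
    patterns = {query[i:i + k] for i in range(len(query) - k + 1)}
    return sum(1 for p in patterns if kmp_count(subject, p) > 0)


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score with a linear gap penalty
    (scoring values are illustrative, not taken from the paper)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def search(query, database, k=4, min_score=1):
    """Rank database sequences by filter score, then run Smith-Waterman
    only on those that reach min_score, from the top of the ranking."""
    ranked = sorted(
        ((filter_score(query, seq, k), name) for name, seq in database.items()),
        reverse=True,
    )
    results = []
    for score, name in ranked:
        if score < min_score:
            break  # everything below this point in the ranking is discarded
        results.append((name, score, smith_waterman(query, database[name])))
    return results


database = {"s1": "TTACGTACGGTT", "s2": "GGGGGGGGGGGG"}
results = search("ACGTACGG", database, k=4)
```

In this toy run, `s2` shares no length-4 pattern with the query, so it is dropped before the quadratic-time alignment step and only `s1` is aligned. The pattern length `k` controls the trade-off the abstract alludes to: short patterns match almost everywhere and filter little, while long patterns may miss sequences that are similar but not identical over that length.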

Item Type: Article
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Faculty / Institute: Faculty of Informatics & Computing
Depositing User: Dr Mohd Nordin Abdul Rahman
Date Deposited: 11 Aug 2016 03:24
Last Modified: 11 Aug 2016 03:24
URI: http://erep.unisza.edu.my/id/eprint/1175
