1. INTRODUCTION
Querying or aligning biological sequences, such as deoxyribonucleic acid (DNA) and proteins, refers to arranging the sequences in order to identify regions of significant similarity. It involves finding an exact or close match to a query sequence among the sequences in a database. Such a match may be attributed to a functional relationship between the query and a database sequence, as a result of possible mutations or locally conserved regions in distant sequences. The match can occur over the entire length of the query sequence (globalized match) [1] or over smaller segments or sub-sequences of the query (localized match) [2]. In general, database systems use algebraic operators and indexing techniques based on dynamic-programming approaches [3]–[6] to process sophisticated queries. However, these techniques have some drawbacks: (a) the query must be indexed and pre-processed before the search; (b) the methods are insensitive to alignments over repetitive or periodic segments; and (c) not all methods can handle long queries. Querying methodologies based on signal processing techniques rely on cross-correlation between numerically mapped sequences [7]–[10]. Although these algorithms are robust when querying long sequences, most of them are insensitive to localized querying and cannot always handle alignments with repetitive segments or low-complexity regions [11]. Nor have they extensively addressed short gaps in the alignment that arise when properties are lost or gained during evolution, corresponding to insertions or deletions of nucleotides or amino acids in the sequence.
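To make the idea of cross-correlating numerically mapped sequences concrete, the following is a minimal sketch rather than the method of any of the cited works. It assumes a binary indicator mapping (one 0/1 sequence per nucleotide) and plain ungapped cross-correlation computed with NumPy; the function names `indicator_map`, `match_counts`, and `best_offset` are hypothetical and chosen only for illustration.

```python
import numpy as np

BASES = "ACGT"  # assumed alphabet; other symbols map to all-zero columns


def indicator_map(seq):
    """Map a DNA string to four binary indicator sequences, one row per base."""
    symbols = np.array(list(seq.upper()))
    return np.array([(symbols == base).astype(float) for base in BASES])


def match_counts(database, query):
    """Count matching bases for every ungapped placement of query in database.

    Assumes len(query) <= len(database). Entry k of the result is the number
    of positions at which query agrees with database[k : k + len(query)].
    """
    d = indicator_map(database)
    q = indicator_map(query)
    # Summing the per-base cross-correlations counts symbol agreements.
    total = sum(np.correlate(d[i], q[i], mode="valid") for i in range(len(BASES)))
    return np.rint(total).astype(int)


def best_offset(database, query):
    """Return (offset, matches) for the placement with the most matching bases."""
    counts = match_counts(database, query)
    k = int(np.argmax(counts))
    return k, int(counts[k])


if __name__ == "__main__":
    print(best_offset("TTGACGTACGGA", "ACGTAC"))
```

Because each placement is scored without gaps, a single inserted or deleted base shifts the remainder of the query out of register and lowers the score, which illustrates why insertions and deletions need separate treatment in correlation-based querying.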