I. Introduction
Protein sequence analysis has become an important area of research because of its applications in drug discovery programs [1], where computational analysis is increasingly popular. Consider the problem of developing a new drug, which often takes up to 15 years and costs up to $700 million per drug under investigation [1]. Computational tools have had the most impact in the discovery phase of drug design. In pharmaceutical drug discovery programs, it is often useful to classify protein sequences into a number of known families. Stated more formally, if a sequence obtained for some disease is known to belong to a particular family, then treatment for that disease is initially determined using a combination of drugs known to apply to that family [2].