Background: Microsatellites have been used extensively in the field of comparative genomics. By studying microsatellites in coding regions we have a simple model of how genotypic changes undergo selection as they are directly expressed in the phenotype as altered proteins. The simplest of these tandem repeats in coding regions are the tri-nucleotide repeats which produce a repeat of a single amino acid when translated into proteins. Tri-nucleotide repeats are often disease associated, and are also known to be unstable to both expansion and contraction. This makes them sensitive markers for studying proteome evolution, in closely related species. Results: The evolutionary history of the family of malarial causing parasites Plasmodia is complex because of the life-cycle of the organism, where it interacts with a number of different hosts and goes through a series of tissue specific stages. This study shows that the divergence between the primate and rodent malarial parasites has resulted in a lineage specific change in the simple amino acid repeat distribution that is correlated to A–T content. The paper also shows that this altered use of amino acids in SAARs is consistent with the repeat distributions being under selective pressure. Conclusions: The study shows that simple amino acid repeat distributions can be used to group related species and to examine their phylogenetic relationships. This study also shows that an outgroup species with a similar A–T content can be distinguished based only on the amino acid usage in repeats, and suggest that this might be a useful feature for proteome clustering. The lineage specific use of amino acids in repeat regions suggests that comparative studies of SAAR distributions between proteomes gives an insight into the mechanisms of expansion and the selective pressures acting on the organism. |