|dc.description.abstract||Background: Phlebotomus papatasi is a natural vector of Leishmania major, which causes cutaneous leishmaniasis
in many countries. Simple sequence repeats (SSRs), or microsatellites, are short nucleotide motifs repeated in
tandem and flanked by non-repetitive regions; they are common in eukaryotic genomes. The enrichment methods
previously used to find new microsatellite loci in sand flies are laborious and time-consuming. In silico mining,
in which microsatellite search tools are used to retrieve and screen microsatellites from large volumes of data
in sequence databases, can yield many new candidate markers.
Results: SSRs were characterized in P. papatasi expressed sequence tags (ESTs) retrieved from the public database
of the National Center for Biotechnology Information (NCBI). A total of 42,784 sequences were mined, and 1,499
SSRs were identified, corresponding to a frequency of 3.5% and an average density of one SSR per 15.55 kb.
Dinucleotide motifs were the most common SSRs, accounting for 67%, followed by tri-, tetra-, and pentanucleotide
repeats, which accounted for 31.1%, 1.5%, and 0.1%, respectively. Microsatellite length ranged from 5 to 16
repeats. Among the dinucleotide motifs, AG and CT had the highest frequency. Dinucleotide SSR-ESTs were relatively
biased toward an excess of (AX)n repeats and a low GC content. Forty primer pairs were designed based on motif
lengths for further experimental validation.
Conclusion: This is the first large-scale survey of SSRs derived from P. papatasi; the dinucleotide SSRs identified
were more frequent than other types. EST data mining is an effective strategy for identifying functional microsatellites in