[Bioperl-l] blasting two identical seq yields only 88% identity
William Hsiao
wlhsiao at yahoo.ca
Sun Dec 25 11:15:16 EST 2005
Hi Anders,
This is due to BLAST's low-complexity filter
(http://www.ncbi.nlm.nih.gov/blast/blast_FAQs.shtml#LCR),
which masks low-complexity regions of the query as X's.
These X's are counted as mismatches when % identity is
calculated, so two identical sequences score less than
100%. If you turn the filter off (with blastall, pass
-F F), you should see 100% identity.
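
To see why masking alone lowers the reported identity, here is a minimal Python sketch. It is not how BLAST itself computes anything: the sequence, the masked positions, and the helper names percent_identity and mask are all made up for illustration, and the real filter picks the regions it masks by its own complexity measure. It only shows the arithmetic: every X in the masked query is an aligned column that no longer matches the database sequence.

```python
def percent_identity(query: str, subject: str) -> float:
    """Percent of aligned columns in which both residues are the same."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for q, s in zip(query, subject) if q == s)
    return 100.0 * matches / len(query)


def mask(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] with X's (hypothetical stand-in for the filter)."""
    return seq[:start] + "X" * (end - start) + seq[end:]


# Made-up 25-residue protein; the SSS run at positions 10-12 plays the
# role of a low-complexity region (an assumption, not real filter output).
protein = "MKTAYIAKQRSSSQLEDGHVNPWCF"

# The query is masked before alignment; the database copy is left unchanged.
masked_query = mask(protein, 10, 13)

print(percent_identity(protein, protein))       # 100.0
print(percent_identity(masked_query, protein))  # 88.0
```

Masking 3 of the 25 residues leaves 22 matching columns, so the identity drops to 22/25 = 88% even though the two sequences are the same; without masking the comparison gives 100%.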
Cheers,
Will
--- Anders Stegmann <anst at kvl.dk> wrote:
> Merry Christmas, BioPerl!
>
> I obtained an odd result when blasting a protein
> sequence against
> a chromosome I knew encoded the protein, using
> tblastn.
> So I tested the problem by blasting the protein
> against a database containing only the exact same
> protein sequence using blastp (both files were FASTA
> formatted).
> I obtained an identity of only 88% instead of 100%.
> A lot of X's were incorporated into the query
> sequence.
>
> I figured that it had something to do with the
> database formatting, so I tried several possibilities
> with no luck
> (first I tried: formatdb -i SSD1pDB.txt -p T -o F).
>
> I have had this problem before when blasting nucleotides.
> What can I do about it?
>
> Regards, Anders.