[Bioperl-l] blasting two identical seq yields only 88% identity

Joseph Bedell jbedell at oriongenomics.com
Sun Dec 25 11:12:42 EST 2005


Hi Anders,

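What you are seeing is probably low complexity filtering. By default,
NCBI-BLAST filters the query for low complexity sequence and masks those
regions (the X's that you see in the alignments). Masked residues cannot
be counted as identities, which is why an identical sequence scores less
than 100%. To turn the filter off, you need to specify -F F on the
command line.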
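
As a rough sketch of what that looks like with the legacy blastall
program: the query file name query.fasta below is only a placeholder
(the original message does not name it), and SSD1pDB.txt is the database
from the formatdb command quoted below.

```shell
# Placeholder file names: query.fasta is a hypothetical name for the protein
# query; SSD1pDB.txt is the database formatted with formatdb -p T.
query="query.fasta"
db="SSD1pDB.txt"

# -F F switches off low-complexity filtering of the query
# (SEG for protein queries, DUST for blastn); the default is -F T.
cmd="blastall -p blastp -d $db -i $query -F F"

# Show the command, and run it only if blastall and the query file are present.
echo "$cmd"
if command -v blastall >/dev/null 2>&1 && [ -f "$query" ]; then
    # Unquoted on purpose so the string splits into arguments;
    # this assumes the file names contain no spaces.
    $cmd
fi
```

With filtering off, a sequence searched against itself should come back
without X's in the query line of the alignment.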

Joey


>-----Original Message-----
>From: bioperl-l-bounces at portal.open-bio.org [mailto:bioperl-l-
>bounces at portal.open-bio.org] On Behalf Of Anders Stegmann
>Sent: Sunday, December 25, 2005 2:58 AM
>To: bioperl-l at bioperl.org
>Subject: [Bioperl-l] blasting two identical seq yields only 88% identity
>
>Merry Christmas BioPerl!
>
>I obtained an odd result using tblastn to blast a protein sequence
>against a chromosome I knew encoded the protein.
>So I tested the problem by blasting the protein against a database
>containing only the exact same protein sequence using blastp (both files
>were FASTA formatted).
>I obtained an identity of only 88% instead of 100%. A lot of X's were
>incorporated in the query sequence.
>
>I figured that it had something to do with the database formatting, so I
>tried several possibilities with no luck.
>(First I tried: formatdb -i SSD1pDB.txt -p T -o F).
>
>I have had this problem before when blasting nucleotides.
>What can I do about it?
>
>Regards Anders.