[Bioperl-l] Difference between
Alan Li
immunoguest at hotmail.com
Sat Jan 31 18:25:45 EST 2004
I would like to thank everyone for their responses.
And yes, Mat is right: this is an issue with the XML output of
stand-alone blast. I compared the text (-m 0) and XML (-m 7) output of
stand-alone blast under both settings of the -F flag. The results below
show that with "-F F" the two formats agree, but with "-F T" they differ:
the XML output reports an unmasked query with 324/324 identities, while the
text output masks nine bases and reports 315/324.
So is there anything I could do to make the XML results match the text
results when the filtering option is set to true? Perhaps either through
another blast parameter or by doing it programmatically?
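On the programmatic side, one option might be to post-process each XML HSP: overwrite the filtered query positions with N, then recompute the midline and the identity count. What follows is only a minimal Python sketch under stated assumptions, not what blast does internally. The name mask_hsp and its parameters are hypothetical, the masked ranges are an input that has to come from somewhere else (for example the runs of n in the -m 0 report), and only plus-strand nucleotide HSPs are handled.

```python
def mask_hsp(qseq, hseq, query_from, masked_ranges, mask_char="N"):
    """Return (masked_qseq, midline, identities) for one plus-strand HSP.

    masked_ranges holds 1-based, inclusive query coordinates, as printed
    in the BLAST report. All names here are illustrative assumptions.
    """
    masked_q = []
    midline = []
    identities = 0
    pos = query_from  # query coordinate of the current alignment column
    for q, h in zip(qseq, hseq):
        if q == "-":
            # A gap in the query consumes no query coordinate.
            masked_q.append(q)
            midline.append(" ")
            continue
        is_masked = any(start <= pos <= end for start, end in masked_ranges)
        masked_q.append(mask_char if is_masked else q)
        if not is_masked and q.upper() == h.upper():
            midline.append("|")
            identities += 1
        else:
            # Masked columns and mismatches both get a blank in the midline.
            midline.append(" ")
        pos += 1
    return "".join(masked_q), "".join(midline), identities


# Toy example: mask query positions 105-108 of a 12-base perfect match.
masked, mid, ident = mask_hsp("ACGTTTTTACGT", "ACGTTTTTACGT",
                              query_from=101, masked_ranges=[(105, 108)])
print(masked)
print(mid)
print(ident)
```

For the first hit in this message that would mean query_from=237 and masked_ranges=[(302, 310)], which should bring the 324 identities of the XML report down to the 315 that the text report shows.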
-----------------------------------------------------------------------
blastall -p blastn -m 7 -F T -d ecoli/ecoli.nt -i test.txt
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gi|1786181|gb|AE000111.1|AE000111</Hit_id>
<Hit_def>Escherichia coli K-12 MG1655 section 1 of 400 of the
complete genome</Hit_def>
<Hit_accession>AE000111</Hit_accession>
<Hit_len>10596</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>589.253</Hsp_bit-score>
<Hsp_score>297</Hsp_score>
<Hsp_evalue>1.04898e-168</Hsp_evalue>
<Hsp_query-from>237</Hsp_query-from>
<Hsp_query-to>560</Hsp_query-to>
<Hsp_hit-from>237</Hsp_hit-from>
<Hsp_hit-to>560</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>1</Hsp_hit-frame>
<Hsp_identity>324</Hsp_identity>
<Hsp_positive>324</Hsp_positive>
<Hsp_align-len>324</Hsp_align-len>
<Hsp_qseq>AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT</Hsp_qseq>
<Hsp_hseq>AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT</Hsp_hseq>
<Hsp_midline>||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||</Hsp_midline>
</Hsp>
-----------------------------------------------------------------------
blastall -p blastn -m 0 -F T -d ecoli/ecoli.nt -i test.txt
>gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section 1 of 400 of the
complete genome
Length = 10596
Score = 589 bits (297), Expect = e-168
Identities = 315/324 (97%)
Strand = Plus / Plus
Query: 237 aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 237 aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296
Query: 297 cgggcnnnnnnnnncgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356
|||||         ||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 297 cgggctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356
Query: 357 cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 357 cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416
Query: 417 tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 417 tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476
Query: 477 ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 477 ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536
Query: 537 cgaacgtatttttgccgaactttt 560
||||||||||||||||||||||||
Sbjct: 537 cgaacgtatttttgccgaactttt 560
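The masked query coordinates can be read off a text report like the one above by scanning the Query lines for runs of n. The following Python sketch is an illustration only: masked_ranges_from_report is a hypothetical helper, it assumes blastn-style "Query: start sequence end" lines on the plus strand, and it cannot tell a masked base apart from a genuine n in the input sequence.

```python
import re

# Matches lines such as "Query: 297 cgggcnnnn... 356".
QUERY_LINE = re.compile(r"^Query:\s*(\d+)\s+(\S+)\s+(\d+)\s*$")


def masked_ranges_from_report(lines, mask_char="n"):
    """Collect 1-based, inclusive query ranges covered by runs of mask_char.

    A run that continues across two Query lines is returned as two
    adjacent ranges.
    """
    ranges = []
    for line in lines:
        m = QUERY_LINE.match(line.strip())
        if not m:
            continue  # midline, Sbjct line, header, etc.
        pos = int(m.group(1))  # query coordinate of the first base
        run_start = None
        for ch in m.group(2):
            if ch == "-":
                continue  # alignment gap, consumes no query coordinate
            if ch.lower() == mask_char:
                if run_start is None:
                    run_start = pos
            elif run_start is not None:
                ranges.append((run_start, pos - 1))
                run_start = None
            pos += 1
        if run_start is not None:
            # The run reached the end of this Query line.
            ranges.append((run_start, pos - 1))
    return ranges


# Shortened stand-in for a Query line containing nine masked bases.
example = ["Query: 297 cgggc" + "n" * 9 + "cgacc 315"]
print(masked_ranges_from_report(example))
```

Applied to the Query lines of the "-F T" text report above, this should give the single range 302 to 310, i.e. the nine masked bases.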
-----------------------------------------------------------------------
blastall -p blastn -m 7 -F F -d ecoli/ecoli.nt -i test.txt
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gi|1786181|gb|AE000111.1|AE000111</Hit_id>
<Hit_def>Escherichia coli K-12 MG1655 section 1 of 400 of the
complete genome</Hit_def>
<Hit_accession>AE000111</Hit_accession>
<Hit_len>10596</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>1110.61</Hsp_bit-score>
<Hsp_score>560</Hsp_score>
<Hsp_evalue>0</Hsp_evalue>
<Hsp_query-from>1</Hsp_query-from>
<Hsp_query-to>560</Hsp_query-to>
<Hsp_hit-from>1</Hsp_hit-from>
<Hsp_hit-to>560</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>1</Hsp_hit-frame>
<Hsp_identity>560</Hsp_identity>
<Hsp_positive>560</Hsp_positive>
<Hsp_align-len>560</Hsp_align-len>
<Hsp_qseq>AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT</Hsp_qseq>
<Hsp_hseq>AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT</Hsp_hseq>
<Hsp_midline>||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||</Hsp_midline>
</Hsp>
-----------------------------------------------------------------------
blastall -p blastn -m 0 -F F -d ecoli/ecoli.nt -i test.txt
>gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section 1 of 400 of the
complete genome
Length = 10596
Score = 1110 bits (560), Expect = 0.0
Identities = 560/560 (100%)
Strand = Plus / Plus
Query: 1 agcttttcattctgactgcaacgggcaatatgtctctgtgtggattaaaaaaagagtgtc 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1 agcttttcattctgactgcaacgggcaatatgtctctgtgtggattaaaaaaagagtgtc 60
Query: 61 tgatagcagcttctgaactggttacctgccgtgagtaaattaaaattttattgacttagg 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61 tgatagcagcttctgaactggttacctgccgtgagtaaattaaaattttattgacttagg 120
Query: 121 tcactaaatactttaaccaatataggcatagcgcacagacagataaaaattacagagtac 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 121 tcactaaatactttaaccaatataggcatagcgcacagacagataaaaattacagagtac 180
Query: 181 acaacatccatgaaacgcattagcaccaccattaccaccaccatcaccattaccacaggt 240
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 181 acaacatccatgaaacgcattagcaccaccattaccaccaccatcaccattaccacaggt 240
Query: 241 aacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgcggg 300
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 241 aacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgcggg 300
Query: 301 ctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcggt 360
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 301 ctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcggt 360
Query: 361 acatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgcc 420
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 361 acatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgcc 420
Query: 421 aggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacctggtg 480
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 421 aggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacctggtg 480
Query: 481 gcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgccgaa 540
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 481 gcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgccgaa 540
Query: 541 cgtatttttgccgaactttt 560
||||||||||||||||||||
Sbjct: 541 cgtatttttgccgaactttt 560
>From: "Wiepert, Mathieu" <Wiepert.Mathieu at mayo.edu>
>To: 'tai kwan do' <immunoguest at hotmail.com>, bioperl-l at bioperl.org
>Subject: RE: [Bioperl-l] Difference between
>Date: Fri, 30 Jan 2004 11:13:05 -0600
>
>Hi,
>
>I have a vague recollection of this problem, so this answer is likely
>wrong, but I think it has something to do with the filtered sequence? You
>have 9 masked NT's, so it is probably a difference in the defaults, and
>something to do with the XML output not masked?
>
>Sorry I can't find the emails I had with NCBI on this, but I am maybe 70%
>sure that it is a problem like that, with defaults on the local server
>versus NCBI, and the XML not using masked data?
>
>Someone else chime in if I am way off there...
>
>HTH,
>
>-mat
>