[BioPython] Problem parsing Blast XML output from different sources
Steffi Gebauer-Jung
gebauer-jung at ice.mpg.de
Fri Oct 13 13:23:35 UTC 2006
Hello Michiel,
the fix works fine.
Thanks for the fast reply and fixing!
Maybe there should be a hint for other users
not to use the frame information of the blast xml output
and to test the start/end positions of the hsp sequences instead,
and to be aware of reverse query sequences.
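
A minimal sketch of the coordinate test suggested above. It assumes that a reversed sequence shows up as a start position greater than its end position, which is what this thread suggests but is not verified here for every BLAST version. The helper name hsp_strands is hypothetical and not part of Biopython; it takes plain integers, such as the values of hsp.query_start, hsp.query_end, hsp.sbjct_start and hsp.sbjct_end from the fixed parser.

```python
def hsp_strands(query_start, query_end, sbjct_start, sbjct_end):
    """Return (query_strand, sbjct_strand) as +1 or -1.

    Hypothetical helper, not part of Biopython. It ignores the frame
    fields and looks only at the coordinates, assuming that a reversed
    sequence is reported with start > end.
    """
    query_strand = 1 if query_start <= query_end else -1
    sbjct_strand = 1 if sbjct_start <= sbjct_end else -1
    return query_strand, sbjct_strand


# Example with made-up coordinates for a 'Plus / Minus' HSP.
strands = hsp_strands(1, 100, 250, 151)
```

An HSP whose query strand comes back as -1 is one where the query is reversed and the alignment would need to be reverse-complemented.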
For my purposes I need the query sequence in the forward direction.
That's why I reverse-complement the complete alignment
whenever this is not already the case.
While doing so I found that Bio.Seq.Seq.complement() cannot handle unicode
sequences,
even though Bio.Seq.Seq can be initialized with unicode strings:
>>> import Bio.Seq
>>> s = Bio.Seq.Seq(u'acgt')
>>> s
Seq(u'acgt', Alphabet())
>>> s.complement()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.5/site-packages/Bio/Seq.py", line 101, in complement
    s = self.data.translate(ttable)
TypeError: character mapping must return integer, None or unicode
And just another idea:
To (reverse-)complement aligned sequences it would be useful to have
the gap character '-' in the alphabets.
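
As a workaround sketch for both points above (unicode input and gaps), the alignment strings can be reverse-complemented outside of Bio.Seq. The helper below is hypothetical and not part of Biopython. It builds the translation table as a dict of code points, the form that translate() expects for unicode strings, and it assumes plain DNA letters plus 'n'; any other character, including the gap '-', passes through unchanged.

```python
# Hypothetical helper, not part of Biopython.
# Assumption: only a, c, g, t and n (either case) need complementing.
_PAIRS = {"a": "t", "c": "g", "g": "c", "t": "a", "n": "n"}

# translate() on unicode strings takes a dict mapping code points to
# code points; characters missing from the dict are left as they are.
_TABLE = {}
for base, comp in _PAIRS.items():
    _TABLE[ord(base)] = ord(comp)
    _TABLE[ord(base.upper())] = ord(comp.upper())


def reverse_complement_gapped(aligned):
    """Reverse-complement one aligned sequence, keeping '-' gaps."""
    return aligned.translate(_TABLE)[::-1]


# Example: the gap stays in the mirrored position.
result = reverse_complement_gapped(u"aac-g")
```

To flip a whole HSP, the same function would be applied to the query and subject strings, and the match line would simply be reversed.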
Steffi
Michiel Jan Laurens de Hoon wrote:
> Hi Steffi,
>
> I had the same result when running Blast locally.
>
> I added hsp.query_end and hsp.sbjct_end to the Blast XML parser, so
> you can get around this problem. Could you try the fixed Blast parser?
> You'll need to pick up Bio/Blast/NCBIXML.py and Bio/Blast/Record.py from
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/Blast/?cvsroot=biopython
>
>
> If it works fine (or if it doesn't), please send a message to the
> Biopython mailing list (instead of my email address), so that this
> gets into the mailing list archives.
>
> --Michiel.
>
>
> Steffi Gebauer-Jung wrote:
>
>> Hello,
>>
>> the db was downloaded from ftp://ftp.ncbi.nih.gov//blast/db/patnt.tar.gz
>>
>> In fact the special query sequence and db shouldn't matter.
>>
>> If you have any 'Plus / Minus' HSP in a pairwise BlastN output
>> you can run BlastN again in order to get the xml formatted output.
>>
>> Comparing the special HSP in both formats you should see the effect.
>
>
>