[BioPython] BLAST parser error with local web blast of multiple queries
Jeffrey Chang
jchang at jeffchang.com
Tue Jun 17 14:16:18 EDT 2003
Hi Mike,
Thanks for the BLAST report. Yes, there are indeed changes in the WWW
format. I've updated the NCBIWWW parser to deal with them.
Unfortunately, there is no iterator for NCBIWWW output, and it's not
trivial to create one.
In general, though, the NCBIStandalone parser (which parses plain text
output) is more heavily used and better tested. I'd highly recommend
using plain text format (choose "Plain text" in the web form). We will
slowly deprecate support for HTML-ized BLAST reports in favor of
this format.
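
A rough sketch of why plain text output is easy to iterate over: assuming
each report in a multi-query file starts with a line naming the program
(for example "BLASTN 2.2.6 ..."), the file can be cut into per-query chunks
in a few lines. The name iter_reports and the list of program names are
illustrative assumptions, not part of Biopython, and the code is written for
current Python rather than the 2.2 release shown in the traceback.

```python
import io

# Assumption: every report in a concatenated plain-text file begins with a
# line that names the program, e.g. "BLASTN 2.2.6 [Apr-09-2003]".
BLAST_PROGRAMS = ("BLASTN", "BLASTP", "BLASTX", "TBLASTN", "TBLASTX")


def iter_reports(handle):
    """Yield the text of each BLAST report found in an open text handle."""
    current = None  # lines of the report being collected, None before the first header
    for line in handle:
        if line.startswith(BLAST_PROGRAMS):
            # A new header closes the previous report, if there was one.
            if current is not None:
                yield "".join(current)
            current = [line]
        elif current is not None:
            current.append(line)
        # Lines before the first header are ignored.
    if current is not None:
        yield "".join(current)


if __name__ == "__main__":
    sample = io.StringIO(
        "BLASTN 2.2.6 [Apr-09-2003]\n"
        "Query= seq1\n"
        "BLASTN 2.2.6 [Apr-09-2003]\n"
        "Query= seq2\n"
    )
    for number, report in enumerate(iter_reports(sample), start=1):
        # Show which query each chunk belongs to.
        print(number, report.splitlines()[1])
```

Each chunk could then be handed to whichever parser matches the format. The
same trick is harder with HTML output, where the report boundaries are
buried in markup.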
Along the same lines, is anyone using the XML format? There is no
support for it in biopython, but perhaps there should be.
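
For anyone curious what reading the XML format might look like, here is a
minimal sketch using only the standard library's xml.etree.ElementTree (a
module in current Python, not something Biopython provides). The element
names Hit, Hit_def, Hit_len, Hsp and Hsp_evalue come from NCBI's BLAST XML
output. The function name iter_hsps and the tiny sample document are made up
for illustration, and real reports contain many more fields.

```python
import io
import xml.etree.ElementTree as ET

# A heavily trimmed, hand-written stand-in for a BLAST XML report.
SAMPLE = b"""<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>example subject sequence</Hit_def>
          <Hit_len>1200</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>1e-50</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def iter_hsps(handle):
    """Yield (hit title, hit length, E-value) for every HSP in the report."""
    tree = ET.parse(handle)
    for hit in tree.iter("Hit"):
        title = hit.findtext("Hit_def")
        length = int(hit.findtext("Hit_len"))
        for hsp in hit.iter("Hsp"):
            yield title, length, float(hsp.findtext("Hsp_evalue"))


if __name__ == "__main__":
    for title, length, evalue in iter_hsps(io.BytesIO(SAMPLE)):
        # Convert the numbers to strings before joining with tabs.
        print("\t".join([title, str(length), str(evalue)]))
```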
Jeff
On Tuesday, June 17, 2003, at 07:51 AM, Mike Cariaso wrote:
> Blast output that seems to choke the parser is attached.
>
> Error message is:
> Traceback (most recent call last):
>   File "./blastscores.py", line 13, in ?
>     b_record = b_iterator.next()
>   File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIStandalone.py", line 367, in next
>     return self._parser.parse(File.StringHandle(data))
>   File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 47, in parse
>     self._scanner.feed(handle, self._consumer)
>   File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 98, in feed
>     self._scan_header(uhandle, consumer)
>   File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 148, in _scan_header
>     read_and_call_until(uhandle, consumer.reference, start='<p>')
>   File "/usr/lib/python2.2/site-packages/Bio/ParserSupport.py", line 366, in read_and_call_until
>     line = safe_readline(uhandle)
>   File "/usr/lib/python2.2/site-packages/Bio/ParserSupport.py", line 442, in safe_readline
>     raise SyntaxError, "Unexpected end of stream."
> SyntaxError: Unexpected end of stream.
>
>
> The attached HTMLized blast output was produced by NCBI's wwwblast
> available from ftp://ftp.ncbi.nih.gov/blast/server/current_release
>
>
> My problem seems to fit in the gaps between several of the tutorial
> examples, so this may be a problem with my code, or the blast parser.
>
>
> Here is a trimmed-down example of my code:
> #!/usr/bin/env python
>
> import sys
> from Bio.Blast import NCBIWWW
> from Bio.Blast import NCBIStandalone
>
> if __name__ == '__main__':
>     blast_results = open(sys.argv[1])
>     b_parser = NCBIWWW.BlastParser()
>     b_iterator = NCBIStandalone.Iterator(blast_results, b_parser)
>     while 1:
>         b_record = b_iterator.next()
>         if b_record is None: break
>
>         for alignment in b_record.alignments:
>             for hsp in alignment.hsps:
>                 # join() needs strings, so convert the numeric fields
>                 print '\t'.join([alignment.title,
>                                  str(alignment.length),
>                                  str(hsp.expect)
>                                  ])
>
>
> My thinking has been along these lines.
>
> - My blast output has been htmlized by NCBI's tool - So I think I need
> NCBIWWW's parser.
>
> - There are multiple sequences in my fasta input - So I need an
> iterator.
>
> - NCBIWWW doesn't seem to provide an iterator, so I'm hoping to use
> NCBIStandalone's iterator. This assumption is suspect. But I don't yet
> know the biopython code base well enough to know a better alternative.
>
> Any help is greatly appreciated.
>
> Michael Cariaso
>
>
> <tyrkin2.html>