[BioPython] Need help parsing Blastoutput

Michiel De Hoon mdehoon at c2b2.columbia.edu
Mon Apr 24 18:27:31 UTC 2006


Also, make sure you have the latest version of Bio/Blast/NCBIStandalone.py;
you can get it from here:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/*checkout*/biopython/Bio/Blast/NCBIStandalone.py?rev=1.60&cvsroot=biopython&content-type=text/plain

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Mon 4/24/2006 4:45 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput
 
Hi,
Attached here is the Blast XML output; I also attached it to my previous post.
Thanks,
Halimah

On Thu, 20 Apr 2006, Michiel De Hoon wrote:

> Could you send us the Blast XML output also?
> 
> --Michiel.
> 
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
> 
> 
> 
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Thu 4/20/2006 7:57 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>  
> Thanks. I tried using the XML parser and I am still getting errors which I
> don't understand. Please see the attached copy of my script and the Blast
> XML output.
> Here is the error:
> Traceback (most recent call last):
>   File "Bioperser.py", line 11, in ?
>     b_record = b_parser.parse(b_out)
>   File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
>     self._parser.parse(handler)
>   File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in parse
>     xmlreader.IncrementalParser.parse(self, source)
>   File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in parse
>     self.feed(buffer)
>   File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in feed
>     self._err_handler.fatalError(exc)
>   File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
>     raise exception
> Thanks,
> Halimah
> 
> On Wed, 19 Apr 2006, Michiel De Hoon wrote:
> 
> > The Blast parser fails to read your file because the format of Blast output
> > has changed. If I edit the data file so that it corresponds to the old format
> > (add a space here, remove a blank line there, etc.), the Blast parser reads
> > the file without problems. The easiest solution is to repeat the Blast run,
> > using XML for the output format, and use the Blast XML parser in Biopython to
> > parse the results.
> > 
> > A general question is if anybody still needs the parser for Blast text
> > output. Currently, we are confusing our users by having a Blast text parser
> > that tends to break. A broken parser may be worse than no parser.
> > 
> > --Michiel.
> > 
> > Michiel de Hoon
> > Center for Computational Biology and Bioinformatics
> > Columbia University
> > 1150 St Nicholas Avenue
> > New York, NY 10032
> > 
> > 
> > 
> > -----Original Message-----
> > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > Sent: Wed 4/19/2006 6:15 AM
> > To: Michiel De Hoon
> > Cc: biopython at lists.open-bio.org
> > Subject: RE: [BioPython] Need help parsing Blastoutput
> >  
> > Hi,
> > Please see the attachment; it is part of my Blast output.
> > Yes, I am trying to parse text output from Blast. I used another script to
> > run my local Blast, and that is the output I am trying to parse. The
> > NCBIStandalone.BlastParser was working fine without hsp.sbject_end, which
> > is one of the values I need to print out.
> > On checking the class diagrams in the cookbook, I found that sbject_end is
> > not included. I just need another way of printing int(subject end).
> > Thanks for your help,
> > Halimah
> > 
> > On Tue, 18 Apr 2006, Michiel De Hoon wrote:
> > 
> > > Could you also send us the file Enterococcus_out so we can run the script?
> > > 
> > > From the script, it looks like you're trying to parse text output from Blast.
> > > While this is possible (in theory), the format of Blast text output tends to
> > > change a lot, thereby breaking the parser in Biopython. It is more reliable
> > > to have Blast generate output in XML format, and use the XML parser:
> > > 
> > > blast_out = open('my_blast.xml', 'r')
> > > 
> > > from Bio.Blast import NCBIXML
> > > 
> > > b_parser = NCBIXML.BlastParser()
> > > b_record = b_parser.parse(blast_out)
> > > 
> > > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> > > generate Blast output in XML.
> > > 
> > > --Michiel.
> > > 
> > > 
> > > 
> > > Michiel de Hoon
> > > Center for Computational Biology and Bioinformatics
> > > Columbia University
> > > 1150 St Nicholas Avenue
> > > New York, NY 10032
> > > 
> > > 
> > > 
> > > -----Original Message-----
> > > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > > Sent: Tue 4/18/2006 11:06 AM
> > > To: Michiel De Hoon
> > > Cc: biopython at lists.open-bio.org
> > > Subject: RE: [BioPython] Need help parsing Blastoutput
> > >  
> > > Thanks.
> > > Please see the attachment: a copy of my script and a copy of my Blast output.
> > > Thanks
> > > 
> > > 
> > > On Thu, 13 Apr 2006, Michiel De Hoon wrote:
> > > 
> > > > Could you send us the script you were using?
> > > > 
> > > > --Michiel.
> > > > 
> > > > Michiel de Hoon
> > > > Center for Computational Biology and Bioinformatics
> > > > Columbia University
> > > > 1150 St Nicholas Avenue
> > > > New York, NY 10032
> > > > 
> > > > 
> > > > 
> > > > -----Original Message-----
> > > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > > > Sent: Thu 4/13/2006 11:07 AM
> > > > To: biopython at lists.open-bio.org
> > > > Subject: [BioPython] Need help parsing Blastoutput
> > > >  
> > > > Hi All,
> > > > I have BLAST output from a local Blast run.
> > > > I need to calculate my % alignment coverage with regard to my subject.
> > > > I tried parsing the Blast output and wanted to print the
> > > > sbjct start and sbjct end, but I could not. Is there any way I could do
> > > > this? I am trying to get the match coverage between my query and subject.
> > > > I don't need identities, but the total % alignment for query or subject.
> > > > Thanks,
> > > > Halimah
> > > > 
> > > > _______________________________________________
> > > > BioPython mailing list  -  BioPython at lists.open-bio.org
> > > > http://lists.open-bio.org/mailman/listinfo/biopython
> > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> 
> 
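
For the coverage question that opens this thread, here is a minimal sketch,
assuming the Blast run is repeated with XML output as suggested above. It does
not go through Biopython: it reads the subject coordinates (Hsp_hit-from,
Hsp_hit-to) and the subject length (Hit_len) from the XML with the Python
standard library, merges overlapping HSPs, and reports the percentage of each
subject that is covered. The function names and the sample XML are
illustrative assumptions, not taken from the thread, and the sample is a
heavily trimmed stand-in for real Blast XML.

```python
import io
import xml.etree.ElementTree as ET


def merged_length(intervals):
    """Total number of positions covered by 1-based inclusive intervals."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end + 1:
            # Disjoint from the running interval: bank it and start a new one.
            if current_end is not None:
                total += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the running interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start + 1
    return total


def subject_coverage(xml_handle):
    """Return {hit id: percent of the subject covered by its HSPs}."""
    coverage = {}
    for hit in ET.parse(xml_handle).iter("Hit"):
        hit_len = int(hit.findtext("Hit_len"))
        intervals = []
        for hsp in hit.iter("Hsp"):
            a = int(hsp.findtext("Hsp_hit-from"))
            b = int(hsp.findtext("Hsp_hit-to"))
            # Minus-strand hits can report from > to, so normalise the order.
            intervals.append((min(a, b), max(a, b)))
        hit_id = hit.findtext("Hit_id")
        coverage[hit_id] = 100.0 * merged_length(intervals) / hit_len
    return coverage


# Hypothetical, trimmed-down XML: one subject of length 200 with two
# overlapping HSPs (1-100 and 51-150), i.e. 150 covered positions.
SAMPLE = """<BlastOutput><BlastOutput_iterations><Iteration><Iteration_hits>
<Hit><Hit_id>subject_1</Hit_id><Hit_len>200</Hit_len><Hit_hsps>
<Hsp><Hsp_hit-from>1</Hsp_hit-from><Hsp_hit-to>100</Hsp_hit-to></Hsp>
<Hsp><Hsp_hit-from>51</Hsp_hit-from><Hsp_hit-to>150</Hsp_hit-to></Hsp>
</Hit_hsps></Hit>
</Iteration_hits></Iteration></BlastOutput_iterations></BlastOutput>"""

print(subject_coverage(io.StringIO(SAMPLE)))  # {'subject_1': 75.0}
```

Merging the intervals first matters because several HSPs for the same hit can
overlap on the subject; summing their lengths directly would overcount. Query
coverage could be computed the same way from Hsp_query-from and Hsp_query-to,
divided by the query length.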




