[Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml

Chris Fields cjfields at uiuc.edu
Mon Aug 6 17:49:08 UTC 2007


Wasn't paying attention! Forwarding this to the mail list in case  
anyone wanted the answer...

chris

Begin forwarded message:

> From: Chris Fields <cjfields at uiuc.edu>
> Date: August 6, 2007 12:10:37 PM CDT
> To: gyang at plantbio.uga.edu
> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
> with xml
>
> Guojun,
>
> Sorry about the long wait on this.  At this time RemoteBlast  
> doesn't automatically set the retrieval header to return XML when  
> setting the -reporttype parameter to 'xml' or 'blastxml'.  The  
> default is text output, so you are retrieving regular text BLAST  
> reports instead of XML, hence the reported XML parser failure (BTW,  
> you can see the plain text being returned in the debugging  
> output).  I'll look into a fix for that.
>
> In the meantime, you can do this manually by setting the following  
> key prior to submitting the BLAST run:
>
> $Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';
>
> When I run your example with the above line added it works fine.   
> As an additional note, the CVS version of Bio::SearchIO::blastxml  
> now supports newer versions of XML::SAX::Expat; the problem there  
> was a bug in XML::SAX::Expat that killed parsing.
>
> Additional rant before I go back to work (you can skip this if  
> needed):  RemoteBlast is one of the most used modules in BioPerl,  
> but it is also the most problematic as NCBI keeps changing things  
> on their end (BLAST text output, prompts when returning RIDs,  
> etc).  It frankly isn't as well-maintained as we would like; this  
> is partly due to plans we have (but unfortunately haven't acted  
> upon) to merge RemoteBlast/StandAloneBlast so they have a similar  
> API and can be used for any BLAST program, including netblast.  If  
> someone wants to take this on at some point then they are more than  
> welcome!
>
> chris
>
> On Aug 3, 2007, at 10:08 AM, Guojun Yang wrote:
>
>> Thanks, Chris,
>> Attached are my script and the query file. I suspected that we
>> need to add "remove RID..." in the code. I tried removing the
>> RID at the end of the parsing code, but it seemed to be removed
>> even before the output was processed. I then installed
>> XML::SAX::Expat, and the error became "XML::SAX::Expat is no longer
>> supported...", so I installed ExpatXS, and the error message became:
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>  no element found at line 4126, column 1, byte 186628 at /usr/lib/perl5/site_perl/5.8.3/Bio/SearchIO/blastxml.pm line 304
>>
>>
>> Would you please try the script with the query file and the
>> following input parameters, to see what happens on your machine? (I
>> want to make sure there is no installation problem on my machine.)
>> The search subroutine is where blast is performed; I did not
>> include a remove RID there. Thanks again!
>>
>> master:/home/guojun # perl makcgi07.txt
>> Query file name:
>> kiddo.txt
>> Select a function: 1.member;2.RES; 3, long; 4.Anchor; 5.Associator.
>> 1
>> Type in the name of an organism, e.g. Oryza sativa.
>> Oryza sativa
>> Type in the organism to search for RES:
>> Your E_value:
>> 0.001
>> Size limit for ancestor element:
>> 4000
>> Flanking size for retrieved members:
>> 50
>> Tolerance for end mismatch:
>> 0
>>
>>
>>
>> Guojun
>>
>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> To: gyang at plantbio.uga.edu
>> Sent: Thu, 02 Aug 2007 13:04:59 -0400
>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
>> with xml
>>
>> Guojun,
>>
>> Make sure to keep this on the mail list for archiving purposes.
>>
>> It could be that the RID is not being removed properly (if it isn't
>> removed then you will repeatedly retrieve your BLAST report). The
>> new error you are seeing may be coming from whatever XML::SAX backend
>> parser is being used (XML::SAX::ExpatXS, XML::SAX::Expat, etc); it
>> doesn't look bioperl-related and there is an eval which catches this
>> stuff in SearchIO::blastxml. Does text parsing work?
>>
>> Could you directly send me your script or add it to a new bug report
>> as an attachment?
>>
>> http://www.bioperl.org/wiki/Bugs
>>
>> chris
>>
>> On Aug 2, 2007, at 11:07 AM, Guojun Yang wrote:
>>
>> > Hi, Chris,
>> > I installed the latest version of bioperl. In addition to the
>> > repeated output problem, there are now new problems with parsing:
>> >
>> >
>> > -------------------- WARNING ---------------------
>> > MSG: error in parsing a report:
>> > No close tag marker [Ln: 4126, Col: 0]
>> >
>> > ---------------------------------------------------
>> >
>> > Would you please kindly give me a hint on this,
>> > Thanks a lot,
>> > Guojun
>> >
>> >
>> > ----- Original Message -----
>> > From: Chris Fields [mailto:cjfields at uiuc.edu]
>> > To: gyang at plantbio.uga.edu
>> > Cc: bioperl-l List [mailto:bioperl-l at lists.open-bio.org]
>> > Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast
>> > with xml
>> >
>> >
>> >> Make sure to keep responses on the mail list.
>> >>
>> >> You might want to run a full install, just in case. If I remember
>> >> correctly Sendu made some changes a while back in the BLAST-related
>> >> modules which may be related to this. At the very least install/
>> >> upgrade all modules in Bio::Tools::Run.
>> >>
>> >> chris
>> >>
>> >> On Jul 31, 2007, at 9:40 AM, Guojun Yang wrote:
>> >>
>> >>> Thanks, Chris,
>> >>> But when I replaced the old RemoteBlast.pm with the new one, I got
>> >>> "can't locate the object method "retrieve_parameter"". Does this
>> >>> mean I need to install something else?
>> >>> Guojun
>> >>>
>> >>> ----- Original Message -----
>> >>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> >>> To: gyang at plantbio.uga.edu
>> >>> Cc: bioperl-l at bioperl.org
>> >>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast
>> >>> with xml
>> >>>
>> >>>
>> >>>> On Jul 30, 2007, at 3:58 PM, Guojun Yang wrote:
>> >>>>
>> >>>>> I am running remoteblast and using readmethod "xml", and I noticed
>> >>>>> that it is printing the output repeatedly, nonstop. It's like it
>> >>>>> is in a loop. Did anybody notice this before? Can anybody help me
>> >>>>> get out of this?
>> >>>>> Thanks a lot,
>> >>>>>
>> >>>>> Guojun Yang
>> >>>>> University of Georgia
>> >>>>
>> >>>> Not seeing that using bioperl-live; you may need to update
>> >>>> RemoteBlast.pm as this sounds similar to an issue that popped up
>> >>>> earlier in the spring.
>> >>>>
>> >>>> chris
>> >>>>
>> >> Christopher Fields
>> >> Postdoctoral Researcher
>> >> Lab of Dr. Robert Switzer
>> >> Dept of Biochemistry
>> >> University of Illinois Urbana-Champaign
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>> <makcgi07.txt>
>> <kiddo.txt>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
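
For the archive, here is a minimal sketch of where the workaround line
from the forwarded message fits in a typical RemoteBlast loop. It is not
Guojun's script: the program ('blastn'), the database ('nt'), and the
assumption that kiddo.txt is a FASTA file are placeholders of mine, and
-readmethod => 'xml' simply follows the original report. It also removes
each RID once its report has been retrieved, since a RID that is never
removed gets fetched again on every pass.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Bio::Tools::Run::RemoteBlast;
use Bio::SeqIO;

# Placeholder settings (assumptions, not taken from the original script).
my $factory = Bio::Tools::Run::RemoteBlast->new(
    -prog       => 'blastn',
    -data       => 'nt',
    -expect     => '1e-3',
    -readmethod => 'xml',    # parse the retrieved report as BLAST XML
);

# Workaround from the message above: request XML instead of the default
# text report. This has to be set before the BLAST run is submitted.
$Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';

# Assumes the query file is FASTA.
my $in = Bio::SeqIO->new( -file => 'kiddo.txt', -format => 'fasta' );

while ( my $seq = $in->next_seq ) {
    $factory->submit_blast($seq);

    # Poll until every request ID (RID) has been handled and removed.
    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);

            if ( !ref $rc ) {
                # 0 means "not ready yet"; a negative value is an error,
                # in which case the RID is dropped so it is not retried.
                $factory->remove_rid($rid) if $rc < 0;
                sleep 5;
                next;
            }

            # A report came back: remove the RID first, otherwise
            # each_rid keeps returning it and the report is fetched again.
            $factory->remove_rid($rid);

            my $result = $rc->next_result or next;
            while ( my $hit = $result->next_hit ) {
                printf "%s\t%s\n", $hit->name, $hit->significance;
            }
        }
    }
}
```

It needs network access to NCBI, so treat it as an outline rather than
something tested against the current service.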

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign





