[Bioperl-l] NCBI BLAST results parsing

Thomas J Keller kellert at ohsu.edu
Thu Jun 8 18:39:04 UTC 2006


I'm having the same problem: bp_remote_blast.pl worked yesterday, but
today it's busted. Incidentally, I got the following email from NCBI
this morning:
The new version of the NCBI SOAP E-Utilities, which includes recent
changes to the NCBI sequence databases schema, was released today.

Thank you.
NCBI E-Utilities Team

I wouldn't have thought that that would affect
Bio::Tools::Run::RemoteBlast, but something has changed.

Here's a snippet of the output after
$ bp_remote_blast.pl -p blastn -d nr -e 1e-3 -i nm_008540.fasta

-------------------- WARNING ---------------------
MSG: req was POST http://www.ncbi.nlm.nih.gov/blast/Blast.cgi
User-Agent: bioperl-Bio_Tools_Run_RemoteBlast/1.4
Content-Length: 267
Content-Type: application/x-www-form-urlencoded

DATABASE=nr&COMPOSITION_BASED_STATISTICS=off&QUERY=%3ENM_008540_2927+%25GC+55.0+Score+5+Mus+musculus+MAD+homolog+4+(Drosophila)+(Smad4)%2C+mRNA.%0Acactctgcctgctgcttcactgt&EXPECT=1e-3&SERVICE=plain&FORMAT_OBJECT=Alignment&CMD=Put&FILTER=L&CDD_SEARCH=off&PROGRAM=blastn


---------------------------------------------------
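
The request body in the warning above is ordinary form encoding, so it can be decoded to check which parameters were actually sent. The following is a minimal sketch in Python using only the standard library; it is not part of BioPerl, and the helper name decode_form is made up for illustration. The body string is the one from the warning with the mail client's line wraps removed; joined that way it is 267 characters long, which agrees with the Content-Length header.

```python
from urllib.parse import parse_qs

# POST body from the warning above, with the mail client's line wraps
# removed (a reconstruction of the wrapped text, not a fresh capture).
body = (
    "DATABASE=nr&COMPOSITION_BASED_STATISTICS=off"
    "&QUERY=%3ENM_008540_2927+%25GC+55.0+Score+5+Mus+musculus+MAD+homolog+4"
    "+(Drosophila)+(Smad4)%2C+mRNA.%0Acactctgcctgctgcttcactgt"
    "&EXPECT=1e-3&SERVICE=plain&FORMAT_OBJECT=Alignment"
    "&CMD=Put&FILTER=L&CDD_SEARCH=off&PROGRAM=blastn"
)


def decode_form(encoded):
    """Hypothetical helper: map each form field to its first decoded value."""
    return {key: values[0] for key, values in parse_qs(encoded).items()}


params = decode_form(body)

# Print the search settings, then the decoded FASTA query.
for name in ("PROGRAM", "DATABASE", "EXPECT", "CMD"):
    print(name, "=", params[name])
print(params["QUERY"])
```

Decoded this way, the query is a FASTA record (a ">" header line followed by the sequence), and the remaining fields match the command-line options given to bp_remote_blast.pl.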

-------------------- WARNING ---------------------
MSG: <html><head><title>NCBI Blast</title><meta http-equiv="Content- 
Type" content="text/html; charset=utf-8"/><link rel="stylesheet"  
href="http://www.ncbi.nlm.nih.gov/corehtml/ncbi.css"></head><body  
bgcolor="#FFFFFF" text="#000000" link="#CC6600" vlink="#CC6600"  
onload="StartBlastCgi();"><!--  the header   --> <table border="0"  
width="600" cellspacing="0" cellpadding="0"><tr>     <td width="600"  
colspan=4>    <map name="head_img_map">    <area shape="rect"  
coords="0,0,300,40" href="http://www.ncbi.nlm.nih.gov" alt="NCBI home  
page">       <area shape="rect" coords="301,0,600,40" href="http:// 
www.ncbi.nlm.nih.gov/blast" alt="NCBI BLAST home page">    </map>     
<IMG SRC="html/blastheader.gif" USEMAP="#head_img_map" BORDER="0"  
NAME="BlastHeaderGif" ALT="BLAST header image" WIDTH="600"  
HEIGHT="45" BORDER="0" ALIGN="middle">    </td></tr><tr  
align="center">    <td width="150" bgcolor="#003366">        <a  
href="http://www.ncbi.nlm.nih.gov/blast/Blast.cgi? 
CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&PAGE=Nucleotides&NCBI_GI= 
yes&FILTER=L&HITLIST_SIZE=100&SHOW_OVERVIEW=yes&AUTO_FORMAT=yes&SHOW_LIN 
KOUT=yes"        class="HELPBAR"><FONT COLOR="#FFFFFF">Nucleotide</ 
FONT></a></td>    <td width="150" bgcolor="#003366">        <a  
href="http://www.ncbi.nlm.nih.gov/blast/Blast.cgi? 
CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&PAGE=Proteins&NCBI_GI=yes 
&HITLIST_SIZE=100&COMPOSITION_BASED_STATISTICS=yes&SHOW_OVERVIEW=yes&AUT 
O_FORMAT=yes&CDD_SEARCH=yes&FILTER=L&SHOW_LINKOUT=yes"         
class="HELPBAR"><FONT COLOR="#FFFFFF">Protein</FONT></a></td>    <td  
width="150" bgcolor="#003366">        <a href="http:// 
www.ncbi.nlm.nih.gov/blast/Blast.cgi? 
CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&PAGE=Translations&NCBI_GI 
=yes&FILTER=L&HITLIST_SIZE=100&SHOW_OVERVIEW=yes&AUTO_FORMAT=yes&SHOW_LI 
NKOUT=yes"        class="HELPBAR"><FONT COLOR="#FFFFFF">Translations</ 
FONT></a></td>    <td width="150" bgcolor="#003366">        <a  
href="http://www.ncbi.nlm.nih.gov/blast/Blast.cgi? 
CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&PAGE=Formating&NCBI_GI=ye 
s&SHOW_OVERVIEW=yes&AUTO_FORMAT=yes&SHOW_LINKOUT=yes"         
class="HELPBAR"><FONT COLOR="#FFFFFF">Retrieve results for an RID</ 
FONT></a></td></tr></table><br><!--  the contents   --> <form  
action="Blast.cgi" enctype="application/x-www-form-urlencoded"  
method="POST"><script src="blastcgi.js"></script><SCRIPT  
LANGUAGE="JavaScript"> <!--document.images['BlastHeaderGif'].src =  
'html/head_formating.gif';// --></SCRIPT><br><hr><font  
color="red">ERROR: Cannot accept request, error code: 1Number of  
unfinished requests (151)  from your IP address reached the HARD  
limit 150.</font><hr></form>   </body></html>
---------------------------------------------------
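
The useful part of that HTML page is the red ERROR line near the end. As a rough sketch, it can be pulled out of such a response with a regular expression. This is Python using only the standard library, not BioPerl; the function name extract_blast_error is hypothetical, and the pattern assumes the error sits inside a font element with color="red" as it does in the page quoted above, which NCBI could change at any time.

```python
import re
from html import unescape

# Assumption: the error text is wrapped in <font color="red">...</font>,
# as in the response quoted above. \s+ tolerates line wraps inside the tag.
ERROR_RE = re.compile(
    r'<font\s+color="red">\s*(ERROR:.*?)</font>',
    re.IGNORECASE | re.DOTALL,
)


def extract_blast_error(page):
    """Return the error text from a BLAST HTML page, or None if absent."""
    match = ERROR_RE.search(page)
    if match is None:
        return None
    # Decode HTML entities and collapse runs of whitespace to one space.
    return " ".join(unescape(match.group(1)).split())


# Tail of the page quoted above, including one of its line wraps.
sample = (
    '<br><hr><font  \ncolor="red">ERROR: Cannot accept request, error code: '
    '1Number of unfinished requests (151)  from your IP address reached the '
    'HARD limit 150.</font><hr></form>   </body></html>'
)

print(extract_blast_error(sample))
```

A page without the red error element yields None, so a caller can tell an error page apart from any other response before trying to read a request ID from it.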

On Jun 8, 2006, at 6:12 AM, Chris Fields wrote:

> I would say, based on previous responses, update to the latest CVS
> (bioperl-live).  You could also try updating
> Bio::Tools::Run::RemoteBlast.pm and Bio::SearchIO::blast.pm if you
> don't want to update the entire toolkit.  Running these with BLAST
> 2.2.14 output seems to work fine.
>
> Though this is the likely fix, if you have additional problems next
> time please make sure to include more information.  We have no idea
> what OS, bioperl version, perl version you are running.  And a code
> snippet and bug description would be nice (i.e. "it doesn't work" -
> not a good description; "the script freezes" is a little more
> informative).
>
> Chris
>
> On Jun 8, 2006, at 6:38 AM, John Mifsud wrote:
>
>> Dear all,
>>
>> Firstly I hope this is the right email list to write to!
>>
>> Secondly, I have a little program that parses the BLAST results I get
>> from running remotely against the NCBI server, takes out all the hit
>> sequences, and converts them to FASTA format.
>>
>> Now when using BROAD BLAST and getting results this works fine
>> (tblastn ver 2.2.9). However, NCBI have just updated their BLAST
>> server (to 2.2.14) and the output is different and the parsing no
>> longer works. I was wondering if anyone knew of a new SearchIO
>> module / script that is designed to parse the updated NCBI BLAST
>> output?
>>
>> Thanks for your time,
>>
>>
>> John
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>




