[Bioperl-l] NCBI BLAST results parsing
Thomas J Keller
kellert at ohsu.edu
Thu Jun 8 18:39:04 UTC 2006
I'm having the same problem: bp_remote_blast.pl worked yesterday, but
today it's busted. Incidentally, I got the following email from NCBI
this morning:
The new version of the NCBI SOAP E-Utilities, which includes recent
changes to the NCBI sequence databases schema, was released today.
Thank you.
NCBI E-Utilities Team
I wouldn't have thought that this would affect
Bio::Tools::Run::RemoteBlast, but something has changed.
Here's a snippet of the output after running:
$ bp_remote_blast.pl -p blastn -d nr -e 1e-3 -i nm_008540.fasta
-------------------- WARNING ---------------------
MSG: req was POST http://www.ncbi.nlm.nih.gov/blast/Blast.cgi
User-Agent: bioperl-Bio_Tools_Run_RemoteBlast/1.4
Content-Length: 267
Content-Type: application/x-www-form-urlencoded
DATABASE=nr&COMPOSITION_BASED_STATISTICS=off&QUERY=%3ENM_008540_2927+%25GC+55.0+Score+5+Mus+musculus+MAD+homolog+4+(Drosophila)+(Smad4)%2C+mRNA.%0Acactctgcctgctgcttcactgt&EXPECT=1e-3&SERVICE=plain&FORMAT_OBJECT=Alignment&CMD=Put&FILTER=L&CDD_SEARCH=off&PROGRAM=blastn
---------------------------------------------------
-------------------- WARNING ---------------------
MSG: <html><head><title>NCBI Blast</title><meta http-equiv="Content-
Type" content="text/html; charset=utf-8"/><link rel="stylesheet"
href="http://www.ncbi.nlm.nih.gov/corehtml/ncbi.css"></head><body
bgcolor="#FFFFFF" text="#000000" link="#CC6600" vlink="#CC6600"
onload="StartBlastCgi();"><!-- the header --> <table border="0"
width="600" cellspacing="0" cellpadding="0"><tr> <td width="600"
colspan=4> <map name="head_img_map"> <area shape="rect"
coords="0,0,300,40" href="http://www.ncbi.nlm.nih.gov" alt="NCBI home
page"> <area shape="rect" coords="301,0,600,40" href="http://
www.ncbi.nlm.nih.gov/blast" alt="NCBI BLAST home page"> </map>
<IMG SRC="html/blastheader.gif" USEMAP="#head_img_map" BORDER="0"
NAME="BlastHeaderGif" ALT="BLAST header image" WIDTH="600"
HEIGHT="45" BORDER="0" ALIGN="middle"> </td></tr><tr
align="center"> <td width="150" bgcolor="#003366"> <a
href="http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?
CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&PAGE=Nucleotides&NCBI_GI=
yes&FILTER=L&HITLIST_SIZE=100&SHOW_OVERVIEW=yes&AUTO_FORMAT=yes&SHOW_LIN
KOUT=yes" class="HELPBAR"><FONT COLOR="#FFFFFF">Nucleotide</
FONT></a></td> <td width="150" bgcolor="#003366"> <a
href="http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?
CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&PAGE=Proteins&NCBI_GI=yes
&HITLIST_SIZE=100&COMPOSITION_BASED_STATISTICS=yes&SHOW_OVERVIEW=yes&AUT
O_FORMAT=yes&CDD_SEARCH=yes&FILTER=L&SHOW_LINKOUT=yes"
class="HELPBAR"><FONT COLOR="#FFFFFF">Protein</FONT></a></td> <td
width="150" bgcolor="#003366"> <a href="http://
www.ncbi.nlm.nih.gov/blast/Blast.cgi?
CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&PAGE=Translations&NCBI_GI
=yes&FILTER=L&HITLIST_SIZE=100&SHOW_OVERVIEW=yes&AUTO_FORMAT=yes&SHOW_LI
NKOUT=yes" class="HELPBAR"><FONT COLOR="#FFFFFF">Translations</
FONT></a></td> <td width="150" bgcolor="#003366"> <a
href="http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?
CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&PAGE=Formating&NCBI_GI=ye
s&SHOW_OVERVIEW=yes&AUTO_FORMAT=yes&SHOW_LINKOUT=yes"
class="HELPBAR"><FONT COLOR="#FFFFFF">Retrieve results for an RID</
FONT></a></td></tr></table><br><!-- the contents --> <form
action="Blast.cgi" enctype="application/x-www-form-urlencoded"
method="POST"><script src="blastcgi.js"></script><SCRIPT
LANGUAGE="JavaScript"> <!--document.images['BlastHeaderGif'].src =
'html/head_formating.gif';// --></SCRIPT><br><hr><font
color="red">ERROR: Cannot accept request, error code: 1Number of
unfinished requests (151) from your IP address reached the HARD
limit 150.</font><hr></form> </body></html>
---------------------------------------------------
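The actual error is buried at the very end of that HTML page. As a rough
sketch only, here is one way to pull it out in Python. It assumes the
response always reports errors as <font color="red">ERROR: ...</font>, as
in the page logged above; the helper name extract_blast_error is
hypothetical and is not part of BioPerl or any NCBI client.

```python
import re
from html import unescape
from typing import Optional

# Hypothetical helper, not part of BioPerl or any NCBI client library.
# Assumption: errors appear as <font color="red">ERROR: ...</font>,
# which is the shape of the response logged above.
_ERROR_RE = re.compile(
    r'<font\s+color="red">\s*(ERROR:.*?)</font>',
    re.IGNORECASE | re.DOTALL,
)


def extract_blast_error(html: str) -> Optional[str]:
    """Return the error text from a BLAST HTML response, or None if absent."""
    match = _ERROR_RE.search(html)
    if match is None:
        return None
    # Decode HTML entities, then collapse the mail-client line wrapping.
    return " ".join(unescape(match.group(1)).split())


# Tail of the response shown above, including its line breaks.
sample = (
    '<br><hr><font\ncolor="red">ERROR: Cannot accept request, error code: 1Number of\n'
    'unfinished requests (151) from your IP address reached the HARD\n'
    'limit 150.</font><hr></form> </body></html>'
)

print(extract_blast_error(sample))
```

On the response above, this prints the message about 151 unfinished
requests from my IP address reaching the hard limit of 150.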
On Jun 8, 2006, at 6:12 AM, Chris Fields wrote:
> I would say, based on previous responses, update to the latest CVS
> (bioperl-live). You could also try updating
> Bio::Tools::Run::RemoteBlast.pm and Bio::SearchIO::blast.pm if you
> don't want to update the entire toolkit. Running these with BLAST
> 2.2.14 output seems to work fine.
>
> Though this is the likely fix, if you have additional problems next
> time please make sure to include more information. We have no idea
> what OS, bioperl version, perl version you are running. And a code
> snippet and bug description would be nice (i.e. "it doesn't work" -
> not a good description; "the script freezes" is a little more
> informative).
>
> Chris
>
> On Jun 8, 2006, at 6:38 AM, John Mifsud wrote:
>
>> Dear all,
>>
>> Firstly I hope this is the right email list to write to!
>>
>> Secondly, I have a little program that parses the BLAST results I get
>> from running remotely against the NCBI server, extracts all the hit
>> sequences, and converts them to FASTA format.
>>
>> When using BROAD BLAST (tblastn ver 2.2.9), this works fine. However,
>> NCBI have just updated their BLAST server (to 2.2.14); the output is
>> different and the parsing no longer works. I was wondering if anyone
>> knew of a new SearchIO module / script that is designed to parse the
>> updated NCBI BLAST output?
>>
>> Thanks for your time,
>>
>>
>> John
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>