[Bioperl-l] bioperl 1.4 SearchIO doesn't work parsing Blast output

Roger Hall rahall2 at ualr.edu
Thu Feb 9 20:09:52 UTC 2006


Guys - I'm looking at the error message:

MSG: no data for midline Query  1   WWWKWRW  7
STACK Bio::SearchIO::blast::next_result
/usr/lib/perl5/site_perl/5.8.6/Bio/SearchIO/blast.pm:1151
STACK toplevel
/home/Hubert/installed/eclipse/workspace/Database_Search/Blast.pl:21

This is my line of thought:
1. "no data for midline $_" is a unique message generated by blast.pm in one
location only at the point of a. reading three lines b. dropping lines with
spaces only c. identifying the Query, Midline, and Match lines (0 <= $i < 3)
2. There is a regexp match that fails in order to reach that error message
3. The $_ value "Query  1   WWWKWRW  7" should not fail the expression
4. It does anyway
5. I cannot find the value "Query  1   WWWKWRW  7" anywhere in the blast
reports

I suspect a newline/chomp/metacharacter issue. Not finding the string
anywhere has me thoroughly confused - I asked Hubert for the additional
file, assuming that I didn't have it.

My next thought is to write a quick script to test perl behavior on "Fedora
Core 9".

Thoughts?

Did I misread the issue entirely? :}

Roger


-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Chris Fields
Sent: Thursday, February 09, 2006 10:16 AM
To: 'Jason Stajich'; 'Hubert Prielinger'
Cc: bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] bioperl 1.4 SearchIO doesn't work parsing Blast
output


> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at duke.edu] 
> Sent: Thursday, February 09, 2006 9:13 AM
> To: Hubert Prielinger
> Cc: Chris Fields; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] bioperl 1.4 SearchIO doesn't work 
> parsing Blast output
> 
> On Feb 8, 2006, at 4:41 PM, Hubert Prielinger wrote:
> > hi chris,
> > thanks, I have upgraded to version 1.5.1 but it isn't still 
> working, 
> > do you have any ohter idea, the problem I have is that I 
> have to parse 
> > a lot of textfiles....
> > or shall I look for another option to parse those files...
> >
> > regards
> > Hubert
> 
> 
> The code from Bioperl 1.5.1 works fine for me for blast 
> 2.2.13 reports but unless you post your blast report we can't 
> really determine the problem.
> 
> If you are still getting the same error like this I am not 
> convinced you have upgraded to 1.5.1 which includes a fix in 
> the fact that NCBI changed the HSP result format to remove 
> the ':' from the Query/Sbjct prefixes.  We fixed this as soon 
> as it was apparent sometime in September.
> 
> >>> MSG: no data for midline Query  1   WWWKWRW  7
> >>> STACK Bio::SearchIO::blast::next_result
> >>> /usr/lib/perl5/site_perl/5.8.6/Bio/SearchIO/blast.pm:1151
> >>> STACK toplevel
> >>> 
> /home/Hubert/installed/eclipse/workspace/Database_Search/Blast.pl:21
> 
> If you are just getting no results but also no warnings wrt 
> parsing, are you sure your logic is correct?
> 
> If you remove your filters do you see all the HSPS?
> 
> 
> while (my $result = $search->next_result) {
>      print $result->query_name, "\n";
>      #iterate over each hit on the query sequence
>      while (my $hit = $result->next_hit) {
> 	print $hit->name, "\n";
>          #iterate over each HSP in the hit
>          while (my $hsp = $hit->next_hsp) {
> 	 print $hsp->evalue, " ", $hsp->length('sbjct'), " ", $hsp- 
>  >hit_string, "\n";	
>         }
>     }
> }

I tested some of the BLAST results that Hubert sent Roger and me with a
similar script to the above.  I removed the file parsing logic and it seemed
to work just fine.  It may very well be a logic issue or that he hasn't
installed the latest fix.
    
It's a funny thing, though.  When I tried using blastcl3 (v. 2.2.13), even
though the returned output was from nr, the top of the blast output showed
that it was v2.2.12:  

BLASTP 2.2.12 [Aug-07-2005]

I double-checked my local version and it's definitely v.2.2.13:
-------------------------------------
C:\Perl\Scripts>blastcl3 -

blastcl3 2.2.13   arguments:...
-------------------------------------

If you use RemoteBlast using the same settings, the version in the header
looks like this:

BLASTP 2.2.13 [Nov-27-2005]

I'm wondering if all the blast executables (blast and netblast) from NCBI
have text output like v.2.2.12, while the wwwblast outputs a new format
(2.2.13).  I'll ask blast-help at NCBI about this.

> 
> To clarify some stuff -
> Chris I don't necessarily think the XML is best way forward 
> for BLAST reports generated locally, it isn't as detailed as 
> the Text format and it is what most people expect to be able 
> to scroll through and parse -- it is also harder for the 
> format to change dramatically if you have a static binary on 
> your machine =).  I think for remoteblast the XML format 
> should be the way forward but I expect Bioperl to maintain 
> support of any plain text BLAST report format that people use 
> on a regular basis.
> 

Does XML lack some specific info that text output has?  Didn't know that.  I
believe that XML should be default in RemoteBlast since it will not break,
but I agree with you about text output.  I also agree that it will need
somebody to maintain it constantly, much like RemoteBlast.

> -jason
> >
> >
> > Chris Fields wrote:
> >
> >> My guess is you're running into text parsing problems in 
> >> Bio::SearchIO::blast.  Upgrade to the latest developer version
> >> (1.5.1) or
> >> bioperl-live (CVS), then see the bug below.
> >>
> >> http://bugzilla.bioperl.org/show_bug.cgi?id=1934
> >>
> >> I think the first problem you ran into is solved in bioperl 1.5.1, 
> >> the last problem (more recent, not related to the first) has been 
> >> fixed but hasn't been committed to bioperl-live yet.  The fixed 
> >> SearchIO::blast is available in the link above, but 
> realize it hasn't 
> >> been committed yet and may change.
> >>
> >> Christopher Fields
> >> Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry 
> >> University of Illinois Urbana-Champaign
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: bioperl-l-bounces at lists.open-bio.org
> >>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hubert 
> >>> Prielinger
> >>> Sent: Wednesday, February 08, 2006 2:52 PM
> >>> To: bioperl-l at bioperl.org
> >>> Subject: [Bioperl-l] bioperl 1.4 SearchIO doesn't work 
> parsing Blast 
> >>> output
> >>>
> >>> Hi,
> >>> If I want to parse a Blast Output (Version 2.2.12) with 
> >>> Bio::SearchIO, I get the following error message:
> >>>
> >>> MSG: no data for midline Query  1   WWWKWRW  7
> >>> STACK Bio::SearchIO::blast::next_result
> >>> /usr/lib/perl5/site_perl/5.8.6/Bio/SearchIO/blast.pm:1151
> >>> STACK toplevel
> >>> 
> /home/Hubert/installed/eclipse/workspace/Database_Search/Blast.pl:21
> >>>
> >>> is that a bug......
> >>>
> >>> If I want to parse Blast Output (version 2.2.13), I don't get 
> >>> anything.....
> >>> I'm using bioperl 1.4
> >>>
> >>> before, I have installed bioperl 1.4, it worked fine 
> parsing Blast 
> >>> Output (version 2.2.12), but I don't remember which 
> bioperl version 
> >>> I had installed
> >>>
> >>> thanks in advance
> >>>
> >>> Hubert
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>
> >>>
> >>
> >>
> >>
> >>
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12
> 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign  

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l




More information about the Bioperl-l mailing list