[Bioperl-l] BLAST parsing broken
Razi Khaja
razi.khaja at gmail.com
Tue May 4 17:55:00 UTC 2010
That is odd. Heikki, do you have a BLAST output file that produces this
error?
Could you attach the file and send it either to the list or to me (if the list
does not accept attachments)?
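In the meantime, the "eats too many lines" idea raised in the quoted message below can be checked without BioPerl installed. What follows is only a sketch under stated assumptions: consume_reference is a hypothetical helper, not part of BioPerl, the header lines are made up rather than taken from a real BLAST report, and a "defined" guard has been added that the patch does not have. It replays the stop conditions of the while loop from the diff over an in-memory list of lines. Whether line endings have anything to do with the reported failure is not established, and _readline may normalize them before the loop ever sees them.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): replays the loop from the
# quoted diff over a list of lines and returns every line that would be
# folded into the algorithm reference.
sub consume_reference {
    my @lines = @_;
    my $first = shift @lines;
    return () unless defined $first && $first =~ /^Reference:\s+(.*)$/;

    my @consumed = ($first);
    my $line     = shift @lines;

    # Same stop conditions as the patch: an empty line, RID:, or Database:.
    # The "defined" guard is an addition for this sketch only.
    while ( defined $line
        && $line !~ /^$/
        && $line !~ /^RID:/
        && $line !~ /^Database:/ )
    {
        push @consumed, $line;
        $line = shift @lines;
    }
    return @consumed;
}

# Made-up header lines for illustration, not real BLAST output.
my @unix = (
    "Reference: Some citation line one\n",
    "continued citation text\n",
    "\n",
    "Query= test_sequence\n",
    "\n",
    "Database: some_db\n",
);

# The same lines with Windows-style line endings.
my @crlf = map { my $l = $_; $l =~ s/\n\z/\r\n/; $l } @unix;

my @from_unix = consume_reference(@unix);
my @from_crlf = consume_reference(@crlf);

# Count how many consumed lines look like the query line.
my $unix_query = grep { /^Query=/ } @from_unix;
my $crlf_query = grep { /^Query=/ } @from_crlf;

printf "LF:   consumed %d lines, Query= swallowed: %s\n",
    scalar(@from_unix), $unix_query ? 'yes' : 'no';
printf "CRLF: consumed %d lines, Query= swallowed: %s\n",
    scalar(@from_crlf), $crlf_query ? 'yes' : 'no';
```

With LF endings the loop stops at the first blank line. With CRLF endings a "blank" line is "\r\n", which /^$/ does not match, so in this sketch the loop runs on through the Query= line until it reaches Database:. A report that lacks a blank line after the reference would behave the same way.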
Thanks,
Razi
On Mon, May 3, 2010 at 8:08 AM, Chris Fields <cjfields at illinois.edu> wrote:
> Odd, I ran tests on that prior to commit. I'll work on fixing that (in svn
> of course, until the migration is complete).
>
> chris
>
> On May 3, 2010, at 6:45 AM, Heikki Lehvaslaiho wrote:
>
> > Chris,
> >
> > The latest additions to Bio::SearchIO::blast.pm broke the parsing of normal
> > BLAST output. $result->query_name now returns undef.
> >
> > (Using the anonymous git now). This change still works:
> >
> > commit 5e278f5dbb9afc4dc0359cd3fdc8fb0d0f4cad74
> > Author: cjfields <cjfields at eb9725d8-4842-0410-9bbb-c0b52e2da49b>
> > Date: Sun Dec 20 04:39:58 2009 +0000
> >
> > Robson's patch for buggy blastpgp output
> >
> > But this does not:
> >
> > commit 9a89c3434597104dd50553e3562983d78d14a544
> > Author: cjfields <cjfields at eb9725d8-4842-0410-9bbb-c0b52e2da49b>
> > Date: Thu Apr 15 04:21:17 2010 +0000
> >
> > [bug 3031]
> >
> > patches for catching algorithm ref, courtesy Razi Khaja.
> >
> > That makes it easy to find the diffs:
> >
> > $ git diff 5e278f5dbb9afc4dc0359cd3fdc8fb0d0f4cad74 9a89c3434597104dd50553e3562983d78d14a544 Bio/SearchIO/blast.pm
> > diff --git a/Bio/SearchIO/blast.pm b/Bio/SearchIO/blast.pm
> > index 378023a..6f7eeeb 100644
> > --- a/Bio/SearchIO/blast.pm
> > +++ b/Bio/SearchIO/blast.pm
> > @@ -209,6 +209,7 @@ BEGIN {
> >
> > 'BlastOutput_program' => 'RESULT-algorithm_name',
> > 'BlastOutput_version' => 'RESULT-algorithm_version',
> > + 'BlastOutput_algorithm-reference' => 'RESULT-algorithm_reference',
> > 'BlastOutput_query-def' => 'RESULT-query_name',
> > 'BlastOutput_query-len' => 'RESULT-query_length',
> > 'BlastOutput_query-acc' => 'RESULT-query_accession',
> > @@ -504,6 +505,26 @@ sub next_result {
> > }
> > );
> > }
> > + # parse the BLAST algorithm reference
> > + elsif(/^Reference:\s+(.*)$/) {
> > + # want to preserve newlines for the BLAST algorithm reference
> > + my $algorithm_reference = "$1\n";
> > + $_ = $self->_readline;
> > + # while the current line, does not match an empty line, a RID:, or a Database:, we are still looking at the
> > + # algorithm_reference, append it to what we parsed so far
> > + while($_ !~ /^$/ && $_ !~ /^RID:/ && $_ !~ /^Database:/) {
> > + $algorithm_reference .= "$_";
> > + $_ = $self->_readline;
> > + }
> > + # if we exited the while loop, we saw an empty line, a RID:, or a Database:, so push it back
> > + $self->_pushback($_);
> > + $self->element(
> > + {
> > + 'Name' => 'BlastOutput_algorithm-reference',
> > + 'Data' => $algorithm_reference
> > + }
> > + );
> > + }
> > # added Windows workaround for bug 1985
> > elsif (/^(Searching|Results from round)/) {
> > next unless $1 =~ /Results from round/;
> >
> >
> > I am not sure why reference parsing messes things up. Maybe it eats too
> > many lines from the result file.
> >
> > Yours,
> >
> > -Heikki
> >
> > Heikki Lehvaslaiho - skype:heikki_lehvaslaiho
> > cell: +966 545 595 849 office: +966 2 808 2429
> >
> > Computational Bioscience Research Centre (CBRC), Building #2, Office #4216
> > 4700 King Abdullah University of Science and Technology (KAUST)
> > Thuwal 23955-6900, Kingdom of Saudi Arabia
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>