[Bioperl-l] possible bug printing GenBank feature qualfiers

Hilmar Lapp hlapp at gmx.net
Fri Mar 31 19:51:20 UTC 2006


The only problem with Heikki's version of the line is that if the
value is undefined you get a (ugly) warning from perl stating that you
printed an undefined value. Since normally tags should always have a
value (even if an empty string) this is a rather theoretical issue.

   -hilmar

On 3/31/06, Chris Fields <cjfields at uiuc.edu> wrote:
> Sorry about that; stupid Outlook sent my mail before I had a chance to
> finish it up.
>
> The Bio::Annotation::Simple fix sounds best, but the problem is that CVS
> shows a fix on this line by Heikki after 1.5.1 was released:
>
>        fix to allow 0 values despite operator overload (Paul Mooney)
>
> which changed the overload to:
>
>          use overload '""' => sub { $_[0]->value};
>
> I'll try out your fix here to see if it breaks anything (can't see why it
> would), but I may need to dig through the archives a little to see why this
> latest change was made.  If everything works and passes tests I'll roll back
> the commit I made to Bio::SeqIO::genbank earlier today.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp
> > Sent: Friday, March 31, 2006 12:23 PM
> > To: Scott Markel
> > Cc: bioperl-l at lists.open-bio.org; Chris Fields
> > Subject: Re: [Bioperl-l] possible bug printing GenBank feature qualfiers
> >
> > Scott,
> >
> > your fix assumes that $value in reality is not a scalar but a hash ref
> > and that it has a key "value".
> >
> > Apparently in your test environment this is all indeed true, but there
> > is no guarantee that this will still be true tomorrow when you next
> > update from CVS (or install a new version).
> >
> > It seems to me that making feature tag values Bio::AnnotationI objects
> > and the stringification overload is what is interfering here. More
> > specifically, the broken overload in Bio::Annotation::SimpleValue
> >
> >       use overload '""' => sub { $_[0]->value || ''};
> >
> > will lead exactly to the behavior you see (b/c $_[0]->value evaluates
> > to false if the value is '0').
> >
> > You say you build and populate the feature dynamically - are you using
> > Bio::SeqFeature::Annotated for this? Bio::SeqFeature::Generic is slated
> > to get this behavior reverted, i.e., will return to using scalars for
> > tag values. (Or so I recall ...)
> >
> > To fix the problem for you now, I suggest you either fix the overload
> > statement above to be
> >
> >       use overload '""' => sub { defined($_[0]->value) ? $_[0]->value : ''
> > };
> >
> > I suppose this should in fact be committed to the repository - does
> > anybody see any damage from this change?
> >
> > Or, if you do want to mess with the GenBank format writer, protect the
> > conversion to string and use the object access method:
> >
> >       if (ref($value) && $value->isa("Bio::Annotation::SimpleValue")) {
> >               # convert SimpleValue object to represented (string) value
> >               $value = $value->value;
> >       }
> >
> > Hth,
> >
> >       -hilmar
> >
> > On Mar 31, 2006, at 9:31 AM, Scott Markel wrote:
> >
> > > Chris,
> > >
> > > Looks like I made my test case too simple.  In our application,
> > > which calls BioPerl, I'm creating the feature with the zero-
> > > valued qualifier.  It's not being read in from a file, so
> > > my only issue is with writing GenBank files.  The real feature
> > > is one for a primer binding site.  The qualifier contains the
> > > number of mismatches.  The one line change of
> > >
> > >      $value = $value->{"value"}
> > >
> > > definitely fixes our problem and causes no regression
> > > failures in our application.
> > >
> > > Scott
> > >
> > > Chris Fields wrote:
> > >
> > >> I tried this on WinXP (I'm using bioperl-live) and got a warning:
> > >>
> > >> -------------------- WARNING ---------------------
> > >> MSG: Unexpected error in feature table for  Skipping feature,
> > >> attempting to
> > >> recover
> > >> ---------------------------------------------------
> > >>
> > >> Running using debugging shows that no feature key was found in
> > >> _read_FTHelper_GenBank.  So I'm getting an error, but on input not
> > >> output.
> > >> In fact, turning on -verbose in the SeqIO input object gives the
> > >> below extra
> > >> output, whereas turning -verbose on only in the output object just
> > >> gives the
> > >> warning above.
> > >>
> > >> ====================================
> > >> C:\Perl\Scripts\gb_test>test.pl
> > >> no feature key!
> > >>
> > >> -------------------- WARNING ---------------------
> > >> MSG: Unexpected error in feature table for  Skipping feature,
> > >> attempting to
> > >> recover
> > >> STACK Bio::SeqIO::genbank::next_seq
> > >> C:\Perl\src\bioperl\core/Bio\SeqIO\genbank.pm:583
> > >> STACK toplevel C:\Perl\Scripts\gb_test\test.pl:18
> > >> sequence length is 10
> > >> ====================================
> > >>
> > >> The sequence came back w/o any features in the feature table, which
> > >> is what
> > >> I would expect from this error:
> > >> ====================================
> > >> LOCUS       MY_LOCUS                  10 aa            linear   linear
> > >> DEFINITION  my description.
> > >> ACCESSION   12345
> > >> KEYWORDS    .
> > >> FEATURES             Location/Qualifiers
> > >> ORIGIN
> > >>         1 atggagaact
> > >> //
> > >> ====================================
> > >>
> > >> Adding the extra line before the s/// didn't help any (warning still
> > >> pops
> > >> up, no change in output).  Anybody out there with any ideas?
> > >>
> > >> Christopher Fields
> > >> Postdoctoral Researcher - Switzer Lab
> > >> Dept. of Biochemistry
> > >> University of Illinois Urbana-Champaign
> > >>
> > >>
> > >>
> > >>> -----Original Message-----
> > >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > >>> bounces at lists.open-bio.org] On Behalf Of Scott Markel
> > >>> Sent: Thursday, March 30, 2006 7:18 PM
> > >>> To: bioperl-l at lists.open-bio.org
> > >>> Subject: [Bioperl-l] possible bug printing GenBank feature qualfiers
> > >>>
> > >>> In our upgrade from BioPerl 1.4 to 1.5.1 we tripped over the
> > >>> following.
> > >>>
> > >>> Annotation tags used by Bio::SeqIO::FTHelper were strings and
> > >>> are now Bio::Annotation::SimpleValue.  In the _print_GenBank_FTHelper
> > >>> subroutine of Bio::SeqIO::genbank the following code still
> > >>> assumes that tags are strings.
> > >>>
> > >>>    foreach my $tag ( keys %{$fth->field} ) {
> > >>>        foreach my $value ( @{$fth->field->{$tag}} ) {
> > >>>        $value =~ s/\"/\"\"/g;
> > >>>
> > >>> If the tag value was a zero, an empty string is written.
> > >>>
> > >>> We think that
> > >>>
> > >>>            $value = $value->{"value"};
> > >>>
> > >>> should be added before the s/// call.
> > >>>
> > >>> Here's our test case.  Note that the qualifier value for "foo"
> > >>> is changed to an empty string.
> > >>>
> > >>> Input file
> > >>>
> > >>> ====================================
> > >>> LOCUS       MY_LOCUS                  10 aa            linear   UNK
> > >>> DEFINITION  my description.
> > >>> ACCESSION   12345
> > >>> FEATURES             Location/Qualifiers
> > >>>      misc_feature    1..10
> > >>>                      /foo="0"
> > >>> ORIGIN
> > >>>         1 atggagaact
> > >>> //
> > >>> ====================================
> > >>>
> > >>> Perl code
> > >>> ====================================
> > >>> use strict;
> > >>> use warnings;
> > >>>
> > >>> use Bio::SeqIO;
> > >>>
> > >>> my $inputFilename = "input.gbff";
> > >>> my $outputFilename = "output.gbff";
> > >>>
> > >>> my $in  = Bio::SeqIO->new(-file   => $inputFilename,
> > >>>                           -format => "genbank");
> > >>> my $out = Bio::SeqIO->new(-file => ">$outputFilename",
> > >>>                           -format => "genbank");
> > >>>
> > >>> my $sequence = $in->next_seq();
> > >>> $out->write_seq($sequence);
> > >>> ====================================
> > >>>
> > >>> Output file
> > >>> ====================================
> > >>> LOCUS       MY_LOCUS                  10 aa            linear
> > >>> linear
> > >>> DEFINITION  my description.
> > >>> ACCESSION   12345
> > >>> KEYWORDS    .
> > >>> FEATURES             Location/Qualifiers
> > >>>      misc_feature    1..10
> > >>>                      /foo=""
> > >>> ORIGIN
> > >>>         1 atggagaact
> > >>> //
> > >>> ====================================
> > >>>
> > >>> I'll add this to bugzilla, but first I want to make sure
> > >>> I'm not missing something obvious.
> > >>>
> > >>> Scott
> > >
> > > --
> > > Scott Markel, Ph.D.
> > > Principal Bioinformatics Architect  email:  smarkel at scitegic.com
> > > SciTegic Inc.                       mobile: +1 858 205 3653
> > > 9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
> > > San Diego, CA 92123                 fax:    +1 858 279 8804
> > > USA                                 web:    http://www.scitegic.com
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> > >
> > >
> > --
> > ----------------------------------------------------------
> > : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> > ----------------------------------------------------------
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------




More information about the Bioperl-l mailing list