[Bioperl-l] possible bug printing GenBank feature qualfiers

Chris Fields cjfields at uiuc.edu
Fri Mar 31 19:35:41 UTC 2006


Sorry about that; stupid Outlook sent my mail before I had a chance to
finish it up.

The Bio::Annotation::SimpleValue fix sounds best, but the problem is that CVS
already shows a fix to this line by Heikki, committed after 1.5.1 was released:

       fix to allow 0 values despite operator overload (Paul Mooney)

which changed the overload to:

	 use overload '""' => sub { $_[0]->value};

I'll try out your fix here to see if it breaks anything (can't see why it
would), but I may need to dig through the archives a little to see why this
latest change was made.  If everything works and passes tests I'll roll back
the commit I made to Bio::SeqIO::genbank earlier today.
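
For comparison, here is a minimal sketch of how the three overload variants
mentioned in this thread stringify a zero value.  The Demo::* classes and
render() are hypothetical stand-ins written for illustration only; they are
not BioPerl code, and they just mimic a value() accessor plus the '""'
overload.

```perl
use strict;
use warnings;

# Hypothetical base class: stores a value and exposes it via value().
package Demo::Base;
sub new   { my ($class, $v) = @_; return bless { value => $v }, $class }
sub value { return $_[0]->{value} }

# Variant quoted below as broken: '0' is false, so it becomes ''.
package Demo::OrEmpty;
our @ISA = ('Demo::Base');
use overload '""' => sub { $_[0]->value || '' };

# Variant currently in CVS: returns the value unchanged.
package Demo::Plain;
our @ISA = ('Demo::Base');
use overload '""' => sub { $_[0]->value };

# Variant Hilmar suggests: only undef becomes ''.
package Demo::Defined;
our @ISA = ('Demo::Base');
use overload '""' => sub { defined($_[0]->value) ? $_[0]->value : '' };

package main;

# Interpolate the object into a qualifier line, roughly as a writer would.
sub render {
    my ($class, $v) = @_;
    my $obj = $class->new($v);
    return qq{/foo="$obj"};
}

print "$_: ", render($_, 0), "\n"
    for qw(Demo::OrEmpty Demo::Plain Demo::Defined);
```

In this sketch only the "|| ''" variant loses the zero.  The plain variant
keeps it too, but it hands back undef when no value is set, which is where
the defined() check would differ.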

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp
> Sent: Friday, March 31, 2006 12:23 PM
> To: Scott Markel
> Cc: bioperl-l at lists.open-bio.org; Chris Fields
> Subject: Re: [Bioperl-l] possible bug printing GenBank feature qualfiers
> 
> Scott,
> 
> your fix assumes that $value in reality is not a scalar but a hash ref
> and that it has a key "value".
> 
> Apparently in your test environment this is all indeed true, but there
> is no guarantee that this will still be true tomorrow when you next
> update from CVS (or install a new version).
> 
> It seems to me that making feature tag values Bio::AnnotationI objects
> and the stringification overload is what is interfering here. More
> specifically, the broken overload in Bio::Annotation::SimpleValue
> 
> 	use overload '""' => sub { $_[0]->value || ''};
> 
> will lead exactly to the behavior you see (b/c $_[0]->value evaluates
> to false if the value is '0').
> 
> You say you build and populate the feature dynamically - are you using
> Bio::SeqFeature::Annotated for this? Bio::SeqFeature::Generic is slated
> to get this behavior reverted, i.e., will return to using scalars for
> tag values. (Or so I recall ...)
> 
> To fix the problem for you now, I suggest you either fix the overload
> statement above to be
> 
> 	use overload '""' => sub { defined($_[0]->value) ? $_[0]->value : '' };
> 
> I suppose this should in fact be committed to the repository - does
> anybody see any damage from this change?
> 
> Or, if you do want to mess with the GenBank format writer, protect the
> conversion to string and use the object access method:
> 
> 	if (ref($value) && $value->isa("Bio::Annotation::SimpleValue")) {
> 		# convert SimpleValue object to represented (string) value
> 		$value = $value->value;
> 	}
> 
> Hth,
> 
> 	-hilmar
> 
> On Mar 31, 2006, at 9:31 AM, Scott Markel wrote:
> 
> > Chris,
> >
> > Looks like I made my test case too simple.  In our application,
> > which calls BioPerl, I'm creating the feature with the zero-
> > valued qualifier.  It's not being read in from a file, so
> > my only issue is with writing GenBank files.  The real feature
> > is one for a primer binding site.  The qualifier contains the
> > number of mismatches.  The one line change of
> >
> >      $value = $value->{"value"}
> >
> > definitely fixes our problem and causes no regression
> > failures in our application.
> >
> > Scott
> >
> > Chris Fields wrote:
> >
> >> I tried this on WinXP (I'm using bioperl-live) and got a warning:
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: Unexpected error in feature table for  Skipping feature, attempting to recover
> >> ---------------------------------------------------
> >>
> >> Running with debugging shows that no feature key was found in
> >> _read_FTHelper_GenBank.  So I'm getting an error, but on input, not
> >> output.  In fact, turning on -verbose in the SeqIO input object gives
> >> the extra output below, whereas turning -verbose on only in the output
> >> object just gives the warning above.
> >>
> >> ====================================
> >> C:\Perl\Scripts\gb_test>test.pl
> >> no feature key!
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: Unexpected error in feature table for  Skipping feature, attempting to recover
> >> STACK Bio::SeqIO::genbank::next_seq C:\Perl\src\bioperl\core/Bio\SeqIO\genbank.pm:583
> >> STACK toplevel C:\Perl\Scripts\gb_test\test.pl:18
> >> sequence length is 10
> >> ====================================
> >>
> >> The sequence came back w/o any features in the feature table, which is
> >> what I would expect from this error:
> >> ====================================
> >> LOCUS       MY_LOCUS                  10 aa            linear   linear
> >> DEFINITION  my description.
> >> ACCESSION   12345
> >> KEYWORDS    .
> >> FEATURES             Location/Qualifiers
> >> ORIGIN
> >>         1 atggagaact
> >> //
> >> ====================================
> >>
> >> Adding the extra line before the s/// didn't help any (warning still
> >> pops up, no change in output).  Anybody out there with any ideas?
> >>
> >> Christopher Fields
> >> Postdoctoral Researcher - Switzer Lab
> >> Dept. of Biochemistry
> >> University of Illinois Urbana-Champaign
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Scott Markel
> >>> Sent: Thursday, March 30, 2006 7:18 PM
> >>> To: bioperl-l at lists.open-bio.org
> >>> Subject: [Bioperl-l] possible bug printing GenBank feature qualfiers
> >>>
> >>> In our upgrade from BioPerl 1.4 to 1.5.1 we tripped over the following.
> >>>
> >>> Annotation tags used by Bio::SeqIO::FTHelper were strings and
> >>> are now Bio::Annotation::SimpleValue.  In the _print_GenBank_FTHelper
> >>> subroutine of Bio::SeqIO::genbank the following code still
> >>> assumes that tags are strings.
> >>>
> >>>    foreach my $tag ( keys %{$fth->field} ) {
> >>>        foreach my $value ( @{$fth->field->{$tag}} ) {
> >>>            $value =~ s/\"/\"\"/g;
> >>>
> >>> If the tag value was a zero, an empty string is written.
> >>>
> >>> We think that
> >>>
> >>>            $value = $value->{"value"};
> >>>
> >>> should be added before the s/// call.
> >>>
> >>> Here's our test case.  Note that the qualifier value for "foo"
> >>> is changed to an empty string.
> >>>
> >>> Input file
> >>>
> >>> ====================================
> >>> LOCUS       MY_LOCUS                  10 aa            linear   UNK
> >>> DEFINITION  my description.
> >>> ACCESSION   12345
> >>> FEATURES             Location/Qualifiers
> >>>      misc_feature    1..10
> >>>                      /foo="0"
> >>> ORIGIN
> >>>         1 atggagaact
> >>> //
> >>> ====================================
> >>>
> >>> Perl code
> >>> ====================================
> >>> use strict;
> >>> use warnings;
> >>>
> >>> use Bio::SeqIO;
> >>>
> >>> my $inputFilename = "input.gbff";
> >>> my $outputFilename = "output.gbff";
> >>>
> >>> my $in  = Bio::SeqIO->new(-file   => $inputFilename,
> >>>                           -format => "genbank");
> >>> my $out = Bio::SeqIO->new(-file => ">$outputFilename",
> >>>                           -format => "genbank");
> >>>
> >>> my $sequence = $in->next_seq();
> >>> $out->write_seq($sequence);
> >>> ====================================
> >>>
> >>> Output file
> >>> ====================================
> >>> LOCUS       MY_LOCUS                  10 aa            linear   linear
> >>> DEFINITION  my description.
> >>> ACCESSION   12345
> >>> KEYWORDS    .
> >>> FEATURES             Location/Qualifiers
> >>>      misc_feature    1..10
> >>>                      /foo=""
> >>> ORIGIN
> >>>         1 atggagaact
> >>> //
> >>> ====================================
> >>>
> >>> I'll add this to bugzilla, but first I want to make sure
> >>> I'm not missing something obvious.
> >>>
> >>> Scott
> >
> > --
> > Scott Markel, Ph.D.
> > Principal Bioinformatics Architect  email:  smarkel at scitegic.com
> > SciTegic Inc.                       mobile: +1 858 205 3653
> > 9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
> > San Diego, CA 92123                 fax:    +1 858 279 8804
> > USA                                 web:    http://www.scitegic.com
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> --
> ----------------------------------------------------------
> : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> ----------------------------------------------------------
> 



