[Bioperl-l] possible bug printing GenBank feature qualfiers
Chris Fields
cjfields at uiuc.edu
Fri Mar 31 19:35:41 UTC 2006
Sorry about that; stupid Outlook sent my mail before I had a chance to
finish it up.
The Bio::Annotation::Simple fix sounds best, but the problem is that CVS
shows a fix on this line by Heikki after 1.5.1 was released:
fix to allow 0 values despite operator overload (Paul Mooney)
which changed the overload to:
use overload '""' => sub { $_[0]->value};
I'll try out your fix here to see if it breaks anything (can't see why it
would), but I may need to dig through the archives a little to see why this
latest change was made. If everything works and passes tests I'll roll back
the commit I made to Bio::SeqIO::genbank earlier today.
Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp
> Sent: Friday, March 31, 2006 12:23 PM
> To: Scott Markel
> Cc: bioperl-l at lists.open-bio.org; Chris Fields
> Subject: Re: [Bioperl-l] possible bug printing GenBank feature qualfiers
>
> Scott,
>
> your fix assumes that $value in reality is not a scalar but a hash ref
> and that it has a key "value".
>
> Apparently in your test environment this is all indeed true, but there
> is no guarantee that this will still be true tomorrow when you next
> update from CVS (or install a new version).
>
> It seems to me that making feature tag values Bio::AnnotationI objects
> and the stringification overload is what is interfering here. More
> specifically, the broken overload in Bio::Annotation::SimpleValue
>
> use overload '""' => sub { $_[0]->value || ''};
>
> will lead exactly to the behavior you see (b/c $_[0]->value evaluates
> to false if the value is '0').
>
> You say you build and populate the feature dynamically - are you using
> Bio::SeqFeature::Annotated for this? Bio::SeqFeature::Generic is slated
> to get this behavior reverted, i.e., will return to using scalars for
> tag values. (Or so I recall ...)
>
> To fix the problem for you now, I suggest you either fix the overload
> statement above to be
>
> use overload '""' => sub { defined($_[0]->value) ? $_[0]->value : ''
> };
>
> I suppose this should in fact be committed to the repository - does
> anybody see any damage from this change?
>
> Or, if you do want to mess with the GenBank format writer, protect the
> conversion to string and use the object access method:
>
> if (ref($value) && $value->isa("Bio::Annotation::SimpleValue")) {
> # convert SimpleValue object to represented (string) value
> $value = $value->value;
> }
>
> Hth,
>
> -hilmar
>
> On Mar 31, 2006, at 9:31 AM, Scott Markel wrote:
>
> > Chris,
> >
> > Looks like I made my test case too simple. In our application,
> > which calls BioPerl, I'm creating the feature with the zero-
> > valued qualifier. It's not being read in from a file, so
> > my only issue is with writing GenBank files. The real feature
> > is one for a primer binding site. The qualifier contains the
> > number of mismatches. The one line change of
> >
> > $value = $value->{"value"}
> >
> > definitely fixes our problem and causes no regression
> > failures in our application.
> >
> > Scott
> >
> > Chris Fields wrote:
> >
> >> I tried this on WinXP (I'm using bioperl-live) and got a warning:
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: Unexpected error in feature table for Skipping feature,
> >> attempting to
> >> recover
> >> ---------------------------------------------------
> >>
> >> Running using debugging shows that no feature key was found in
> >> _read_FTHelper_GenBank. So I'm getting an error, but on input not
> >> output.
> >> In fact, turning on -verbose in the SeqIO input object gives the
> >> below extra
> >> output, whereas turning -verbose on only in the output object just
> >> gives the
> >> warning above.
> >>
> >> ====================================
> >> C:\Perl\Scripts\gb_test>test.pl
> >> no feature key!
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: Unexpected error in feature table for Skipping feature,
> >> attempting to
> >> recover
> >> STACK Bio::SeqIO::genbank::next_seq
> >> C:\Perl\src\bioperl\core/Bio\SeqIO\genbank.pm:583
> >> STACK toplevel C:\Perl\Scripts\gb_test\test.pl:18
> >> sequence length is 10
> >> ====================================
> >>
> >> The sequence came back w/o any features in the feature table, which
> >> is what
> >> I would expect from this error:
> >> ====================================
> >> LOCUS MY_LOCUS 10 aa linear linear
> >> DEFINITION my description.
> >> ACCESSION 12345
> >> KEYWORDS .
> >> FEATURES Location/Qualifiers
> >> ORIGIN
> >> 1 atggagaact
> >> //
> >> ====================================
> >>
> >> Adding the extra line before the s/// didn't help any (warning still
> >> pops
> >> up, no change in output). Anybody out there with any ideas?
> >>
> >> Christopher Fields
> >> Postdoctoral Researcher - Switzer Lab
> >> Dept. of Biochemistry
> >> University of Illinois Urbana-Champaign
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> >>> bounces at lists.open-bio.org] On Behalf Of Scott Markel
> >>> Sent: Thursday, March 30, 2006 7:18 PM
> >>> To: bioperl-l at lists.open-bio.org
> >>> Subject: [Bioperl-l] possible bug printing GenBank feature qualfiers
> >>>
> >>> In our upgrade from BioPerl 1.4 to 1.5.1 we tripped over the
> >>> following.
> >>>
> >>> Annotation tags used by Bio::SeqIO::FTHelper were strings and
> >>> are now Bio::Annotation::SimpleValue. In the _print_GenBank_FTHelper
> >>> subroutine of Bio::SeqIO::genbank the following code still
> >>> assumes that tags are strings.
> >>>
> >>> foreach my $tag ( keys %{$fth->field} ) {
> >>> foreach my $value ( @{$fth->field->{$tag}} ) {
> >>> $value =~ s/\"/\"\"/g;
> >>>
> >>> If the tag value was a zero, an empty string is written.
> >>>
> >>> We think that
> >>>
> >>> $value = $value->{"value"};
> >>>
> >>> should be added before the s/// call.
> >>>
> >>> Here's our test case. Note that the qualifier value for "foo"
> >>> is changed to an empty string.
> >>>
> >>> Input file
> >>>
> >>> ====================================
> >>> LOCUS MY_LOCUS 10 aa linear UNK
> >>> DEFINITION my description.
> >>> ACCESSION 12345
> >>> FEATURES Location/Qualifiers
> >>> misc_feature 1..10
> >>> /foo="0"
> >>> ORIGIN
> >>> 1 atggagaact
> >>> //
> >>> ====================================
> >>>
> >>> Perl code
> >>> ====================================
> >>> use strict;
> >>> use warnings;
> >>>
> >>> use Bio::SeqIO;
> >>>
> >>> my $inputFilename = "input.gbff";
> >>> my $outputFilename = "output.gbff";
> >>>
> >>> my $in = Bio::SeqIO->new(-file => $inputFilename,
> >>> -format => "genbank");
> >>> my $out = Bio::SeqIO->new(-file => ">$outputFilename",
> >>> -format => "genbank");
> >>>
> >>> my $sequence = $in->next_seq();
> >>> $out->write_seq($sequence);
> >>> ====================================
> >>>
> >>> Output file
> >>> ====================================
> >>> LOCUS MY_LOCUS 10 aa linear
> >>> linear
> >>> DEFINITION my description.
> >>> ACCESSION 12345
> >>> KEYWORDS .
> >>> FEATURES Location/Qualifiers
> >>> misc_feature 1..10
> >>> /foo=""
> >>> ORIGIN
> >>> 1 atggagaact
> >>> //
> >>> ====================================
> >>>
> >>> I'll add this to bugzilla, but first I want to make sure
> >>> I'm not missing something obvious.
> >>>
> >>> Scott
> >
> > --
> > Scott Markel, Ph.D.
> > Principal Bioinformatics Architect email: smarkel at scitegic.com
> > SciTegic Inc. mobile: +1 858 205 3653
> > 9665 Chesapeake Drive, Suite 401 voice: +1 858 279 8800, ext. 253
> > San Diego, CA 92123 fax: +1 858 279 8804
> > USA web: http://www.scitegic.com
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> --
> ----------------------------------------------------------
> : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> ----------------------------------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
More information about the Bioperl-l
mailing list