[Bioperl-l] possible bug printing GenBank feature qualfiers
Hilmar Lapp
hlapp at gmx.net
Fri Mar 31 18:22:30 UTC 2006
Scott,
your fix assumes that $value in reality is not a scalar but a hash ref
and that it has a key "value".
Apparently in your test environment this is all indeed true, but there
is no guarantee that this will still be true tomorrow when you next
update from CVS (or install a new version).
It seems to me that making feature tag values Bio::AnnotationI objects
and the stringification overload is what is interfering here. More
specifically, the broken overload in Bio::Annotation::SimpleValue
use overload '""' => sub { $_[0]->value || ''};
will lead exactly to the behavior you see (b/c $_[0]->value evaluates
to false if the value is '0').
You say you build and populate the feature dynamically - are you using
Bio::SeqFeature::Annotated for this? Bio::SeqFeature::Generic is slated
to get this behavior reverted, i.e., will return to using scalars for
tag values. (Or so I recall ...)
To fix the problem for you now, I suggest you either fix the overload
statement above to be
use overload '""' => sub { defined($_[0]->value) ? $_[0]->value : '' };
I suppose this should in fact be committed to the repository - does
anybody see any damage from this change?
Or, if you do want to mess with the GenBank format writer, protect the
conversion to string and use the object access method:
if (ref($value) && $value->isa("Bio::Annotation::SimpleValue")) {
# convert SimpleValue object to represented (string) value
$value = $value->value;
}
Hth,
-hilmar
On Mar 31, 2006, at 9:31 AM, Scott Markel wrote:
> Chris,
>
> Looks like I made my test case too simple. In our application,
> which calls BioPerl, I'm creating the feature with the zero-
> valued qualifier. It's not being read in from a file, so
> my only issue is with writing GenBank files. The real feature
> is one for a primer binding site. The qualifier contains the
> number of mismatches. The one line change of
>
> $value = $value->{"value"}
>
> definitely fixes our problem and causes no regression
> failures in our application.
>
> Scott
>
> Chris Fields wrote:
>
>> I tried this on WinXP (I'm using bioperl-live) and got a warning:
>>
>> -------------------- WARNING ---------------------
>> MSG: Unexpected error in feature table for Skipping feature,
>> attempting to
>> recover
>> ---------------------------------------------------
>>
>> Running using debugging shows that no feature key was found in
>> _read_FTHelper_GenBank. So I'm getting an error, but on input not
>> output.
>> In fact, turning on -verbose in the SeqIO input object gives the
>> below extra
>> output, whereas turning -verbose on only in the output object just
>> gives the
>> warning above.
>>
>> ====================================
>> C:\Perl\Scripts\gb_test>test.pl
>> no feature key!
>>
>> -------------------- WARNING ---------------------
>> MSG: Unexpected error in feature table for Skipping feature,
>> attempting to
>> recover
>> STACK Bio::SeqIO::genbank::next_seq
>> C:\Perl\src\bioperl\core/Bio\SeqIO\genbank.pm:583
>> STACK toplevel C:\Perl\Scripts\gb_test\test.pl:18
>> sequence length is 10
>> ====================================
>>
>> The sequence came back w/o any features in the feature table, which
>> is what
>> I would expect from this error:
>> ====================================
>> LOCUS MY_LOCUS 10 aa linear linear
>> DEFINITION my description.
>> ACCESSION 12345
>> KEYWORDS .
>> FEATURES Location/Qualifiers
>> ORIGIN
>> 1 atggagaact
>> //
>> ====================================
>>
>> Adding the extra line before the s/// didn't help any (warning still
>> pops
>> up, no change in output). Anybody out there with any ideas?
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>> -----Original Message-----
>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>>> bounces at lists.open-bio.org] On Behalf Of Scott Markel
>>> Sent: Thursday, March 30, 2006 7:18 PM
>>> To: bioperl-l at lists.open-bio.org
>>> Subject: [Bioperl-l] possible bug printing GenBank feature qualfiers
>>>
>>> In our upgrade from BioPerl 1.4 to 1.5.1 we tripped over the
>>> following.
>>>
>>> Annotation tags used by Bio::SeqIO::FTHelper were strings and
>>> are now Bio::Annotation::SimpleValue. In the _print_GenBank_FTHelper
>>> subroutine of Bio::SeqIO::genbank the following code still
>>> assumes that tags are strings.
>>>
>>> foreach my $tag ( keys %{$fth->field} ) {
>>> foreach my $value ( @{$fth->field->{$tag}} ) {
>>> $value =~ s/\"/\"\"/g;
>>>
>>> If the tag value was a zero, an empty string is written.
>>>
>>> We think that
>>>
>>> $value = $value->{"value"};
>>>
>>> should be added before the s/// call.
>>>
>>> Here's our test case. Note that the qualifier value for "foo"
>>> is changed to an empty string.
>>>
>>> Input file
>>>
>>> ====================================
>>> LOCUS MY_LOCUS 10 aa linear UNK
>>> DEFINITION my description.
>>> ACCESSION 12345
>>> FEATURES Location/Qualifiers
>>> misc_feature 1..10
>>> /foo="0"
>>> ORIGIN
>>> 1 atggagaact
>>> //
>>> ====================================
>>>
>>> Perl code
>>> ====================================
>>> use strict;
>>> use warnings;
>>>
>>> use Bio::SeqIO;
>>>
>>> my $inputFilename = "input.gbff";
>>> my $outputFilename = "output.gbff";
>>>
>>> my $in = Bio::SeqIO->new(-file => $inputFilename,
>>> -format => "genbank");
>>> my $out = Bio::SeqIO->new(-file => ">$outputFilename",
>>> -format => "genbank");
>>>
>>> my $sequence = $in->next_seq();
>>> $out->write_seq($sequence);
>>> ====================================
>>>
>>> Output file
>>> ====================================
>>> LOCUS MY_LOCUS 10 aa linear
>>> linear
>>> DEFINITION my description.
>>> ACCESSION 12345
>>> KEYWORDS .
>>> FEATURES Location/Qualifiers
>>> misc_feature 1..10
>>> /foo=""
>>> ORIGIN
>>> 1 atggagaact
>>> //
>>> ====================================
>>>
>>> I'll add this to bugzilla, but first I want to make sure
>>> I'm not missing something obvious.
>>>
>>> Scott
>
> --
> Scott Markel, Ph.D.
> Principal Bioinformatics Architect email: smarkel at scitegic.com
> SciTegic Inc. mobile: +1 858 205 3653
> 9665 Chesapeake Drive, Suite 401 voice: +1 858 279 8800, ext. 253
> San Diego, CA 92123 fax: +1 858 279 8804
> USA web: http://www.scitegic.com
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------
More information about the Bioperl-l
mailing list