[Bioperl-l] possible bug printing GenBank feature qualifiers

Scott Markel smarkel at scitegic.com
Fri Mar 31 01:17:50 UTC 2006


In our upgrade from BioPerl 1.4 to 1.5.1 we tripped over the
following.

Annotation tag values used by Bio::SeqIO::FTHelper were plain strings
and are now Bio::Annotation::SimpleValue objects.  In the
_print_GenBank_FTHelper subroutine of Bio::SeqIO::genbank the following
code still assumes that the values are strings.

    foreach my $tag ( keys %{$fth->field} ) {
        foreach my $value ( @{$fth->field->{$tag}} ) {
            $value =~ s/\"/\"\"/g;

If the tag value is zero, an empty string is written.

We think that

            $value = $value->{"value"};

should be added before the s/// call.
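
To make the idea concrete, here is a minimal, self-contained sketch of the
unwrapping step.  It does not use BioPerl: MockSimpleValue is a hypothetical
stand-in that assumes the value object is a blessed hash with a "value" key
and a value() accessor, and qualifier_text is a made-up helper name, not
anything in Bio::SeqIO::genbank.  The sketch calls the accessor instead of
reaching into the hash, and leaves plain strings untouched so either kind of
value works.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in for a value object: a blessed hash with a
# "value" key and a value() accessor.  This is NOT the real BioPerl class.
package MockSimpleValue;
sub new   { my ($class, $v) = @_; return bless { value => $v }, $class }
sub value { return $_[0]->{value} }

package main;

# Hypothetical helper: unwrap the qualifier value if it is an object,
# then double any embedded quotes, as the GenBank writer snippet does.
sub qualifier_text {
    my ($value) = @_;
    $value = $value->value if blessed($value) && $value->can('value');
    $value =~ s/"/""/g;
    return $value;
}

print qualifier_text(MockSimpleValue->new(0)), "\n";   # object holding zero
print qualifier_text('say "hi"'), "\n";                # plain string
```

With the stand-in class, the zero survives as "0" and the plain string comes
back with its quotes doubled.  Whether the real object behaves the same way
would still need to be checked against BioPerl 1.5.1 itself.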

Here's our test case.  Note that the qualifier value for "foo"
is changed to an empty string.

Input file

====================================
LOCUS       MY_LOCUS                  10 aa            linear   UNK
DEFINITION  my description.
ACCESSION   12345
FEATURES             Location/Qualifiers
      misc_feature    1..10
                      /foo="0"
ORIGIN
         1 atggagaact
//
====================================

Perl code
====================================
use strict;
use warnings;

use Bio::SeqIO;

my $inputFilename = "input.gbff";
my $outputFilename = "output.gbff";

my $in  = Bio::SeqIO->new(-file   => $inputFilename,
                           -format => "genbank");
my $out = Bio::SeqIO->new(-file => ">$outputFilename",
                           -format => "genbank");

my $sequence = $in->next_seq();
$out->write_seq($sequence);
====================================

Output file
====================================
LOCUS       MY_LOCUS                  10 aa            linear   linear
DEFINITION  my description.
ACCESSION   12345
KEYWORDS    .
FEATURES             Location/Qualifiers
      misc_feature    1..10
                      /foo=""
ORIGIN
         1 atggagaact
//
====================================

I'll add this to Bugzilla, but first I want to make sure
I'm not missing something obvious.

Scott

-- 
Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at scitegic.com
SciTegic Inc.                       mobile: +1 858 205 3653
9665 Chesapeake Drive, Suite 401    voice:  +1 858 279 8800, ext. 253
San Diego, CA 92123                 fax:    +1 858 279 8804
USA                                 web:    http://www.scitegic.com





