[Bioperl-l] Bio::DB::SeqFeature to GFF mishandles attributes with multiple values

Cook, Malcolm MEC at stowers-institute.org
Fri Feb 23 15:54:57 UTC 2007


Lincoln, and other Bio::DB::SeqFeature wanderers:

I find that generating GFF from a Bio::DB::SeqFeature using gff3_string
does not respect the following:
 
"Multiple attributes of the same type are indicated by separating the
values with the comma "," character"  (cf.
http://www.sequenceontology.org/gff3.shtml)
 
This one-liner demonstrates the problem:
 
perl -MBio::DB::SeqFeature -e 'print Bio::DB::SeqFeature->new(-seq_id =>
"J", -start => 1, -end => 2, -primary_tag => "PH", -source => "A",
-name => "mec", -attributes => {foo =>  [qw(bar blat)]})->gff3_string'
J	A	PH	1	2	.	.	.	foo=bar;foo=blat;Name=mec

Do you agree this is a problem? 
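
To illustrate one way this could bite a consumer of the file, here is a
minimal sketch. It assumes a reader that stores column 9 in a hash keyed
by tag; parse_attributes is a hypothetical helper written for this
example, not part of BioPerl, and I am not claiming any particular parser
behaves this way.

```perl
use strict;
use warnings;

# Hypothetical reader: split column 9 on ';' and '=', then split each
# value on ','. Unescaping of %XX sequences is omitted for brevity.
sub parse_attributes {
    my ($column9) = @_;
    my %attr;
    for my $pair (split /;/, $column9) {
        my ($tag, $value) = split /=/, $pair, 2;
        # A later pair with the same tag replaces the earlier one.
        $attr{$tag} = [ split /,/, $value ];
    }
    return \%attr;
}

my $repeated = parse_attributes('foo=bar;foo=blat;Name=mec');
my $joined   = parse_attributes('Name=mec;foo=bar,blat');

print "repeated tags: @{ $repeated->{foo} }\n";   # prints: repeated tags: blat
print "comma-joined:  @{ $joined->{foo} }\n";     # prints: comma-joined:  bar blat
```

With repeated tags such a reader silently keeps only the last value,
whereas the comma-separated form yields both.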
 
The fix is in the post-sig patch to
/Bio/DB/SeqFeature/NormalizedFeature.pm, in which I also took the
stylistic privilege of promoting any ID, Parent, or Name attribute to
the front of column 9, so output is now:

J	A	PH	1	2	.	.	.	Name=mec;foo=bar,blat

Do you agree this is better?
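
For anyone who wants to try the intended formatting without a BioPerl
checkout, the idea can be sketched standalone as below. This is only an
approximation under stated assumptions: escape_gff3 and format_attributes
are hypothetical names, the escaping is a simplified stand-in for the
module's own escape method, and the lead tags are emitted in the fixed
order ID, Parent, Name, which is not necessarily the order the patch
produces when more than one of them is defined.

```perl
use strict;
use warnings;

# Hypothetical helper, not BioPerl's escape(): percent-encode characters
# that are reserved in column 9 of GFF3, plus control characters.
sub escape_gff3 {
    my ($text) = @_;
    $text =~ s/([%;=&,\x00-\x1f\x7f])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Build column 9 from a list of tag => [values] pairs.
sub format_attributes {
    my (%attr) = @_;
    my %is_lead = map { $_ => 1 } qw(ID Parent Name);

    # ID, Parent and Name come first; the remaining tags are sorted only
    # to make the output deterministic.
    my @rest = grep { !$is_lead{$_} } keys %attr;
    my @tags = ((grep { exists $attr{$_} } qw(ID Parent Name)), sort @rest);

    my @pairs;
    for my $tag (@tags) {
        my @values = @{ $attr{$tag} };
        s/\s+$// for @values;    # trim trailing whitespace, as the patch does
        # One tag=value1,value2 entry per tag, not one entry per value.
        push @pairs,
            escape_gff3($tag) . '=' . join(',', map { escape_gff3($_) } @values);
    }
    return join ';', @pairs;
}

print format_attributes(Name => ['mec'], foo => [qw(bar blat)]), "\n";
# prints: Name=mec;foo=bar,blat
```

Each value is escaped before joining, so a literal comma inside a value
comes out as %2C and cannot be mistaken for a separator.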

I am poised to commit it, along with a functionally identical patch to
the equivalent function in Bio/Graphics/FeatureBase.pm.

All clear?

-- Malcolm Cook

  
 
*** NormalizedFeature.pm 2 Feb 2007 21:05:42 -0000 1.25
--- NormalizedFeature.pm 23 Feb 2007 15:37:01 -0000
***************
*** 481,494 ****
      next if $t eq 'load_id';
      next if $t eq 'parent_id';
      foreach (@values) { s/\s+$// } # get rid of trailing whitespace
! 
!     push @result,join '=',$self->escape($t),$self->escape($_) foreach @values;
    }
    my $id   = $self->primary_id;
    my $name = $self->display_name;
!   push @result,"ID=".$self->escape($id)                     if defined $id;
!   push @result,"Parent=".$self->escape($parent->primary_id) if defined $parent;
!   push @result,"Name=".$self->escape($name)                   if defined $name;
    return join ';',@result;
  }
  
--- 481,498 ----
      next if $t eq 'load_id';
      next if $t eq 'parent_id';
      foreach (@values) { s/\s+$// } # get rid of trailing whitespace
!     
!     #push @result,join '=',$self->escape($t),$self->escape($_) foreach @values;
!     # NO! Multiple attributes of the same type are indicated by
!     # separating the values with the comma "," character - per
!     # http://www.sequenceontology.org/gff3.shtml.  Do it this way:
!     push @result,join '=',$self->escape($t),join(',', map {$self->escape($_)} @values);
    }
    my $id   = $self->primary_id;
    my $name = $self->display_name;
!   unshift @result,"ID=".$self->escape($id)                     if defined $id;
!   unshift @result,"Parent=".$self->escape($parent->primary_id) if defined $parent;
!   unshift @result,"Name=".$self->escape($name)                   if defined $name;
    return join ';',@result;
  }
 





