[Bioperl-l] Bio::DB::SeqFeature to GFF mishandles attributes with multiple values
Cook, Malcolm
MEC at stowers-institute.org
Fri Feb 23 15:54:57 UTC 2007
Lincoln, and other Bio::DB::SeqFeature wanderers:
I find that generating GFF from a Bio::DB::SeqFeature using gff3_string
does not respect the following:
"Multiple attributes of the same type are indicated by separating the
values with the comma "," character" (cf.
http://www.sequenceontology.org/gff3.shtml)
This one-liner demonstrates the problem:
perl -MBio::DB::SeqFeature -e 'print Bio::DB::SeqFeature->new(-seq_id =>
"J", -start => 1, -end => 2, -primary_tag => "PH", -source => "A",
-name => "mec", -attributes => {foo => [qw(bar blat)]})->gff3_string'
J A PH 1 2 . . . foo=bar;foo=blat;Name=mec
Do you agree this is a problem?
The fix is in the post-sig patch to
/Bio/DB/SeqFeature/NormalizedFeature.pm, in which I also took the
stylistic privilege of promoting any ID, Parent, or Name attribute to
the front of column 9, so output is now:
J A PH 1 2 . . . Name=mec;foo=bar,blat
Do you agree this is better?
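
For anyone without a BioPerl checkout to hand, the intended column 9 formatting can be sketched standalone. This is only an illustration, not the patched method: escape_gff3 and format_attributes are hypothetical names, the escaping rule is a simplified stand-in for the feature's own escape(), and the ID, Parent, Name ordering is an arbitrary choice here (the unshift calls in the patch end up with Name first).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the feature's escape() method: percent-encode
# the characters GFF3 reserves in column 9, plus control characters.
sub escape_gff3 {
    my ($text) = @_;
    $text =~ s/([%;=&,\x00-\x1f\x7f])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Sort rank for a tag: ID, Parent and Name come first (assumed order),
# everything else follows alphabetically.
sub tag_rank {
    my ($tag) = @_;
    my %rank = (ID => 0, Parent => 1, Name => 2);
    return exists $rank{$tag} ? $rank{$tag} : 3;
}

# Build column 9 from a hash of tag => [values]. Each tag is emitted once,
# with its values escaped individually and joined by commas.
sub format_attributes {
    my ($attributes) = @_;
    my @tags = sort { tag_rank($a) <=> tag_rank($b) or $a cmp $b }
               keys %$attributes;

    my @result;
    for my $tag (@tags) {
        my @values = @{ $attributes->{$tag} };   # copy, so the caller's data is untouched
        next unless @values;
        s/\s+$// for @values;                    # trim trailing whitespace
        push @result,
            escape_gff3($tag) . '=' . join(',', map { escape_gff3($_) } @values);
    }
    return join ';', @result;
}

print format_attributes({ Name => ['mec'], foo => [qw(bar blat)] }), "\n";
# prints: Name=mec;foo=bar,blat
```

Escaping each value before the join matters: a literal comma inside a value comes out as %2C, so it cannot be mistaken for a value separator.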
I am poised to commit it, as well as the functionally identical patch to the
corresponding function in Bio/Graphics/FeatureBase.pm.
All clear?
-- Malcolm Cook
*** NormalizedFeature.pm 2 Feb 2007 21:05:42 -0000 1.25
--- NormalizedFeature.pm 23 Feb 2007 15:37:01 -0000
***************
*** 481,494 ****
next if $t eq 'load_id';
next if $t eq 'parent_id';
foreach (@values) { s/\s+$// } # get rid of trailing whitespace
!
! push @result,join '=',$self->escape($t),$self->escape($_) foreach @values;
}
my $id = $self->primary_id;
my $name = $self->display_name;
! push @result,"ID=".$self->escape($id) if defined $id;
! push @result,"Parent=".$self->escape($parent->primary_id) if defined $parent;
! push @result,"Name=".$self->escape($name) if defined $name;
return join ';',@result;
}
--- 481,498 ----
next if $t eq 'load_id';
next if $t eq 'parent_id';
foreach (@values) { s/\s+$// } # get rid of trailing whitespace
!
! #push @result,join '=',$self->escape($t),$self->escape($_) foreach @values;
! # NO! Multiple attributes of the same type are indicated by
! # separating the values with the comma "," character - per
! # http://www.sequenceontology.org/gff3.shtml. Do it this way:
! push @result,join '=',$self->escape($t),join(',', map {$self->escape($_)} @values);
}
my $id = $self->primary_id;
my $name = $self->display_name;
! unshift @result,"ID=".$self->escape($id) if defined $id;
! unshift @result,"Parent=".$self->escape($parent->primary_id) if defined $parent;
! unshift @result,"Name=".$self->escape($name) if defined $name;
return join ';',@result;
}