[Bioperl-l] feature additional tag order in GFF

Jason Stajich jason@cgt.mc.duke.edu
Fri, 18 Oct 2002 09:43:35 -0400 (EDT)


Because of the way Bio::DB::GFF gets its info about a feature's name, it
seems best to put the 'Sequence' or 'Target' field first when loading GFF
data into the db, so that it can determine the name of the feature
properly. (I'd also be happy to implement it so that the GFF code tries
to pull the name from a Target/Sequence field first before falling back
on the first field for that info. Thoughts, Lincoln?)

Currently, when writing out GFF strings for a feature
(SeqFeature::Generic), we do foreach my $tag ( $feature->all_tags ) { .. }.
The order of the tags returned by all_tags comes straight from
keys %{$self->{_internal_hash}}, so it cannot be relied on to follow
either insert order or name order.
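
To illustrate the ordering problem in isolation, here is a minimal sketch
in plain Perl. The %tags hash and the tag names are made-up stand-ins for
the feature's internal hash, not the actual SeqFeature::Generic
internals.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the feature's internal tag hash.
my %tags;
my @inserted = qw(Target bitscore percentid note);
$tags{$_} = 1 for @inserted;

# keys() returns the tags in Perl's internal hash order, which is not
# guaranteed to match insertion order or alphabetical order.
my @hash_order = keys %tags;

# Sorting gives a repeatable order, though not necessarily the one a
# GFF loader wants (for example, Target first).
my @sorted = sort keys %tags;

print "inserted: @inserted\n";
print "keys:     @hash_order\n";
print "sorted:   @sorted\n";
```

Sorting makes the output reproducible, but it still does not let the
caller choose which tag comes first.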

Would it make sense to provide a get/set method
my @tags = $feature->tag_order( \@tagnames )
which allows us to specify the tag order? (If it has not been set, we
fall back on the old way of just grabbing the hash keys.)
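
A rough sketch of what such a get/set method could look like, assuming a
simple hash-based object. My::Feature, its internal keys, and the sample
tag values are hypothetical and only stand in for SeqFeature::Generic;
this is not the existing BioPerl implementation.

```perl
use strict;
use warnings;

package My::Feature;    # hypothetical stand-in class

sub new {
    my ($class) = @_;
    return bless { _tags => {}, _tag_order => undef }, $class;
}

sub add_tag_value {
    my ($self, $tag, $value) = @_;
    push @{ $self->{_tags}{$tag} }, $value;
    return;
}

# Tags in whatever order the hash happens to return them.
sub all_tags {
    my ($self) = @_;
    return keys %{ $self->{_tags} };
}

# Get/set: store a copy of the requested order if one is passed in;
# return the stored order, or fall back to all_tags if none was set.
sub tag_order {
    my ($self, $order) = @_;
    $self->{_tag_order} = [@$order] if defined $order;
    return @{ $self->{_tag_order} } if defined $self->{_tag_order};
    return $self->all_tags;
}

package main;

my $feature = My::Feature->new;
$feature->add_tag_value( bitscore => 120 );
$feature->add_tag_value( Target   => 'EST1' );
$feature->tag_order( [qw(Target bitscore)] );

my @tags = $feature->tag_order;
print "@tags\n";
```

Copying the array ref on set means later changes to the caller's list do
not silently change the feature's output order.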

This has the side effect of allowing one to NOT specify all of the tags in
that tag_order list, which would effectively let one filter the tags
that are output. That could be a good thing: with the blast2gff script I
am finishing off, bitscore, percentid, fraction_identical, etc. are all
stored as tags in the SeqFeature::Similarity object, and we may not
really want to see all of them as additional tags in the GFF extra tag
section.
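
The filtering effect can be sketched as follows. The gff_attributes
helper, the sample values, and the simplified "tag value ; tag value"
layout are assumptions made for illustration; real GFF output has more
quoting and formatting rules than shown here.

```perl
use strict;
use warnings;

# Build a simplified attribute string from a hash of tag => [values].
# If an order list is given, only those tags are written, in that
# order; otherwise every tag is written in hash order.
sub gff_attributes {
    my ($tags, $order) = @_;
    my @names = $order ? @$order : keys %$tags;
    my @parts;
    for my $name (@names) {
        next unless exists $tags->{$name};    # skip tags the feature lacks
        push @parts, map { "$name $_" } @{ $tags->{$name} };
    }
    return join ' ; ', @parts;
}

# Made-up similarity tags, as a blast2gff-style script might store them.
my %tags = (
    Target    => ['EST1'],
    bitscore  => [120],
    percentid => [98.5],
);

# percentid is left out of the order list, so it is not written.
my $attributes = gff_attributes( \%tags, [qw(Target bitscore)] );
print "$attributes\n";
```

Whether leaving a tag out should drop it silently, as here, or append it
after the ordered tags is a design choice worth settling before this
goes into the core object.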

Thoughts?  I've implemented a local tag_order to serve my own purpose; I'm
just curious whether people think this is a sensible, broadly applicable
method.

-jason
-- 
Jason Stajich
Duke University
jason at cgt.mc.duke.edu