[Bioperl-l] struggling with Bio::FeatureIO and Bio::SeqFeature::Annotated
Hilmar Lapp
hlapp at gmx.net
Sat Jan 29 18:19:25 EST 2005
On Tuesday, January 25, 2005, at 01:45 AM, Allen Day wrote:
>>
>> Also, do you think it will be possible to convert the
>> Bio::SeqFeature::Annotated features into persistent ones so that
>> these can be stored in BioSQL ? I'll try to test that out today.
>
> no idea. my guess is not without substantial effort.
>
There shouldn't be a problem serializing them unless
SeqFeature::Annotated does not implement SeqFeatureI.
The problem is rather that you will get them out in a slightly
different fashion.
Provided my understanding of SeqFeature::Annotated is correct (which it
may not be!), all tags are treated (stored) the same as any others,
unlike in SeqFeature::Generic, which has methods primary_tag and
source_tag that store their values separately.
So, upon retrieval of such a feature you would probably have the
primary_tag and source_tag values in the tag/value system as well. This
may or may not be an issue.
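To make that concrete, here is a toy sketch in Python. It is not BioPerl code: the class names GenericLike and AnnotatedLike and the method stored_tags are made up for illustration, and the Annotated-like behaviour is only my assumed reading of the class, which may be wrong.

```python
# Toy model only: none of these names are BioPerl API.

class GenericLike:
    """Keeps primary_tag and source_tag in dedicated slots, apart from other tags."""

    def __init__(self, primary_tag, source_tag, tags=None):
        self.primary_tag = primary_tag
        self.source_tag = source_tag
        self.tags = dict(tags or {})

    def stored_tags(self):
        # Only the free-form tags end up in the tag/value store.
        return dict(self.tags)


class AnnotatedLike:
    """Assumed behaviour: primary_tag and source_tag are ordinary entries."""

    def __init__(self, primary_tag, source_tag, tags=None):
        self.bundle = dict(tags or {})
        self.bundle["primary_tag"] = primary_tag
        self.bundle["source_tag"] = source_tag

    def stored_tags(self):
        # Everything, including primary_tag and source_tag, is one store.
        return dict(self.bundle)


generic = GenericLike("exon", "genscan", {"note": "predicted"})
annotated = AnnotatedLike("exon", "genscan", {"note": "predicted"})

print(sorted(generic.stored_tags()))    # ['note']
print(sorted(annotated.stored_tags()))  # ['note', 'primary_tag', 'source_tag']
```

If the second kind of object is what gets serialized, the two extra keys travel along with the ordinary tags and come back the same way.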
Furthermore, SeqFeature::Annotated does away with the split between
tag/value pairs and the annotation bundle and stores everything in the
latter. Bioperl-db uses SeqFeature::AnnotationAdaptor to access a
feature's tags and annotations as if there were only an annotation
bundle, which is what SeqFeature::Annotated does too, but
AnnotationAdaptor assumes that the underlying SeqFeatureI
implementation stores them separately. The result is that when you plug
a SeqFeature::Annotated into SeqFeature::AnnotationAdaptor, every
tag/value pair may be reported by both the plugged feature's
get_tag_values() and its annotation->get_Annotations() methods, which
may lead to redundant storage (and retrieval).
So at worst you may get duplication of all tag/value pairs for a
feature.
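The duplication scenario can be sketched with a toy model in Python. Again this is not the BioPerl implementation: SeparateStoreFeature, SingleStoreFeature, adaptor_view and demo are hypothetical names, and the single-store class only imitates what I assume SeqFeature::Annotated does.

```python
# Toy model only: these classes imitate the behaviour described above,
# they are not the BioPerl implementation.

class SeparateStoreFeature:
    """Tags and annotations live in two different containers."""

    def __init__(self):
        self._tags = []         # list of (key, value) pairs
        self._annotations = []  # list of (key, value) pairs

    def add_tag_value(self, key, value):
        self._tags.append((key, value))

    def add_annotation(self, key, value):
        self._annotations.append((key, value))

    def tag_pairs(self):
        return list(self._tags)

    def annotation_pairs(self):
        return list(self._annotations)


class SingleStoreFeature(SeparateStoreFeature):
    """Assumed Annotated-like behaviour: tags are kept in the annotation bundle."""

    def add_tag_value(self, key, value):
        self._annotations.append((key, value))

    def tag_pairs(self):
        return list(self._annotations)


def adaptor_view(feature):
    """Merge both views, assuming (as the adaptor does) that they never overlap."""
    return feature.tag_pairs() + feature.annotation_pairs()


def demo(feature):
    feature.add_tag_value("note", "predicted")
    feature.add_annotation("dblink", "EXAMPLE:1")
    return adaptor_view(feature)


separate = demo(SeparateStoreFeature())
single = demo(SingleStoreFeature())
print(len(separate), len(single))  # 2 4
```

With separate stores the merged view has one entry per pair; with a single store every pair shows up twice, which is the worst case described above.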
If you retrieve features directly (instead of automatically as those
attached to the sequence you retrieved), then you may even be able to
circumvent this problem by providing a SeqFeatureI factory that
instantiates SeqFeature::Annotated instead of SeqFeature::Generic
(which is the default). Bioperl-db will again set the tag/value
properties through the AnnotationAdaptor, but if the plugged feature is
a SeqFeature::Annotated instance, that may take care of the duplication
because a redundant set operation will probably overwrite the previous
one (because everything is stored in the annotation bundle).
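Whether this works out hinges on the "probably": it depends on whether a repeated set replaces or accumulates. A minimal Python sketch of the two possibilities follows; replay_sets and its overwrite flag are hypothetical, and which of the two SeqFeature::Annotated actually does is exactly the part that has not been verified.

```python
def replay_sets(pairs, overwrite):
    """Apply a sequence of set operations to one shared store."""
    store = {}
    for key, value in pairs:
        if overwrite:
            # A later set replaces whatever was stored under the key.
            store[key] = [value]
        else:
            # Values accumulate, so a repeated set leaves a duplicate.
            store.setdefault(key, []).append(value)
    return store


# The same pair arrives twice: once as a "tag", once as an "annotation".
incoming = [("note", "predicted"), ("note", "predicted")]

print(replay_sets(incoming, overwrite=True))   # {'note': ['predicted']}
print(replay_sets(incoming, overwrite=False))  # {'note': ['predicted', 'predicted']}
```

Note that in this toy model the overwriting variant would also collapse a tag that legitimately has several values, so overwriting is not automatically the safe outcome either.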
Bottom line: provided SeqFeature::Annotated implements SeqFeatureI, it
will be stored; the result may just have some redundancy in the
annotation and tags. To know exactly, it would need to be debugged,
which I think nobody has done yet.
Also, if I'm wrong w.r.t. SeqFeature::Annotated's behaviour, any
education from its authors will be welcome ...
-hilmar
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------