[Bioperl-l] struggling with Bio::FeatureIO and Bio::SeqFeature::Annotated

Hilmar Lapp hlapp at gmx.net
Sat Jan 29 18:19:25 EST 2005


On Tuesday, January 25, 2005, at 01:45  AM, Allen Day wrote:

>>
>> Also, do you think it will be possible to convert the 
>> Bio::SeqFeature::Annotated features into persistent ones so that 
>> these can be stored in BioSQL ? I'll try to test that out today.
>
> no idea.  my guess is not without substantial effort.
>

There shouldn't be a problem serializing them, unless 
SeqFeature::Annotated does not implement SeqFeatureI.

The problem is rather that you will get them out in a slightly 
different fashion.

Provided my understanding of SeqFeature::Annotated is correct (which it 
may not be!), primary_tag and source_tag are treated (stored) like any 
other tag, unlike in SeqFeature::Generic, whose primary_tag and 
source_tag methods store their values separately.

So, upon retrieval of such a feature you would probably have the 
primary_tag and source_tag values in the tag/value system as well. This 
may or may not be an issue.
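
The difference between the two storage styles can be sketched with a toy 
model. This is Python, not BioPerl code: the classes GenericLike and 
AnnotatedLike and the key names "primary_tag" and "source_tag" are 
made-up placeholders, and the second class only mirrors the behaviour 
assumed above for SeqFeature::Annotated, which may be wrong.

```python
class GenericLike:
    """Stand-in for the SeqFeature::Generic style: dedicated slots plus tags."""

    def __init__(self, primary_tag, source_tag):
        self.primary_tag = primary_tag  # kept outside the tag store
        self.source_tag = source_tag    # kept outside the tag store
        self.tags = {}

    def all_tags(self):
        return sorted(self.tags)


class AnnotatedLike:
    """Stand-in for the ASSUMED SeqFeature::Annotated style: one shared store."""

    def __init__(self, primary_tag, source_tag):
        # Hypothetical key names; primary and source live among the other tags.
        self.tags = {"primary_tag": [primary_tag], "source_tag": [source_tag]}

    def all_tags(self):
        return sorted(self.tags)


generic = GenericLike("exon", "genscan")
annotated = AnnotatedLike("exon", "genscan")
for feat in (generic, annotated):
    feat.tags.setdefault("note", []).append("predicted")

generic_tags = generic.all_tags()      # only the ordinary tag
annotated_tags = annotated.all_tags()  # ordinary tag plus primary and source
print(generic_tags)
print(annotated_tags)
```

In the second case anything that walks the tag list, such as a 
persistence layer, would see the primary and source values as ordinary 
tags.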

Furthermore, SeqFeature::Annotated does away with the split between 
tag/value pairs and the annotation bundle and stores everything in the 
latter. Bioperl-db uses SeqFeature::AnnotationAdaptor to access a 
feature's tags and annotations as if there were only an annotation 
bundle. That is what SeqFeature::Annotated does too, but 
AnnotationAdaptor assumes that the underlying SeqFeatureI 
implementation stores them separately. The result is that when you plug 
a SeqFeature::Annotated into SeqFeature::AnnotationAdaptor, every 
tag/value pair may be reported both by the plugged feature's 
get_tag_values() and by its annotation->get_Annotations() method, which 
may lead to redundant storage (and retrieval).

So at worst you may get duplication of all tag/value pairs for a 
feature.
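
A toy model of how that duplication could arise, again in Python rather 
than BioPerl. The classes SeparateFeature and SharedFeature and the 
helpers add_tag() and adaptor_view() are hypothetical; adaptor_view() 
only imitates an adaptor that merges tags and annotations without 
checking for overlap, and says nothing certain about what 
AnnotationAdaptor really does.

```python
class SeparateFeature:
    """Tags and annotation bundle are separate stores (what the adaptor expects)."""

    def __init__(self):
        self._tags = {}
        self._annotation = {}

    def get_tag_values(self):
        return [(k, v) for k, vals in self._tags.items() for v in vals]

    def get_annotations(self):
        return [(k, v) for k, vals in self._annotation.items() for v in vals]


class SharedFeature(SeparateFeature):
    """Tags are a view onto the annotation bundle (the assumed Annotated style)."""

    def __init__(self):
        super().__init__()
        self._tags = self._annotation  # one store behind both accessors


def add_tag(feature, key, value):
    feature._tags.setdefault(key, []).append(value)


def adaptor_view(feature):
    """Merge both sources without de-duplication, as a naive adaptor would."""
    return feature.get_tag_values() + feature.get_annotations()


separate, shared = SeparateFeature(), SharedFeature()
add_tag(separate, "note", "predicted")
add_tag(shared, "note", "predicted")

separate_view = adaptor_view(separate)  # the pair shows up once
shared_view = adaptor_view(shared)      # the same pair shows up twice
print(separate_view)
print(shared_view)
```

If the merged view is what gets written to the database, the second 
feature ends up with every pair stored twice.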

If you retrieve features directly (instead of getting them 
automatically attached to a sequence you retrieved), then you may even 
be able to circumvent this problem by providing a SeqFeatureI factory 
that instantiates SeqFeature::Annotated instead of SeqFeature::Generic 
(which is the default). Bioperl-db will again set the tag/value 
properties through the AnnotationAdaptor, but if the plugged feature is 
a SeqFeature::Annotated instance, that may take care of the duplication, 
because a redundant set operation will probably overwrite the previous 
one (everything being stored in the annotation bundle).
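
The factory idea can be sketched the same way, as a Python toy and not 
the bioperl-db API. load_feature(), the row layout and both classes are 
invented for illustration, and the overwrite-on-set behaviour is exactly 
the untested assumption made above.

```python
class GenericLike:
    """Default-style feature: tag store and annotation store are separate."""

    def __init__(self):
        self.tags = {}
        self.annotation = {}

    def set_value(self, kind, key, value):
        store = self.tags if kind == "tag" else self.annotation
        store[key] = value  # setting an existing key overwrites it

    def stored_entries(self):
        return len(self.tags) + len(self.annotation)


class AnnotatedLike(GenericLike):
    """Assumed Annotated style: a single store behind both accessors."""

    def __init__(self):
        self.annotation = {}
        self.tags = self.annotation

    def stored_entries(self):
        return len(self.annotation)  # only one store to count


def load_feature(rows, factory):
    """Rebuild a feature from stored rows using whatever the factory creates."""
    feature = factory()
    for kind, key, value in rows:
        feature.set_value(kind, key, value)
    return feature


# The same pair was stored twice, once as a tag and once as an annotation.
rows = [("tag", "note", "predicted"), ("annotation", "note", "predicted")]

generic_count = load_feature(rows, GenericLike).stored_entries()
annotated_count = load_feature(rows, AnnotatedLike).stored_entries()
print(generic_count, annotated_count)
```

With the default-style class the redundant row survives as a second 
entry; with the single-store class the second set lands on the same key 
and the duplicate collapses.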

Bottom line is, provided SeqFeature::Annotated implements SeqFeatureI, 
it will be stored; the result may just have some redundancy in the 
annotation and tags. To know exactly what happens, someone would need 
to step through it in a debugger, which I think nobody has done yet.

Also, if I'm wrong w.r.t. SeqFeature::Annotated's behaviour, any 
education from its authors will be welcome ...

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------



