[Bioperl-l] AnnotationCollectionI and SeqFeatureI changes

Hilmar Lapp hlapp at gmx.net
Sat Nov 27 03:15:51 EST 2004


So, after checking out some pieces of code, AnnotatableI in particular, 
I'm not convinced that this transparency is achieved already; some more 
thoughts will have to be spent.

E.g., get_all_tags() will now return all annotation keys ever added, 
regardless of whether through $feat->primary_tag, $feat->source_tag, 
$feat->annotation->add_Annotation, or $feat->add_tag_value, unless any 
of these methods is overridden.

This is very different from what it used to be, namely only return 
those keys added through $feat->add_tag_value. Anybody who made any 
assumption that what you didn't add through add_tag_value you'll not 
get back through get_all_tags will be hosed. I.e., anybody who assumed 
that $feat->get_all_tags and $feat->annotation->get_all_annotation_keys 
hold distinct sets of values will be hosed. This includes bioperl-db, 
for those who care.

It is also not what SeqFeature::Generic will do now, namely not return 
the primary_tag and source_tag keys.

Allen, you changed the t/AnnotationAdaptor.t test in order to make it 
pass after the changes, and what you changed wasn't fixing a bug. This 
is cheating, quite frankly - you removed the evidence that the changes 
are not transparent. I suggest you roll back AnnotationAdaptor.t to the 
previous version and see that it passes the way it was.

Or this entire change is rolled back from the release trunk and ironed 
out on a branch first. I'm really not convinced that releasing a 
one-week old substantial change to some of the core modules to the 
general public is the best way of minimizing avoidable frustration on 
the users' end. Call me too conservative, but given previous fiascos 
with prematurely released code in bioperl I just can't help but feel 
this may be too risky an undertaking for a release that was originally 
fancied to supplant 1.4.

	-hilmar

On Friday, November 26, 2004, at 10:38  PM, Hilmar Lapp wrote:

> My worry was that to anybody who
>
> 	1) treated objects returned by $seq->get_SeqFeatures() as 
> implementing SeqFeatureI,
> 	2) assumed that instantiating SeqFeature::Generic would give him a 
> SeqFeatureI compliant object,
> 	3) wrote her own SeqFeatureI compliant object and treated it as such
>
> these changes should be fully transparent. By fully transparent I mean 
> that first the respective code should be no less correct than it was 
> before, and that the code should produce the exact same results, 
> including what is sent to standard error. By a SeqFeatureI compliant 
> object I mean an object that IS-A Bio::SeqFeatureI as it was defined 
> before those changes, with all calls to the previously defined 
> functions being perfectly legal. Also, a class that I wrote before 
> those changes to implement SeqFeatureI should now still implement 
> SeqFeatureI without me having to make any changes.
>
> If this is satisfied then I have no worries to this end. If this is 
> not satisfied then I think it needs to go on a branch or at least stay 
> out of 1.5. I don't understand the implications of Allen's comments 
> below enough to make the judgement as to whether transparency is 
> provided; Allen will need to make this call himself I'm afraid. 
> Unfortunately, I think we've never written tests thorough enough that 
> would let us infer this from everything passing or not. For instance, 
> if you change SeqFeatureI and SeqFeatureI::Generic simultaneously, 
> there is no way of telling whether your SeqFeatureI changes are 
> transparent.
>
> Beyond this, I do agree with Chris that there could be a performance 
> issue with richly annotated databanks. Tag values were simple strings; 
> using objects for all of them instead at feature creation and 
> population time may be too heavy. Also Chris, I thought the XML idea 
> sounded very interesting; did you have Data::Stag in mind to manage 
> it? It would still require a working XML parser installation, no?
>
> 	-hilmar
>
> On Wednesday, November 24, 2004, at 05:58  AM, Aaron J. Mackey wrote:
>
>>
>> Allen, thanks a ton for going the extra mile.  Hilmar, does this 
>> solution satisfy your worries a bit?
>>
>> Thanks again to everyone,
>>
>> -Aaron
>>
>> On Nov 23, 2004, at 9:16 PM, Allen Day wrote:
>>
>>> Fixed.  Here is a summary of what I did to make this happen.  I went 
>>> ahead
>>> and did the work necessary to make Bio::SeqFeatureI AnnotatableI 
>>> instead
>>> of being itself an AnnotationCollectionI.
>>>
>>> . Bio::SeqFeatureI inherits Bio::AnnotatableI NOT
>>>   Bio::AnnotationCollectionI
>>> . *_tag_* methods are in Bio::AnnotatableI, and internally defer to
>>>   Bio::AnnotatableI->annotation->some_analagous_mapped_function()
>>>   . method behavior is now more similar to original *_tag_* method
>>>     behavior ; tag "values" are now instantiated as
>>>     Bio::Annotation::SimpleValue objects by default, unless their 
>>> name
>>>     indicates they should be otherwise (e.g. tag name "comment" or
>>>     "dblink")
>>> . deprecation warnings commented until 1.6
>>> . Bio::AnnotatableI now keeps a tag->annotation_type registry to 
>>> allow
>>>   new tags to be created (see Bio::SeqFeature::AnnotationAdaptor).
>>>   . Bio::SeqFeature::AnnotationAdaptor is now not very useful, as 
>>> *_tag_*
>>>     methods map directly onto Bio::AnnotationI's
>>>     Bio::AnnotationCollectionI instance.
>>> . Unflattener and Unflattener2 tests pass with no changes.
>>> . All tests pass.
>>>
>>> -Allen
>>>
>>>
>>> On Tue, 23 Nov 2004, Chris Mungall wrote:
>>>
>>>>
>>>> Unflattener.t is failing because someone has messed up 
>>>> get_tagset_values()
>>>> - this is a convenience method I originally added to SeqFeatureI. 
>>>> I'm not
>>>> familiar enough with the new changes and AnnotationCollections to 
>>>> fix
>>>> this.
>>>>
>>>> Surely the onus has always been on the person making changes to 
>>>> make sure
>>>> the test suite passes before committing their changes? In which 
>>>> case, how
>>>> did these changes make it in in the first place?
>>>>
>>>> On Tue, 23 Nov 2004, Jason Stajich wrote:
>>>>
>>>>>
>>>>> On Nov 23, 2004, at 4:47 PM, Allen Day wrote:
>>>>>
>>>>>> On Tue, 23 Nov 2004, Jason Stajich wrote:
>>>>>>
>>>>>>> I think if we just don't issue deprecation warnings it will be 
>>>>>>> fine by
>>>>>>> me -- even if we are just calling the new subroutine under the 
>>>>>>> hood.
>>>>>>> Tests seem to pass although Unflattner.t is falling over today 
>>>>>>> not
>>>>>>> sure
>>>>>>> what is problem.
>>>>>>
>>>>>> that fails for me too, in addition to spewing out lots of
>>>>>> diagnotistics.
>>>>>> however, if you run 'make test_Unflattener2', it passes.  strange.
>>>>>>
>>>>> no it is Unflattner not Unflattner2
>>>>>
>>>>> % make test_Unflattener
>>>>> [SNIP OUT SOME STUFF]
>>>>>
>>>>> -------------------- WARNING ---------------------
>>>>> MSG: get_tagset_values() is deprecated.  use get_Annotations()
>>>>> ---------------------------------------------------
>>>>>
>>>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>>>> MSG: Abstract method "Bio::AnnotationCollectionI::get_Annotations" 
>>>>> is
>>>>> not implemented by package Bio::SeqFeature::Generic.
>>>>>
>>>>>
>>>>>> -allen
>>>>>>
>>>>>>>
>>>>>>> -jason
>>>>>>> On Nov 23, 2004, at 2:28 PM, Aaron J. Mackey wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>> On Friday, November 19, 2004, at 02:50  PM, Allen Day wrote:
>>>>>>>>>
>>>>>>>>>> * Bio::SeqFeatureI now ISA Bio::AnnotationCollectionI
>>>>>>>>>> * All Bio::SeqFeatureI *_tag_* methods have been moved to
>>>>>>>>>>   Bio::AnnotationCollectionI, marked as deprecated, and 
>>>>>>>>>> mapped to
>>>>>>>>>> their
>>>>>>>>>>   analogous and mostly pre-existing Bio::AnnotationCollectionI
>>>>>>>>>> methods.
>>>>>>>>>>
>>>>>>>>>>   Methods which were not in Bio::AnnotationCollectionI, but 
>>>>>>>>>> were i
>>>>>>>>>>   Bio::Annotation::Collection and were necessary for *_tag_* 
>>>>>>>>>> method
>>>>>>>>>>   remapping were created in Bio::AnnotationCollecitonI.
>>>>>>>>
>>>>>>>> I've been paying some attention to this, but thought that the 
>>>>>>>> changes
>>>>>>>> were only those required to get Bio::FeatureIO working (i.e.
>>>>>>>> recapitulate GFF3 logic) without hampering object usage; do our 
>>>>>>>> tests
>>>>>>>> pass with these changes in place?
>>>>>>>>
>>>>>>>> On Nov 23, 2004, at 2:12 PM, Jason Stajich wrote:
>>>>>>>>
>>>>>>>>> it has not been tagged yet.  I think Aaron is just really busy 
>>>>>>>>> on
>>>>>>>>> this front.
>>>>>>>>
>>>>>>>> I did tag the HEAD at RC1, so we could branch from there if we 
>>>>>>>> needed
>>>>>>>> to; if this is really the big bug-bear that Hilmar and Jason are
>>>>>>>> claiming, then I'd ask Allen to retract his patches that alter
>>>>>>>> interface definitions, and branch.
>>>>>>>>
>>>>>>>> And I was so hoping to get RC2 packaged up later today ...
>>>>>>>>
>>>>>>>> -Aaron
>>>>>>>>
>>>>>>>> --
>>>>>>>> Aaron J. Mackey, Ph.D.
>>>>>>>> Dept. of Biology, Goddard 212
>>>>>>>> University of Pennsylvania       email:  amackey at pcbi.upenn.edu
>>>>>>>> 415 S. University Avenue         office: 215-898-1205
>>>>>>>> Philadelphia, PA  19104-6017     fax:    215-746-6697
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>>> Jason Stajich
>>>>>>> jason.stajich at duke.edu
>>>>>>> http://www.duke.edu/~jes12/
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioperl-l mailing list
>>>>>>> Bioperl-l at portal.open-bio.org
>>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at portal.open-bio.org
>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>
>>>>>>
>>>>> --
>>>>> Jason Stajich
>>>>> jason.stajich at duke.edu
>>>>> http://www.duke.edu/~jes12/
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at portal.open-bio.org
>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>
>>>
>>>
>> --
>> Aaron J. Mackey, Ph.D.
>> Dept. of Biology, Goddard 212
>> University of Pennsylvania       email:  amackey at pcbi.upenn.edu
>> 415 S. University Avenue         office: 215-898-1205
>> Philadelphia, PA  19104-6017     fax:    215-746-6697
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
> -- 
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------




More information about the Bioperl-l mailing list