[Bioperl-l] WARNING INCOMING: collection consolidation

Paul Edlefsen pedlefsen at systemsbiology.org
Thu Feb 27 11:25:39 EST 2003


Ewan Birney wrote:

>  Am I right in thinking that one of your classes is:
>
>
>Uniquely-Identifiable-Object-For-This-Implementation-but-not-exportable-ids
>
>and the other one is
>
>Uniquely-Identifiable-Object-For-Planet-Bioinformatics-and-so-exportable/queryable-ids
>
>If I am right, what are your object names? If I am wrong... can you
>enlighten me...?
>  
>
Yes, that's right.  They are called (and this could change if the will 
of the people desires it) LocallyIdentifiableI and 
GloballyIdentifiableI.  I would have called LocallyIdentifiableI just 
'IdentifiableI', but that's taken, so this will do.  It just has a 
'unique_id' method, which *must return undef if the object cannot provide 
a _unique_ identifier*.  The goal is to have something that the programmer 
can use instead of memory references to identify objects that are 
presently in use in a program.  So a SeqFeature's (or a RelRange's, etc.) 
seq_id might be the unique_id of a sequence.  If somebody is able to 
further guarantee that this unique_id is 
For-Planet-Bioinformatics-unique, great.  That's where 
GloballyIdentifiableI comes in.  My concern with the existing 
IdentifiableI interface was that not all objects are globally 
identifiable, but most are locally unique, so requiring that all 
identifiable things be globally identifiable ensures that most things 
won't implement IdentifiableI (or at least won't do so properly).
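
A rough sketch of the split described above, written in Python rather than 
Perl purely for illustration.  The class names Sequence and Feature are 
hypothetical, None stands in for Perl's undef, and having the global 
interface inherit from the local one is an assumption of this sketch, not 
something the interfaces are known to do.

```python
from abc import ABC, abstractmethod
from typing import Optional


class LocallyIdentifiable(ABC):
    """Stand-in for LocallyIdentifiableI: unique among objects in this program."""

    @abstractmethod
    def unique_id(self) -> Optional[str]:
        """Return a unique identifier, or None (Perl's undef) if there is none."""


class GloballyIdentifiable(LocallyIdentifiable):
    """Stand-in for GloballyIdentifiableI: same method, stronger promise.

    Assumption: modelled here as a subtype of the local interface.
    """


class Sequence(LocallyIdentifiable):
    """Hypothetical sequence class, for illustration only."""

    def __init__(self, unique_id: Optional[str] = None):
        self._unique_id = unique_id

    def unique_id(self) -> Optional[str]:
        # No identifier was supplied, so none can be guaranteed: return None.
        return self._unique_id


class Feature:
    """Hypothetical feature that names its sequence by id, not by reference."""

    def __init__(self, seq_id: Optional[str]):
        self.seq_id = seq_id


seq = Sequence("seq-1")
feat = Feature(seq_id=seq.unique_id())  # the feature stores only the id
anonymous = Sequence()                  # cannot provide a unique id
```

Here feat.seq_id equals seq.unique_id(), while anonymous.unique_id() is None, 
which is the "must return undef" rule in action.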

There are a couple of cans of worms that I don't want to open right now.

One is what globally identifiable thing to use.  The existing 
IdentifiableI uses LSIDs.  That's a fine thing and is one way in which 
an object might be able to provide a global identifier.  The new 
IdentifiableI ISA GloballyIdentifiableI and its unique_id method just 
returns the LSID string.  GloballyIdentifiableI has only unique_id, just 
like LocallyIdentifiableI, but documents the assertion that *this* 
unique_id will allow folks to look up the object in the 
Planet-Bioinformatics realm.  One reason for keeping it that minimal is 
that people might differ on their favorite type of global identifier, or 
different objects might require different sorts.  Particulars stay out 
of these interfaces.
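
To illustrate keeping particulars out of the interface, here is a small 
sketch, again in Python rather than Perl.  LsidIdentifiable plays the role 
of the new IdentifiableI, and UrlIdentifiable is a hypothetical second 
scheme.  The example identifiers are made up, and the method name 
lsid_string is just the name used in this sketch.

```python
from typing import Dict, Optional


class GloballyIdentifiable:
    """Stand-in for GloballyIdentifiableI; it says nothing about the id scheme."""

    def unique_id(self) -> Optional[str]:
        raise NotImplementedError


class LsidIdentifiable(GloballyIdentifiable):
    """Stand-in for the new IdentifiableI: the global id is the LSID string."""

    def __init__(self, lsid: Optional[str] = None):
        self._lsid = lsid

    def lsid_string(self) -> Optional[str]:
        return self._lsid

    def unique_id(self) -> Optional[str]:
        # unique_id simply delegates to the LSID.
        return self.lsid_string()


class UrlIdentifiable(GloballyIdentifiable):
    """Hypothetical alternative: some other kind of global identifier."""

    def __init__(self, url: Optional[str] = None):
        self._url = url

    def unique_id(self) -> Optional[str]:
        return self._url


def index_by_id(objects) -> Dict[str, GloballyIdentifiable]:
    """Index objects by unique_id, skipping any that cannot provide one."""
    return {o.unique_id(): o for o in objects if o.unique_id() is not None}


a = LsidIdentifiable("urn:lsid:example.org:sequences:12345")  # made-up LSID
b = UrlIdentifiable("http://example.org/sequences/67890")     # made-up URL
c = LsidIdentifiable()                                        # no id available
index = index_by_id([a, b, c])
```

The index_by_id helper only ever calls unique_id, so it works the same 
whichever identifier scheme an object happens to use.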

Another is "shouldn't *everything* be globally identifiable?"  Yes, I 
suppose everything should.  Again, I don't think that it really matters 
to the point at hand, which is that many classes in BioPerl presently 
have some field which is asserted by the documentation to be unique.  
The name of this field is different in different classes, and in some 
cases it gets confusing (I still don't understand the situation in 
SeqFeatureI -- maybe I'm thick?).  One problem at a time, I say; first 
we unify the existing notion of a unique identifier (which is, most 
often, not-necessarily-global), then we allow people to assert that some 
are indeed global, and then if the world community unifies around some 
global identifiers then maybe one day all of our objects will be 
GloballyIdentifiable.  That'd be awesome.

:Paul





