[Bioperl-l] identifier interface
Matthew Pocock
matthew_pocock@yahoo.co.uk
Wed, 17 Jul 2002 22:05:19 +0100
Hi Lincoln,
Sorry - I was probably not being crystal clear. I totally get that identifiers
can usefully be separate from locators, and completely agree that
resolution of identifiers to resources should be done by external code.
What I was saying is:
Do you want to make all identifiers in BioPerl conform to the LSID spec?
What if an ID provider (someone producing a BioPerl object with an
identifier) wants to use some other form of ID e.g.:
* database URN, table name, unique key
* some custom URN
* emboss ID
* LDAP path
These are just some silly ideas. No doubt real implementors will want to
use even funkier info to locate or uniquely identify their resources.
Nearly every effective naming scheme I have ever seen has been
hierarchical (like file paths, domain names, LDAP). So, does it make sense
to expect all ID providers to fit their identifying info into an
LSID-shaped object, or should the only contracts on Identifier be that:
a) they can be losslessly read from/written to a string so that you
can serialize them
b) when fed to the Identifier resolving machinery, they can be used
to retrieve the referent they identify
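
To make contract (a) concrete, here is a minimal sketch in Python (chosen
only for brevity; nothing here is BioPerl API). The names Identifier,
DbRowId, to_string and from_string are hypothetical, and the percent-encoded
"/"-separated string form is just one assumed serialization. Contract (b)
is left to the separate resolving machinery and is not shown.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from urllib.parse import quote, unquote


class Identifier(ABC):
    """Hypothetical contract: an identifier round-trips through a string."""

    @abstractmethod
    def to_string(self) -> str:
        """Serialize this identifier without losing information."""

    @classmethod
    @abstractmethod
    def from_string(cls, text: str) -> "Identifier":
        """Rebuild an identifier from the output of to_string()."""


@dataclass(frozen=True)
class DbRowId(Identifier):
    """Illustrative hierarchical ID: database URN, table name, unique key."""

    db_urn: str
    table: str
    key: str

    def to_string(self) -> str:
        # Percent-encode each part so a "/" inside a part cannot be
        # confused with the separator between parts.
        parts = (self.db_urn, self.table, self.key)
        return "/".join(quote(part, safe="") for part in parts)

    @classmethod
    def from_string(cls, text: str) -> "DbRowId":
        db_urn, table, key = (unquote(part) for part in text.split("/"))
        return cls(db_urn, table, key)
```

Any other provider (custom URN, LDAP path, ...) would only need to supply
its own pair of methods; nothing forces the fields into an LSID shape.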
The resolver can contain a hash or some code that effectively does the
switch/case/if_else on Identifier implementation class and uses the
appropriate factory to regenerate referents. It's no more scary than
resolving ftp, http, mail to different URL handler classes.
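
A rough sketch of that hash-based dispatch, again in Python purely for
illustration. Resolver, register, resolve, LsidId and LdapPathId are all
made-up names, the factories just return strings in place of real
referents, and the example ID values are invented.

```python
class LsidId:
    """Stand-in identifier class holding an LSID-style string."""

    def __init__(self, text):
        self.text = text


class LdapPathId:
    """Stand-in identifier class holding an LDAP-style path."""

    def __init__(self, text):
        self.text = text


class Resolver:
    """Maps each Identifier implementation class to a referent factory."""

    def __init__(self):
        self._factories = {}

    def register(self, id_class, factory):
        # The hash: identifier class -> callable that regenerates referents.
        self._factories[id_class] = factory

    def resolve(self, identifier):
        # Dispatch on the concrete class of the identifier.
        try:
            factory = self._factories[type(identifier)]
        except KeyError:
            name = type(identifier).__name__
            raise LookupError(f"no factory registered for {name}") from None
        return factory(identifier)


resolver = Resolver()
resolver.register(LsidId, lambda ident: f"LSID referent for {ident.text}")
resolver.register(LdapPathId, lambda ident: f"LDAP entry at {ident.text}")
```

Adding a new kind of ID then means registering one more class/factory
pair; the resolver itself does not need to know what the ID looks like.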
I'm just a bit worried that we are going to over-constrain what legal
identifiers are and make it hard for people to use the general framework
for something that is less than a 95% match to what we originally
envisioned IDs being.
Is that any clearer?
Matthew
Lincoln Stein wrote:
> Matt, could you clarify what you are asking?
>
> It is important to separate the concept of an identifier (and its correlates,
> the identifier namespace and the collection of identifiers), from the
> mechanism for resolving an identifier and regenerating its referent. The
> analogous situation is domain names, in which there is insufficient
> information to resolve a domain name into an IP address, and a separate
> protocol, the DNS, is called for. The nice thing about the LSID format is
> that the details of what goes into the object identifier field is left up to
> the naming authority, and so it can carry whatever information is necessary
> for the naming authority to resolve it.
>
> There is a separate protocol for resolving LSIDs into resources, which the I3C
> is working on. The draft that I looked at was pretty vague.
>
> Lincoln
>