[Open-bio-l] LSIDs

Fri Apr 4 09:46:37 EST 2003

On Fri, Apr 04, 2003 at 12:36:30AM -0800, Brian King wrote:
> > 	Comments are needed on this one!! Please read
> > below!
>  
> What is the purpose of format/alphabet in the proposed
> format?
> 
> URN:LSID:open-bio.org:<format>/<alphabet>
>  
> If it is only to make unique strings, then it's
> (almost) harmless.  But if the content of the field
> alters client processing, it would be a mistake.
> Meta-data should not be encoded in the IDs.  Client
> applications should treat the IDs as being
> structureless, and get the meta-data elsewhere.  There
> are a few reasons for this, but it comes down to
> assuring that the ID system remains stable and
> reliable.

These IDs aren't really identifying any single piece
of data -- they are metadata themselves, telling a client
about how to parse a stream.  The closest analogy might be
a MIME type.  Is this usage incompatible with the LSID
spec?  If so, perhaps we should define format identifiers in
another namespace instead, for example:

     URN:format:open-bio.org:fasta?alphabet=DNA
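
To make the two spellings concrete, here is a rough sketch of how a
client might pull the format and alphabet out of either one.  Nothing
here is settled: the "format" namespace is only the suggestion above
(it is not a registered URN namespace), the function name
parse_format_id is made up for illustration, and "fasta/dna" is just
an assumed instance of the <format>/<alphabet> pattern.

```python
from urllib.parse import parse_qs


def parse_format_id(urn):
    """Split a format identifier into (authority, format, alphabet).

    Hypothetical helper: handles only the two spellings discussed in
    this thread, not LSIDs in general.
    """
    # Expect four colon-separated fields: URN, namespace, authority, rest.
    scheme, namespace, authority, rest = urn.split(":", 3)
    if scheme.upper() != "URN":
        raise ValueError("not a URN: %r" % urn)

    ns = namespace.lower()
    if ns == "lsid":
        # URN:LSID:open-bio.org:<format>/<alphabet>
        fmt, _, alphabet = rest.partition("/")
    elif ns == "format":
        # URN:format:open-bio.org:fasta?alphabet=DNA
        fmt, _, query = rest.partition("?")
        alphabet = parse_qs(query).get("alphabet", [""])[0]
    else:
        raise ValueError("unknown namespace: %r" % namespace)

    # An absent alphabet is reported as None rather than "".
    return authority, fmt, alphabet or None


if __name__ == "__main__":
    print(parse_format_id("URN:format:open-bio.org:fasta?alphabet=DNA"))
    print(parse_format_id("URN:LSID:open-bio.org:fasta/dna"))
```

Note that this is exactly the kind of structure-in-the-ID parsing the
objection above is about, which is part of why a separate namespace
for format tags may be the cleaner choice.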

> Is the intention to publish genbank and embl data
> under the "open-bio.org" authority?  This would be
> against the intention of LSIDs.  The third field in
> LSID syntax is the "authority" field, where authority
> is the owner, producer, manager of the data.  Genbank
> data should have an NCBI identifier in the authority
> field, etc.  The authority field is independent of
> data location.  It defeats the goal of having
> universally unique identifiers when different servers
> publish the same data using different authority
> fields.

No, I absolutely agree that genbank entries shouldn't
be published under the open-bio.org authority.  This
is just a tag which identifies the genbank (or whatever)
file format.  Arguably, that should be ``owned'' by NCBI,
too.  But it needs to live somewhere, otherwise applications
won't be able to handle this data without guessing types.
Ugh.
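
As a sketch of what "not guessing" could look like on the client
side: the format tag is treated as an opaque key into a table of
parsers, much as a MIME type would be.  The registry, the function
names, and the toy FASTA reader below are all assumptions for
illustration, not an existing open-bio API.

```python
# Hypothetical registry mapping format identifiers to parser functions.
PARSERS = {}


def register(format_id):
    """Decorator that files a parser under an opaque format identifier."""
    def wrap(func):
        PARSERS[format_id] = func
        return func
    return wrap


@register("URN:format:open-bio.org:fasta?alphabet=DNA")
def parse_fasta_dna(lines):
    """Toy FASTA reader: returns a list of (header, sequence) tuples."""
    records = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif line and records:
            records[-1][1] += line
    return [tuple(rec) for rec in records]


def parse_stream(format_id, lines):
    """Pick a parser by exact identifier match; never sniff the content."""
    try:
        parser = PARSERS[format_id]
    except KeyError:
        raise LookupError("no parser registered for %s" % format_id)
    return parser(lines)


if __name__ == "__main__":
    tag = "URN:format:open-bio.org:fasta?alphabet=DNA"
    print(parse_stream(tag, [">seq1", "ACGT", "TTGA"]))
```

The lookup is an exact string match, so the client never has to
interpret the inside of the identifier.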

     Thomas.