[Open-bio-l] LSIDs
Thomas Down
td2 at sanger.ac.uk
Fri Apr 4 09:46:37 EST 2003
On Fri, Apr 04, 2003 at 12:36:30AM -0800, Brian King wrote:
> > Comments are needed on this one!! Please read
> > below!
>
> What is the purpose of format/alphabet in the proposed
> format?
>
> URN:LSID:open-bio.org:<format>/<alphabet>
>
> If it is only to make unique strings, then it's
> (almost) harmless. But if the content of the field
> alters client processing, it would be a mistake.
> Meta-data should not be encoded in the IDs. Client
> applications should treat the IDs as being
> structureless, and get the meta-data elsewhere. There
> are a few reasons for this, but it comes down to
> assuring that the ID system remains stable and
> reliable.
These IDs aren't really identifying any single piece
of data -- they are metadata themselves, telling a client
about how to parse a stream. Closest analogy might be
a MIME type. Is this usage incompatible with the LSID
spec? If so, we should maybe define format identifiers in
another namespace instead:
URN:format:open-bio.org:fasta?alphabet=DNA
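
To make the MIME-type analogy concrete, here is a rough sketch of how a client might take such a format identifier apart. The URN:format: syntax is only the proposal floated above, not part of any spec, and the function name parse_format_urn is made up for illustration.

```python
from urllib.parse import parse_qs


def parse_format_urn(urn):
    """Split a hypothetical format URN into (authority, format_name, params).

    Assumed syntax (a proposal only, not a standard):
        URN:format:<authority>:<format>[?key=value&...]
    """
    # Separate the optional "?key=value" parameters from the URN proper.
    head, _, query = urn.partition("?")
    parts = head.split(":")
    if len(parts) != 4 or parts[0].lower() != "urn" or parts[1].lower() != "format":
        raise ValueError("not a format URN: %r" % urn)
    authority, format_name = parts[2], parts[3]
    # parse_qs returns a list of values per key; keep the first one.
    params = {key: values[0] for key, values in parse_qs(query).items()}
    return authority, format_name, params


if __name__ == "__main__":
    print(parse_format_urn("URN:format:open-bio.org:fasta?alphabet=DNA"))
```

Unlike a true LSID, which a client should treat as opaque, an identifier in this namespace would be meant to be parsed, much as a MIME type is.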
> Is the intention to publish genbank and embl data
> under the "open-bio.org" authority? This would be
> against the intention of LSIDs. The third field in
> LSID syntax is the "authority" field, where authority
> is the owner, producer, manager of the data. Genbank
> data should have an NCBI identifier in the authority
> field, etc. The authority field is independent of
> data location. It defeats the goal of having
> universally unique identifiers when different servers
> publish the same data using different authority
> fields.
No, I absolutely agree that genbank entries shouldn't
be published under the open-bio.org authority. This
is just a tag which identifies the genbank (or whatever)
file format. Arguably, that should be ``owned'' by NCBI,
too. But it needs to live somewhere, otherwise applications
won't be able to handle this data without guessing types.
Ugh.
Thomas.