[Open-bio-l] Re: [Bioperl-l] seq namespace method

Lincoln Stein lstein@cshl.org
Mon, 15 Jul 2002 16:48:00 -0400


The proposed IdentifierI should be a data quartet providing the following 
methods:

	authority()	# a domain name, for LSID compatibility
	namespace()	# namespace within the authority
	object_id()	# the id
	version()	# object version number

It should provide an lsid_string() method that produces the following format:

	authority:namespace:object_id

and a common_name() method that produces:

	namespace:object_id.version

(this allows for the common notation SP:12345.6 that lots of people use).
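
As a rough illustration of the quartet and its two string forms, here is a 
minimal Python sketch. It is not the Bioperl interface itself: the class name 
Identifier, the authority "example.org", and the choice to drop the 
".version" suffix when no version is set are all assumptions made for the 
example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identifier:
    """Illustrative stand-in for the proposed IdentifierI quartet."""
    authority: str                  # a domain name, for LSID compatibility
    namespace: str                  # namespace within the authority
    object_id: str                  # the id
    version: Optional[int] = None   # object version number (optional here by assumption)

    def lsid_string(self) -> str:
        # authority:namespace:object_id
        return f"{self.authority}:{self.namespace}:{self.object_id}"

    def common_name(self) -> str:
        # namespace:object_id.version; the suffix is omitted when there is
        # no version (an assumption, the proposal does not cover this case)
        base = f"{self.namespace}:{self.object_id}"
        return base if self.version is None else f"{base}.{self.version}"


sp = Identifier("example.org", "SP", "12345", 6)   # hypothetical authority
print(sp.lsid_string())   # example.org:SP:12345
print(sp.common_name())   # SP:12345.6
```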

On top of this, there is a need for a NameCollectionI interface that supports 
a collection of names attached to the object.  The operations it supports 
are:

	- assign multiple IdentifierI objects to the same object, each with a
		distinct namespace
	- given an authority and namespace, retrieve the IdentifierI objects
		that match
	- support integrity checks that enforce cardinality rules for particular
		namespaces (e.g., no more than one SwissProt ID allowed, but
		several GenBank IDs allowed)
	- support unique names across a set of objects
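
The operations listed above could be sketched roughly as follows in Python. 
This is only one possible reading, not the SOAP module mentioned below: the 
names NameCollection, max_per_namespace, and registry are hypothetical, as is 
the decision to enforce cross-object uniqueness through a shared set.

```python
from collections import namedtuple

# Minimal stand-in for an IdentifierI quartet (hypothetical name).
Identifier = namedtuple("Identifier", "authority namespace object_id version")


class NameCollection:
    """Illustrative collection of identifiers attached to one object."""

    def __init__(self, max_per_namespace=None, registry=None):
        self._ids = []
        # Cardinality rules: {(authority, namespace): maximum count}
        self._max = dict(max_per_namespace or {})
        # Optional set shared by several collections, so that a name
        # can be attached to only one object in the set.
        self._registry = registry

    def lookup(self, authority, namespace):
        """Return the identifiers matching an authority and namespace."""
        return [i for i in self._ids
                if (i.authority, i.namespace) == (authority, namespace)]

    def add(self, ident):
        key = (ident.authority, ident.namespace)
        limit = self._max.get(key)
        if limit is not None and len(self.lookup(*key)) >= limit:
            raise ValueError(f"at most {limit} identifier(s) allowed in {key}")
        if self._registry is not None:
            name = key + (ident.object_id,)
            if name in self._registry:
                raise ValueError(f"name {name} is already in use")
            self._registry.add(name)
        self._ids.append(ident)


shared = set()
names = NameCollection({("example.org", "SP"): 1}, registry=shared)
names.add(Identifier("example.org", "SP", "12345", 6))
names.add(Identifier("example.org", "GB", "X1", 1))
names.add(Identifier("example.org", "GB", "X2", 1))   # several allowed here
```

With these rules, adding a second "SP" identifier to names raises 
ValueError, and so does adding GB X1 to another collection that shares the 
same registry.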

I have a module that does this and also handles name splits, merges, and 
version updates.  It is implemented as a SOAP service.  Shall we talk about 
making it Bioperl compatible?

Lincoln


On Monday 15 July 2002 04:23 am, Steve Chervitz wrote:
> The namespace concept is useful, only I think that the correct place for it
> is at the level of the identifier, not the sequence object, because a
> namespace applies to a name, not a sequence.
>
> Bioperl doesn't encapsulate sequence identifiers with objects, but doing so
> would help manage the sequence-identifier relationship and also make it
> easier to work with identifiers in general.
>
> A Bio::Identifier object could have slots for: namespace, type, version,
> id, and perhaps is_unique. It could have a method to stringify itself
> with a specified delimiter and with or without namespace/version/type info.
>
> Identifiable objects such as sequences could have methods that returned
> Bio::Identifiers such as: all_identifiers(), preferred_identifier().
> Perhaps there could be a Bio::IdentifiableI interface for this.
>
> Identifiable objects would not have to store Bio::Identifiers internally.
> They could construct them on the fly (perhaps via an associated
> Bio::Factory:: object).
>
> This would model the object-to-identifier relationship more generally than
> we do now. For example, PrimarySeqs can have display_id, primary_id, and
> accession_number, which I always find a bit confusing/limiting.
>
> I know this is more of a substantial undertaking than just adding a
> namespace() method to PrimarySeqI, but could be worth the effort. What do
> you think?
>
> Steve
>
> --- Hilmar Lapp <hlapp@gnf.org> wrote:
> > According to BioSQL, sequences (bioentries) live in a namespace, e.g.,
> > the name of the databank that maintains and/or serves them.
> >
> > None of the Bio:: seq objects/interfaces have a method for that.
> >
> > I propose to add one, specifically to the lowest level Bio::PrimarySeqI
> > (bioentries are pretty general, and a namespace is needed for /any and
> > all/ bioentries). To me, the namespace doesn't have to do much with
> > whether this seq is going to be stored in BioSQL or not. A sequence with
> > an accession number has (implicitly or explicitly) a namespace in which
> > this accession number is valid. PrimarySeqI has an accession.
> >
> > Does anyone have other suggestions or objections?
> >
> > 	-hilmar
> > --
> > -------------------------------------------------------------
> > Hilmar Lapp                            email: lapp at gnf.org
> > GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> > -------------------------------------------------------------
> >
>
> =====
> Steve Chervitz
> sac@bioperl.org
>

-- 
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein@cshl.org			                  Cold Spring Harbor, NY
========================================================================