[Open-bio-l] Re: [Bioperl-l] seq namespace method
Lincoln Stein
lstein@cshl.org
Mon, 15 Jul 2002 16:48:00 -0400
The proposed IdentifierI should be a data quartet providing the following
methods:
authority() # a domain name for LSID compatibility
namespace() # namespace within the authority
object_id() # the id
version() # object version number
It should provide an lsid_string() method that produces the following format:
authority:namespace:object_id
and a common_name() method that produces:
namespace:object_id.version
(this allows for the common notation SP:12345.6 that lots of people use).
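To make the two string formats concrete, here is a minimal sketch of the quartet. It is written in Python rather than Perl purely for illustration; the class name Identifier and the authority "example.org" are assumptions, not part of the proposal. Only the four fields and the two formats come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """Hypothetical stand-in for the proposed IdentifierI quartet."""
    authority: str   # a domain name, for LSID compatibility
    namespace: str   # namespace within the authority
    object_id: str   # the id
    version: int     # object version number

    def lsid_string(self) -> str:
        # Format from the proposal: authority:namespace:object_id
        return f"{self.authority}:{self.namespace}:{self.object_id}"

    def common_name(self) -> str:
        # Format from the proposal: namespace:object_id.version
        return f"{self.namespace}:{self.object_id}.{self.version}"


# "example.org" is a placeholder authority chosen for this sketch.
sp = Identifier("example.org", "SP", "12345", 6)
print(sp.lsid_string())
print(sp.common_name())
```

With these values, common_name() yields the SP:12345.6 notation mentioned above.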
On top of this, there is a need for a NameCollectionI interface that supports
a collection of names attached to the object. The operations it supports
are:
- assign multiple IdentifierI's to the same object, each with a distinct
namespace
- given an authority & namespace, retrieve the IdentifierI's that match.
- support integrity checks that enforce cardinality rules for particular
namespaces (e.g., no more than one SwissProt id allowed, but
several Genbank Ids allowed).
- support unique names across a set of objects
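The first three operations could be sketched roughly as follows. This is an illustration in Python, not the module mentioned below; the names NameCollection, add(), find() and max_per_namespace are hypothetical, as are the authority and accession values. Uniqueness across a set of objects is left out, since it would need a registry shared between collections.

```python
from collections import defaultdict
from typing import NamedTuple


class Identifier(NamedTuple):
    """Hypothetical stand-in for IdentifierI (four fields from the proposal)."""
    authority: str
    namespace: str
    object_id: str
    version: int


class NameCollection:
    """Hypothetical stand-in for NameCollectionI: names attached to one object."""

    def __init__(self, max_per_namespace=None):
        # Maps (authority, namespace) -> maximum number of identifiers allowed.
        # A namespace with no entry is treated as unlimited.
        self._limits = dict(max_per_namespace or {})
        self._names = defaultdict(list)

    def add(self, ident):
        key = (ident.authority, ident.namespace)
        limit = self._limits.get(key)
        # Integrity check: enforce the cardinality rule for this namespace.
        if limit is not None and len(self._names[key]) >= limit:
            raise ValueError(f"at most {limit} identifier(s) allowed in {key}")
        self._names[key].append(ident)

    def find(self, authority, namespace):
        # Return the identifiers that match the given authority and namespace.
        return list(self._names.get((authority, namespace), []))


# Placeholder rule: one SwissProt id at most, any number of Genbank ids.
names = NameCollection({("example.org", "SP"): 1})
names.add(Identifier("example.org", "SP", "12345", 6))
names.add(Identifier("example.org", "GB", "X00001", 1))
names.add(Identifier("example.org", "GB", "X00002", 1))
```

Adding a second identifier in the SP namespace would raise ValueError under this rule, while the two Genbank entries are both kept.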
I have a module that does this and also handles name splits, merges and
version updates. It is implemented as a SOAP service. Shall we talk about
making it Bioperl compatible?
Lincoln
On Monday 15 July 2002 04:23 am, Steve Chervitz wrote:
> The namespace concept is useful, only I think that the correct place for it
> is at the level of the identifier, not the sequence object, because a
> namespace applies to a name, not a sequence.
>
> Bioperl doesn't encapsulate sequence identifiers with objects, but doing so
> would help manage the sequence-identifier relationship and also make it
> easier to work with identifiers in general.
>
> A Bio::Identifier object could have slots for: namespace, type, version,
> id, and perhaps, is_unique. It could have a method to stringify itself
> with a specified delimiter and with or without namespace/version/type info.
>
> Identifiable objects such as sequences could have methods that returned
> Bio::Identifiers such as: all_identifiers(), preferred_identifier().
> Perhaps there could be a Bio::IdentifiableI interface for this.
>
> Identifiable objects would not have to store Bio::Identifiers internally.
> They could construct them on the fly (perhaps via an associated
> Bio::Factory:: object).
>
> This would model the object-to-identifier relationship more generally than
> we do now. For example, PrimarySeqs can have display_id, primary_id, and
> accession_number, which I always find a bit confusing/limiting.
>
> I know this is a more substantial undertaking than just adding a
> namespace() method to PrimarySeqI, but it could be worth the effort. What do
> you think?
>
> Steve
>
> --- Hilmar Lapp <hlapp@gnf.org> wrote:
> > According to BioSQL, sequences (bioentries) live in a namespace, e.g.,
> > the name of the databank that maintains and/or serves them.
> >
> > None of the Bio:: seq objects/interfaces have a method for that.
> >
> > I propose to add one, specifically to the lowest level Bio::PrimarySeqI
> > (bioentries are pretty general, and a namespace is needed for /any and
> > all/ bioentries). To me, the namespace doesn't have to do much with
> > whether this seq is going to be stored in BioSQL or not. A sequence with
> > an accession number has (implicitly or explicitly) a namespace in which
> > this accession number is valid. PrimarySeqI has an accession.
> >
> > Does anyone have other suggestions or objections?
> >
> > -hilmar
> > --
> > -------------------------------------------------------------
> > Hilmar Lapp email: lapp at gnf.org
> > GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
> > -------------------------------------------------------------
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-l
>
> =====
> Steve Chervitz
> sac@bioperl.org
>
> _______________________________________________
> Open-Bio-l mailing list
> Open-Bio-l@open-bio.org
> http://open-bio.org/mailman/listinfo/open-bio-l
--
========================================================================
Lincoln D. Stein Cold Spring Harbor Laboratory
lstein@cshl.org Cold Spring Harbor, NY
========================================================================