[Bioperl-l] Unique IDs in bioperl (Was: new GFF3 support methods added)

Allen Day allenday at ucla.edu
Tue Mar 9 04:04:52 EST 2004


I'd be interested in having this discussion and/or method documented as a
set of guidelines or rules for creating unique IDs.  We've had to deal
with a similar problem when generating unique IDs for our chado GFF3
loader and UCSC datafile parsers, and the solutions are essentially bad
hacks.

-allen

> Much as I like closures, I think there should be a standard sanctioned way
> of generating unique/persistent IDs.
> 
> > > Unique IDs in bioperl:
> > >
> > > In the discussion that preceded this, it seemed that people liked the
> > > idea of persistent unique IDs, but there were no suggestions as to how
> > > to go about it. This is inherently difficult with objects, but I
> > > borrowed a solution from relational modeling.
> > >
> > > A persistent unique ID is generated using
> > >
> > >   seq_id
> > >   primary_tag
> > >   start
> > >   end
> > >
> > > It is assumed that these are all set and comprise a "unique key" over
> > > features.
> >
> > Hmm. Wouldn't you need to include source_tag()? (Source_tag is part of
> > the unique key in biosql.) Without the source_tag being part of this,
> > wouldn't that mean you cannot have the exact (start+end) same segment
> > predicted as exon by different methods and have those different
> > predictions co-exist as separate features in the graph? (Presumably
> > those would only differ in source_tag)
> 
> Good point! I've added this
> 
> Ok, so are you saying that some of these methods don't really belong in
> the class-interfaces they are in?
> 
> Perhaps it would be better to have a Bio::SeqFeature::Tools::IDHandler
> class? This would contain methods
> 
>   generate_unique_persistent_id($feat) # uses $feat->seq_id
>   generate_unique_persistent_id($feat, $seq_id)
> 
>   create_hierarchy_from_ParentIDs($featholder);
>   set_ParentIDs_from_hierarchy($featholder);
> 
> Any preference? I'm leaving them as is for now if that's OK, but I have no
> objections to moving everything to a separate class if that's preferred.
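
For illustration only, here is a rough sketch of how the key discussed in
the quoted thread (seq_id, source_tag, primary_tag, start, end) might be
turned into an ID string. This is not the code that was committed. The
feature is a plain hash ref standing in for a real feature object, and the
':' separator and the field order are assumptions. The function name and
the optional $seq_id argument follow the two call forms proposed above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl. $feat is a plain hash ref used
# as a stand-in for a feature object; $seq_id optionally overrides the
# feature's own seq_id, mirroring the two-argument form proposed above.
sub generate_unique_persistent_id {
    my ($feat, $seq_id) = @_;
    $seq_id = $feat->{seq_id} unless defined $seq_id;

    # The "unique key": every component must be set.
    my @key = ($seq_id, @{$feat}{qw(source_tag primary_tag start end)});
    for my $part (@key) {
        die "feature is missing a key field\n" unless defined $part;
    }

    # Assumed separator; a value that itself contains ':' could collide.
    return join ':', @key;
}

my %exon = (
    seq_id      => 'chr1',
    source_tag  => 'genscan',
    primary_tag => 'exon',
    start       => 100,
    end         => 200,
);

# Same span predicted by a different method: only source_tag differs.
my %same_span = (%exon, source_tag => 'twinscan');

print generate_unique_persistent_id(\%exon), "\n";       # chr1:genscan:exon:100:200
print generate_unique_persistent_id(\%same_span), "\n";  # chr1:twinscan:exon:100:200
```

Because source_tag is part of the key, the two predictions over the same
span get different IDs, which is the case raised in the quoted reply.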


