[Bioperl-l] Unigene proposal and basic implementation

Ewan Birney birney@ebi.ac.uk
Mon, 15 Apr 2002 09:30:28 +0100 (BST)


On Mon, 15 Apr 2002, Andrew Macgregor wrote:

> Recently I emailed Elia regarding the state of unigene in bioperl as I found
> a few posts from about a year ago in the mailing list archive. He said that
> not much had happened on that front and that if I was working on something I
> should post a proposal.
> 
> I have been working on coding a unigene parser and seeing whether I could
> make it fit into bioperl in any way. I have scripts that do everything I
> need outside of bioperl but would like to contribute. I'm new to bioperl and
> this is my first foray into OO Perl, but you gotta start somewhere, right?!
> I've worked on producing what I would need, but tried to follow the
> structure of bioperl.
> 
> So I've coded a basic implementation of UniGene.pm and UniGeneIO.pm based on
> Seq.pm and SeqIO.pm. I've also coded a unigene format module based on those
> used by SeqIO. They work roughly like this.
> 
> - UnigeneIO reads from an NCBI unigene file using the unigene format module
> and returns a unigene object for each unigene record.
> - Each unigene object has methods to return info like unigene_id, title,
> gene, locuslink etc
> - Each unigene object has methods to return the associated sequence,
> protsims, express tissues etc either one by one or as an array.

Sounds great. At the moment we don't really have an abstraction for
"cluster of sequences which form a transcript" which is what unigene is,
but I suspect we will need one at some point.


Do the modules you list above

   (a) go into the Bio:: namespace (Bio::Unigene etc?)

   (b) inherit from Bio::Root::Root?

   (c) follow the Bioperl convention of constructors taking named
       arguments, like
     
       $uni_io = Bio::UnigeneIO->new( -file => 'somefile');
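
To make (c) concrete, here is a minimal sketch of a reader with that
calling style. It is only an illustration under stated assumptions:
Sketch::UnigeneIO and next_unigene are hypothetical names, not existing
Bioperl modules or methods; the sample records are made up; and the
parser assumes the flat file uses ID, TITLE and GENE tags with a "//"
line ending each record. A real module would inherit from
Bio::Root::Root rather than bless a bare hash as done here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed reader; NOT an existing Bioperl module.
package Sketch::UnigeneIO;

sub new {
    my ($class, %args) = @_;
    # Bioperl-style named arguments: -file => 'path' or -fh => $filehandle
    my $fh = $args{-fh};
    if (!$fh) {
        defined $args{-file} or die "need -file or -fh\n";
        open $fh, '<', $args{-file}
            or die "cannot open $args{-file}: $!\n";
    }
    return bless { fh => $fh }, $class;
}

# Return the next record as a hash reference, or undef at end of input.
sub next_unigene {
    my ($self) = @_;
    my $fh = $self->{fh};
    my %rec;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ m{^//}) {          # assumed record terminator
            return \%rec if %rec;
            next;
        }
        # Assumed tag names; every other line is ignored in this sketch.
        if ($line =~ /^(ID|TITLE|GENE)\s+(.*\S)/) {
            $rec{ lc $1 } = $2;
        }
    }
    return %rec ? \%rec : undef;        # last record may lack a terminator
}

package main;

# Made-up records, read through an in-memory filehandle.
my $data = join "\n",
    'ID          Hs.0001',
    'TITLE       example cluster one',
    'GENE        GENE1',
    '//',
    'ID          Hs.0002',
    'TITLE       example cluster two',
    '//',
    '';

open my $in, '<', \$data or die "cannot open in-memory data: $!\n";
my $uni_io = Sketch::UnigeneIO->new( -fh => $in );

while ( my $uni = $uni_io->next_unigene ) {
    my $gene = defined $uni->{gene} ? $uni->{gene} : '-';
    print join( "\t", $uni->{id}, $uni->{title}, $gene ), "\n";
}
```

Accepting -fh as well as -file keeps the reader testable without a file
on disk.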


If so, we should look at the code directly and then probably get you a
CVS account.


> 
> So basically a unigene object is a container specific to unigene as far as I
> can see. It could I guess have a more abstract container above it. It could
> be made to return each sequence as a seq object.
> 
> What I am now wondering is where to from here? Is there interest in using
> this? Is this on the right track? etc etc. I'm happy to contribute this if
> it will be useful, and look after it. I'll post the code, rough though it is
> if there is interest.
> 
> Cheers, Andrew.
> 

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------