[Bioperl-l] Re: How to retrieve the unigene number

Jason Stajich jason@cgt.mc.duke.edu
Fri, 9 Aug 2002 15:35:15 -0400 (EDT)


This is new code; we only issue a release once we think we've tested
things and they've had a chance to mature.  You can live on the edge
using CVS (a lot of us do; that is how bugs get fixed before code is
released, so I encourage you to help out that way).

You can get the live code from CVS; see http://cvs.bioperl.org

The 1.1 developer release should be out shortly; it will be essentially a
tarball of the current code on the live branch.

-jason

On Fri, 9 Aug 2002, P B wrote:

> Hi all,
>
> This is a post from May 9th about how to get a UniGene cluster ID given the
> GenBank accession.  The answer suggests using the ClusterIO and Unigene
> modules to do the parsing step.  I also read the threads a while back about
> the development of these modules, and it sounds exactly like what I need.
>
> So... where do I find them?! :)
>
> They don't seem to have come with my 1.02 distribution, nor are they
> included in the documentation for that release.  Is there a place where I
> could get my hands on the code?
>
> It's probably in an obvious place I didn't look, eh?
> Tats
>
> --------------------------------------------------------------------------------
>
> Giuseppe Torelli wrote:
>
> >I'm a newbie regarding BioPerl and also Perl. I've followed the discussion
> >about the unigene module; would you please tell me how to retrieve the
> >unigene number of a gene knowing the GenBank accession number?
>
> Hi Giuseppe,
>
> I'm sure there is "more than one way to do it" but here are a couple. Which
> you use depends on how often you want to look up a unigene from an accession
> number.
>
> 1. If it is just once or twice, I would just visit the UniGene website,
> select the appropriate organism, type in the acc number and hit search.
>
> 2. If you wanted to do lots of this I would:
> - download the appropriate organism file from UniGene, e.g. Hs.data
> - use the ClusterIO and Unigene modules to parse the file (this takes some
> time, I leave mine overnight)
> - drop the resulting data into a SQL db
> - search on acc number to get back to the UniGene number
> - the ClusterIO/Unigene modules can parse right into the seq lines so you
> can pull out all the acc numbers for each unigene.
> - you could then write a perl script that did the retrieval for all the acc
> numbers you are interested in (especially say if there are thousands of
> them).
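
The steps above (parse the flat file, load it into a SQL db, look up by
accession) can be sketched roughly as follows. This is only an illustration
in Python using the standard library, not the ClusterIO/Unigene modules
themselves. It assumes the UniGene flat-file layout in which each record has
an "ID" line, "SEQUENCE" lines carrying an "ACC=...;" field, and a closing
"//" line. The table name, the function names and the sample records
(including the accession numbers) are made up for the example.

```python
import re
import sqlite3

# Assumption: accessions appear on SEQUENCE lines as "ACC=<value>;".
ACC_RE = re.compile(r"ACC=([^;\s]+)")

# Made-up sample records in the assumed Hs.data layout (not real data).
SAMPLE = """\
ID          Hs.1
TITLE       hypothetical example cluster one
SEQUENCE    ACC=X00001; NID=g0000001
SEQUENCE    ACC=X00002.1; NID=g0000002
//
ID          Hs.2
TITLE       hypothetical example cluster two
SEQUENCE    ACC=X00003; NID=g0000003
//
"""


def iter_clusters(lines):
    """Yield (unigene_id, [accessions]) for each '//'-terminated record."""
    cluster_id, accessions = None, []
    for line in lines:
        fields = line.split(None, 1)
        if not fields:
            continue  # skip blank lines
        key = fields[0]
        if key == "ID" and len(fields) > 1:
            cluster_id = fields[1].strip()
        elif key == "SEQUENCE":
            match = ACC_RE.search(line)
            if match:
                # Drop any ".N" version suffix so lookups are version-agnostic.
                accessions.append(match.group(1).split(".")[0])
        elif key == "//":
            if cluster_id is not None:
                yield cluster_id, accessions
            cluster_id, accessions = None, []


def build_index(conn, lines):
    """Load accession -> UniGene ID pairs into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS acc2unigene (acc TEXT, unigene TEXT)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS acc_idx ON acc2unigene (acc)")
    for cluster_id, accessions in iter_clusters(lines):
        conn.executemany(
            "INSERT INTO acc2unigene VALUES (?, ?)",
            [(acc, cluster_id) for acc in accessions],
        )
    conn.commit()


def lookup(conn, acc):
    """Return the UniGene IDs recorded for one accession number."""
    rows = conn.execute(
        "SELECT DISTINCT unigene FROM acc2unigene WHERE acc = ? "
        "ORDER BY unigene",
        (acc.split(".")[0],),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_index(conn, SAMPLE.splitlines())
    for acc in ("X00001", "X00003"):
        print(acc, lookup(conn, acc))
```

For a real Hs.data file you would pass an open file handle to build_index
instead of SAMPLE.splitlines(), and connect to a file on disk rather than
":memory:" so the slow parsing step only has to run once.
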
>
> I hope this helps...
>
> Cheers, Andrew.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>

-- 
Jason Stajich
Duke University
jason at cgt.mc.duke.edu