[Bioperl-l] retrieve refseq ids from UIDs

Carnë Draug carandraug+dev at gmail.com
Tue Jun 28 11:41:06 UTC 2011


On 28 June 2011 04:20, Smithies, Russell
<Russell.Smithies at agresearch.co.nz> wrote:
> I assume you've had a look at the cookbook http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
> Also take a look at elink, it might do what you are after http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#I_want_a_list_of_database_.27x.27_UIDs_that_are_linked_from_a_list_of_database_.27y.27_UIDs
> The Scrapbook is a good place to get ideas as well http://www.bioperl.org/wiki/Category:Scrapbook

Hi Russell,

thank you for your answer. I had indeed looked at the cookbook. I had
never tried elink, and it works some of the time. I have a couple of
problems with it, though.

Basically, using that approach, I have to get the UID from gene and
use elink to get the transcripts by searching for what links to
'nucleotide' (with link name gene_nuccore_refseqrna). Then I have to
search for where each of those links to in the protein db. Also, if I
search with an array of UIDs, I get all the linked UIDs back in one
list, so I have to submit a single UID at a time to know where each
result comes from. This is true both for finding which nucleotides
come from a gene and for finding which proteins come from a
nucleotide. This implies a lot of connections, which may be why I
sometimes get the warning

--------------------- WARNING ---------------------
MSG: No linksets returned
---------------------------------------------------

Does NCBI have some sort of mechanism to avoid flooding with requests?
Here's the code I used http://pastebin.com/DsCh2JuL
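
One thing that might cut down the number of connections: as far as I
can tell, Bio::DB::EUtilities accepts a -correspondence flag for elink,
which should return one linkset per submitted UID instead of a single
merged list. Below is a minimal sketch under that assumption. The
helper names (links_by_source, gene_to_refseq_rna) and the e-mail
address are placeholders made up for illustration, not part of BioPerl,
and I have not checked how linksets with no links are reported.

```perl
use strict;
use warnings;

# Build { submitted UID => [linked UIDs] } from any object that offers
# next_LinkSet(), whose linksets offer get_submitted_ids() and get_ids().
# (Hypothetical helper, not part of BioPerl.)
sub links_by_source {
    my ($factory) = @_;
    my %linked;
    while (my $linkset = $factory->next_LinkSet) {
        my @targets = $linkset->get_ids;
        push @{ $linked{$_} }, @targets for $linkset->get_submitted_ids;
    }
    return \%linked;
}

# One elink request for several gene UIDs, keeping track of which
# RefSeq RNA UIDs belong to which gene. (Hypothetical helper.)
sub gene_to_refseq_rna {
    my (@gene_ids) = @_;
    require Bio::DB::EUtilities;    # loaded only when a query is made
    my $factory = Bio::DB::EUtilities->new(
        -eutil          => 'elink',
        -email          => 'you@example.org',    # placeholder address
        -dbfrom         => 'gene',
        -db             => 'nucleotide',
        -linkname       => 'gene_nuccore_refseqrna',
        -correspondence => 1,    # assumed: one linkset per submitted UID
        -id             => \@gene_ids,
    );
    return links_by_source($factory);
}
```

Calling gene_to_refseq_rna(9555) should then give a hash reference
keyed by gene UID, and the same pattern (with -dbfrom => 'nucleotide'
and -db => 'protein') could be tried for the second step.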

Also, making several connections slows things down. There must be a
simpler way, since one of the pieces of code I showed in my first mail

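use 5.010;                 # enables say
use Bio::DB::EUtilities;

my @ids = qw(9555);        # Entrez Gene UID(s)
my $factory = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                       -db      => 'gene',
                                       -id      => \@ids,
                                       );
# print the raw efetch response
say $factory->get_Response->content;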

does retrieve a weird structure with all that info. Isn't there a
method to access this data properly? Or should I maybe use some other
module?

Thanks,
Carnë



