[Bioperl-l] [Quick help needed] Getting Organism info using NCBI Accession numbers : sample code included

Abhishek Pratap abhishek.vit at gmail.com
Fri Apr 15 22:39:38 UTC 2011


Hi Guys

Sorry for reposting a question from an old thread. I hope this time the
subject line is more relevant to the question.

I have a list of NCBI accessions/locus names, not GI numbers. What I need to
do is obtain the organism lineage for each accession.

Is this functionality built in directly? I am using efetch to get the
GenBank record, but I am not sure how to pull out just the organism
lineage. I would also like this to be fast, as I will have thousands of such
accessions to query.

Eg:

I want to search NCBI for the locus name "CP000490" and get the organism
lineage:


 Bacteria; Proteobacteria; Alphaproteobacteria; Rhodobacterales;
            Rhodobacteraceae; Paracoccus.


This info is present in the GenBank record, but I am not sure what the best
way to fetch it specifically is.
http://www.ncbi.nlm.nih.gov/nuccore/CP000490
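
For what it is worth, in the flat-file view of that record the lineage is the
block of indented lines right after the ORGANISM line, so one dependency-free
option is to scan for it directly. The sketch below assumes the standard
GenBank flat-file layout and that the organism name fits on a single line;
parse_lineages is a made-up helper name, not a BioPerl function.

```perl
use strict;
use warnings;

# Hypothetical helper: scan GenBank flat-file text from a filehandle and
# return a hash reference of accession => lineage string.
sub parse_lineages {
    my ($fh) = @_;
    my (%lineage_for, $accession, $in_organism, @parts);

    # Close the current ORGANISM block and store its lineage.
    my $finish = sub {
        my $lineage = join ' ', @parts;
        $lineage =~ s/\.\s*$//;              # drop the trailing period
        $lineage_for{$accession} = $lineage if defined $accession;
        $in_organism = 0;
        @parts       = ();
    };

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^ACCESSION\s+(\S+)/) {
            $accession = $1;                 # primary accession of this record
        }
        elsif ($line =~ /^\s+ORGANISM\s/) {
            $in_organism = 1;                # lineage starts on the next line
        }
        elsif ($in_organism && $line =~ /^\s+(\S.*?)\s*$/) {
            push @parts, $1;                 # indented continuation line
        }
        elsif ($in_organism) {
            $finish->();                     # first unindented line ends the block
        }
        $accession = undef if $line =~ m{^//};   # end of record
    }
    $finish->() if $in_organism;
    return \%lineage_for;
}

# Usage: perl lineage_scan.pl temp.gb
for my $path (@ARGV) {
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $lineage_for = parse_lineages($fh);
    print "$_\t$lineage_for->{$_}\n" for sort keys %$lineage_for;
    close $fh;
}
```

Because it never builds sequence objects, a scan like this may be a reasonable
fit when only the lineage is needed for thousands of records.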

Sample code :

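use strict;
use warnings;
use Bio::DB::EUtilities;
use Bio::SeqIO;

my @ids = qw( NW_001884661 EZ361133 CP000490 );

# Ask efetch for GenBank flat files (-rettype => 'gb') so the saved
# response can be read back with the 'genbank' parser below.
my $factory = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                       -email   => 'apratap at lbl.gov',
                                       -db      => 'nucleotide',
                                       -rettype => 'gb',
                                       -id      => \@ids);

my $file = 'temp.gb';

# Write the raw efetch response to disk.
$factory->get_Response(-file => $file);

# Open the saved records for parsing.
my $seqin = Bio::SeqIO->new(-file   => $file,
                            -format => 'genbank');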
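
One possible way to continue from the $seqin object above is to loop over the
records and read the lineage from each sequence's species object. This is only
a sketch: lineage_by_accession is a hypothetical helper name, and it assumes
the GenBank parser attached a Bio::Species object to each record. Since
classification() lists the most specific name first, the sketch drops the
organism name and reverses the remaining ranks.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper (not part of BioPerl): walk a Bio::SeqIO stream and
# return a hash reference of accession => 'Kingdom; ...; Genus'.
sub lineage_by_accession {
    my ($seqin) = @_;
    my %lineage_for;
    while (my $seq = $seqin->next_seq) {
        # Skip records that carry no organism information.
        my $species = $seq->species or next;
        # classification() runs from the organism name up to the kingdom,
        # so set the organism name aside and reverse the remaining ranks.
        my ($organism, @ranks) = $species->classification;
        $lineage_for{ $seq->accession_number } = join '; ', reverse @ranks;
    }
    return \%lineage_for;
}

# Usage: perl lineage.pl temp.gb
for my $file (@ARGV) {
    my $seqin = Bio::SeqIO->new(-file => $file, -format => 'genbank');
    my $lineage_for = lineage_by_accession($seqin);
    print "$_\t$lineage_for->{$_}\n" for sort keys %$lineage_for;
}
```

Run against temp.gb, this would print one tab-separated accession and lineage
per line.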



Thanks for your help!
-Abhi


