[Bioperl-l] [Quick help needed] Getting Organism info using NCBI Accession numbers : sample code included
Abhishek Pratap
abhishek.vit at gmail.com
Fri Apr 15 22:39:38 UTC 2011
Hi Guys
Sorry I am posting the same question again from an old thread. I hope this
time the subject line is more relevant to the question.
I have a list of NCBI accessions/locus names, not GI numbers. What I need to
do is obtain the lineage for each NCBI accession.
Is this functionality built in directly? I am using efetch to get the
GenBank record, but I am not sure how to pull out the organism lineage
specifically. I would also like this to be fast, as I will have thousands of
such accessions to query.
Example:
I want to search NCBI for the locus name "CP000490" and get the organism
lineage:
Bacteria; Proteobacteria; Alphaproteobacteria; Rhodobacterales;
Rhodobacteraceae; Paracoccus.
This info is present in the GenBank record, but I am not sure what the best
way to fetch it specifically is.
http://www.ncbi.nlm.nih.gov/nuccore/CP000490
Sample code :
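use strict;
use warnings;
use Bio::DB::EUtilities;
use Bio::SeqIO;

my @ids = qw( NW_001884661 EZ361133 CP000490 );
# Fetch the records for all accessions in a single efetch request.
my $factory = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                       -email   => 'apratap at lbl.gov',
                                       -db      => 'nucleotide',
                                       -rettype => 'gb',  # ask for GenBank flat-file output
                                       -id      => \@ids,
                                      );
# Save the response to disk, then open it as a GenBank stream.
my $file = 'temp.gb';
$factory->get_Response(-file => $file);
my $seqin = Bio::SeqIO->new(-file   => $file,
                            -format => 'genbank');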
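For reference, the lineage sits in the ORGANISM block of each GenBank flat-file record, so one possible approach is to read it straight from the saved text. The following is only a minimal sketch in plain Perl, not a BioPerl API: lineages_from_genbank is a hypothetical helper name, the embedded record is a hand-trimmed excerpt in GenBank layout rather than real efetch output, and it assumes the organism name fits on the ORGANISM line itself (a name that wraps would be mixed into the lineage).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): map each ACCESSION in a
# GenBank flat-file string to its lineage, taken from the indented
# continuation lines that follow the ORGANISM line.
sub lineages_from_genbank {
    my ($text) = @_;
    my (%lineage, $acc, $in_organism, @buf);
    for my $line (split /\n/, $text) {
        if ($line =~ /^ACCESSION\s+(\S+)/) {
            $acc = $1;                       # primary accession of this record
        }
        elsif ($line =~ /^\s+ORGANISM\s/) {
            $in_organism = 1;                # lineage starts on the next line
            @buf = ();
        }
        elsif ($in_organism && $line =~ /^\s{12}(\S.*)$/) {
            push @buf, $1;                   # indented continuation line
        }
        elsif ($in_organism) {
            # First non-continuation line closes the ORGANISM block.
            (my $joined = join ' ', @buf) =~ s/\.\s*$//;
            $lineage{$acc} = [ split /;\s*/, $joined ] if defined $acc;
            $in_organism = 0;
        }
        $acc = undef if $line =~ m{^//};     # "//" terminates a record
    }
    return \%lineage;
}

# Hand-trimmed excerpt for illustration; in practice $sample would be the
# contents of the file written by get_Response (temp.gb above).
my $sample = <<'END_GB';
ACCESSION   CP000490
SOURCE      Paracoccus denitrificans PD1222
  ORGANISM  Paracoccus denitrificans PD1222
            Bacteria; Proteobacteria; Alphaproteobacteria; Rhodobacterales;
            Rhodobacteraceae; Paracoccus.
//
END_GB

my $lineages = lineages_from_genbank($sample);
print "$_: ", join('; ', @{ $lineages->{$_} }), "\n" for sort keys %$lineages;
```

Because this is a single pass over the text, it handles a file containing many records; whether it is fast enough for thousands of accessions depends mostly on how the efetch requests are made, not on the parsing.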
Thanks for your help!
-Abhi