[Bioperl-l] GI identifier missing when using Bio::Index::GenBank?
Todd Richmond
richmond.todd at gmail.com
Thu Apr 27 01:47:52 UTC 2006
I've got an application where I grab the daily updates from NCBI, pull
out just the plant sequences and store them in a separate flat file.
Then I use Bio::Index::GenBank to index the plant flat file so I can
pull out my sequences of interest. I'm in the midst of converting my
scripts to use bioperl-db/biosql so I can push those sequences into
the database. The problem is that the NCBI GI identifier isn't
returned when I fetch a record through the index file.
I'm running the following test script:
***
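use strict;
use warnings;
use Bio::Index::GenBank;
use Bio::SeqIO;

# Open the existing index built over the plant flat file
my $Index_File_Name = 'nc0425.idx';
my $inx = Bio::Index::GenBank->new('-filename' => $Index_File_Name);

# No -file or -fh is given, so the record is written to STDOUT
my $seqio = Bio::SeqIO->new('-format' => 'genbank');

# Fetch one record by accession and write it back out in GenBank format
my $seq = $inx->get_Seq_by_acc('CJ521890');
$seqio->write_seq($seq);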
***
When I diff the output against the original GenBank record, the only
difference is the GI identifier:
diff CJ521890_orig.out CJ521890_seqio.out
5c5
< VERSION CJ521890.1 GI:93266243
---
> VERSION CJ521890.1
Is this expected behaviour? If so, is there a workaround that will
allow me to retrieve the GI from the index file so I can store it in
the bioentry table?
Thanks, Todd
--
Todd Richmond
richmond.todd at gmail.com