[Bioperl-l] get_sequence - acc does not exist
Paul G Cantalupo
lupey+ at pitt.edu
Tue Aug 30 20:34:39 EDT 2005
Hello,
I discovered that get_sequence in Bio::Perl does not handle GenBank GI
numbers properly, because of the following code in get_sequence:
    if( $identifier =~ /^\w+\d+$/ ) {
        $seq = $db->get_Seq_by_acc($identifier);
    } else {
        $seq = $db->get_Seq_by_id($identifier);
    }
GenBank GI numbers (e.g. 51527264) match the regular expression /^\w+\d+$/,
because \w also matches digits. get_sequence therefore calls get_Seq_by_acc,
which unsurprisingly fails (with a warning like: MSG: acc (gb|51527264) does
not exist). In contrast, the method get_Seq_by_id works when called with GI
numbers:
    use Bio::DB::GenBank;

    my $genbank_db = Bio::DB::GenBank->new();
    my $seq = $genbank_db->get_Seq_by_id(51527264);
    print $seq->desc, "\n";
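To make the regular expression behaviour concrete, here is a small check that needs neither BioPerl nor network access. It is only a sketch: the helper name matches_acc_pattern is hypothetical, and the identifiers other than the GI number above are illustrative values, not taken from get_sequence.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the identifier matches the pattern
# that get_sequence uses to decide on get_Seq_by_acc.
sub matches_acc_pattern {
    my ($identifier) = @_;
    # \w matches letters, digits and underscore, so an all-digit
    # string can satisfy both \w+ and \d+.
    return $identifier =~ /^\w+\d+$/ ? 1 : 0;
}

# Illustrative identifiers: a GI number, an accession-like string,
# and a name that does not end in a digit.
for my $identifier ('51527264', 'AY123456', 'ROA1_HUMAN') {
    my $verdict = matches_acc_pattern($identifier) ? 'matches' : 'does not match';
    print "$identifier $verdict\n";
}
```

The all-digit GI number matches, which is why it is sent to get_Seq_by_acc.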
Shouldn't the regular expression in get_sequence be changed to look for
identifiers that are all digits and then call get_Seq_by_id? Or am I not
understanding something?
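
A minimal sketch of the change I have in mind, assuming that an identifier made up only of digits should always be treated as a GI number. The helper lookup_method_for is hypothetical and not part of Bio::Perl; it returns only the name of the method to call, so it runs without a database connection.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Perl): pick the lookup
# method name for a given identifier.
sub lookup_method_for {
    my ($identifier) = @_;

    # Assumption: all digits means a GI number, so look it up by id.
    return 'get_Seq_by_id' if $identifier =~ /^\d+$/;

    # Otherwise keep the existing test from get_sequence.
    return 'get_Seq_by_acc' if $identifier =~ /^\w+\d+$/;

    return 'get_Seq_by_id';
}

# Illustrative identifiers only.
for my $identifier ('51527264', 'AY123456', 'ROA1_HUMAN') {
    printf "%-12s -> %s\n", $identifier, lookup_method_for($identifier);
}
```

With a real database object, the returned name could then be used as a method call, for example $db->$method($identifier).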
Thank you,
Paul
Paul Cantalupo
Research Specialist/Systems Programmer
559 Crawford Hall
Department of Biological Sciences
University of Pittsburgh
Pittsburgh, PA 15260
Work: 412-624-4687
Fax: 412-624-4759
Ask me about Toastmasters: www.toastmasters.org
Midday Club Treasurer