[Bioperl-l] get_sequence - acc does not exist

Paul G Cantalupo lupey+ at pitt.edu
Tue Aug 30 20:34:39 EDT 2005


Hello,

I discovered that Bio::Perl get_sequence does not handle Genbank GI 
numbers properly due to the following code in get_sequence:

    if( $identifier =~ /^\w+\d+$/ ) {
        $seq = $db->get_Seq_by_acc($identifier);
    } else {
        $seq = $db->get_Seq_by_id($identifier);
    }

Genbank GI numbers (e.g. 51527264) also match the regular expression 
/^\w+\d+$/, because \w matches digits as well as letters. Unsurprisingly, 
the method get_Seq_by_acc then fails (with a warning like: 
MSG: acc (gb|51527264) does not exist). The method get_Seq_by_id, 
however, works when called with GI numbers:


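   use strict;
   use warnings;
   use Bio::DB::GenBank;
   # Fetch by GI number through get_Seq_by_id rather than get_Seq_by_acc
   my $genbank_db = Bio::DB::GenBank->new();
   my $seq = $genbank_db->get_Seq_by_id(51527264);
   print $seq->desc, "\n";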

Shouldn't the regular expression in get_sequence be changed to look for 
identifiers that are all digits and then call get_Seq_by_id? Or am I not 
understanding something?
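
A minimal sketch of the proposed change, for illustration only. The helper 
name choose_lookup_method is hypothetical and not part of Bio::Perl, and the 
accession and ID strings are example inputs. It only picks a method name, 
so it runs without a network connection:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Perl): returns the name of the
# database method to call for a given identifier.
sub choose_lookup_method {
    my ($identifier) = @_;

    # All digits: treat the identifier as a GI number.
    return 'get_Seq_by_id' if $identifier =~ /^\d+$/;

    # Otherwise keep the existing accession test from get_sequence.
    return 'get_Seq_by_acc' if $identifier =~ /^\w+\d+$/;

    # Anything else falls through to a lookup by id, as before.
    return 'get_Seq_by_id';
}

# The existing test alone also matches a bare GI number, because \w
# includes digits.
print "existing regex matches a bare GI number\n"
    if '51527264' =~ /^\w+\d+$/;

# Show which method each example identifier would be routed to.
for my $identifier ('51527264', 'AF303112', 'ROA1_HUMAN') {
    printf "%-12s -> %s\n", $identifier, choose_lookup_method($identifier);
}
```

With a database handle in $db, the returned name could then be used as 
$db->$method($identifier), assuming get_sequence keeps its current two 
branches.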

Thank you,

Paul

Paul Cantalupo
Research Specialist/Systems Programmer
559 Crawford Hall
Department of Biological Sciences
University of Pittsburgh
Pittsburgh, PA 15260
Work: 412-624-4687
Fax: 412-624-4759

Ask me about Toastmasters: www.toastmasters.org
Midday Club Treasurer

