[Bioperl-l] Parsing Blast reports: Length of sequence of a hit ?

Edith Schlagenhauf ediths at unizh.ch
Thu Oct 28 09:44:30 EDT 2004


Hi,

1) With the deprecated Bio::Tools::Blast module, one could get
the total length of the hit sequence, as given in the GenBank
record (and in the Blast report), by calling the length() method
($hit->length()).

Is there an equivalent in Bio::SearchIO?


2) I use the Bio::DB::GenBank module (GenBank.pm) to get from a
hit accession, as given in a Blast report, to its gi number,
i.e.:

-------------------------------------------------------
use strict;
use Bio::DB::GenBank;

my $gb_hit_accession = "AF091802";
my $ref_hit_accession = "XM_480600";

# database handle used for the remote GenBank lookup
my $gb = Bio::DB::GenBank->new();

# get PID for NCBI ENTREZ

my $seq_obj = $gb->get_Seq_by_acc($ref_hit_accession);
my $primary_id = $seq_obj->primary_id();

print STDOUT "\$primary_id is: $primary_id\n";
-------------------------------------------------------


a) For RefSeq accessions, the script exits with:

-------------------- WARNING ---------------------
MSG: acc (gb|XM_480600) does not exist
---------------------------------------------------
Can't call method "primary_id" on an undefined value at ./gbTest.pl line 23.

The reason is that the module prepends gb| to the accession, where
a RefSeq entry would need ref|.

When I changed the following code in GenBank.pm

sub get_Seq_by_acc {
   my ($self,$seqid) = @_;
   $self->SUPER::get_Seq_by_acc("gb|$seqid");
}

to:

sub get_Seq_by_acc {
   my ($self,$seqid) = @_;
   $self->SUPER::get_Seq_by_acc("$seqid");
}

i.e., omitting the "gb|" string, the script ran without problems for
all sequences (including the gb| ones).
So what is the "gb|" prefix added for?


b) Is there a more convenient way to get gi numbers from accession
numbers using Bioperl?



Thanks for your input,
Edith



******************************************
Dr Edith Schlagenhauf
Bioinformatics
Institute of Plant Biology
University of Zurich
Zollikerstrasse 107
CH-8008 Zurich
SWITZERLAND

e-mail: ediths AT botinst DOT unizh DOT ch
Tel.:	+41 1 634 82 78
Fax :	+41 1 634 82 04
******************************************

