[Bioperl-l] Parsing Hit/Query Frames from Blastx

davila davila at ioc.fiocruz.br
Fri Nov 26 13:52:50 EST 2004


Hi, 

While parsing a Blastx output file, I noticed that I am not getting the real Hit_Frame and Query_Frame values from the methods shown in the Bioperl HOWTOs:

http://bioperl.org/HOWTOs/SearchIO/use.html

HSP  	frame  	0  	$hsp->query->frame,$hsp->hit->frame

My code (listed below) returns the wrong Hit and Query frame values; maybe I am doing something wrong. Any help would be greatly appreciated.

Thanks, Alberto
 
*******

Code:

use lib "/usr/local/bioperl14";
use Bio::SearchIO;


$searchio = Bio::SearchIO->new('-format' => 'blast',
                               '-file'   => 'clusters.blast');

while ($result = $searchio->next_result) {
    $query_name = $result->query_name();
    $cluster_id = $query_name;
    print "$cluster_id\n";
    $rank = 1;
    while ($hit = $result->next_hit) {
        ($gi) = $hit->name =~ /gi\|(\d+)\|/;
        $hsp = $hit->next_hsp;
        $hit_length = $hit->length;
#       $query_frame = $hsp->query->frame,$hsp->hit->frame;
        $query_frame = $hsp->query->frame;
        print "$query_frame\n";
        $hit_frame = $hsp->hit->frame;
        print "$hit_frame\n";
        $hsp_query_string = $hsp->query_string;
#       print "$hsp_query_string\n\n";
        $hsp_homology_string = $hsp->homology_string;
#       print "$hsp_homology_string\n\n";
        $hsp_hit_string = $hsp->hit_string;
#       print "$hsp_hit_string\n\n";
        $hsp_frac_identical = $hsp->frac_identical * 100;
#       print "$hsp_frac_identical%\n\n";
        $hsp_frac_conserved = $hsp->frac_conserved * 100;
#       print "$hsp_frac_conserved%\n\n";
        $hsp_align = "$hsp_query_string\n$hsp_homology_string\n$hsp_hit_string";

        print "$hsp_align\n\n\n\n";
    }
}

Results:

[root@genome blast]# perl align-teste1.pl
Name "main::gi" used only once: possible typo at align-teste1.pl line 17.
Name "main::hsp_frac_conserved" used only once: possible typo at align-teste1.pl line 33.
Name "main::hit_length" used only once: possible typo at align-teste1.pl line 19.
Name "main::hsp_frac_identical" used only once: possible typo at align-teste1.pl line 31.
Name "main::rank" used only once: possible typo at align-teste1.pl line 15.
333
334
335
336
337
1 (should be +2)
0 (should be -1)
YLTPTPIEPHL
Y+TPTPIEPHL
YITPTPIEPHL
 
 
 
338
339
340
341
342
343
0 (should be +1)
0 (should be +1)
IHCEELKQLGRASEKCVL*LFNYSLDTGQVPAKWRHGIIVPQLKPNKSANSMASFRPAPKHSKLNRLGVPLLA
++ E L+ LG  +   VL LFN SL TG VP  W+ G+I+P LK  K A  + S+RP    S L ++   ++A
LYNEALQHLGITALNVVLRLFNESLRTGVVPPAWKTGVIIPILKAGKKAEDLDSYRPVTLTSCLCKVMERIIA
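
One possible reading of the values above, offered as an assumption rather than a confirmed diagnosis: Bioperl appears to report frame GFF-style as 0, 1 or 2 and to keep the direction separately in strand, whereas BLAST prints a signed frame from -3 to +3. Under that assumption the BLAST-style value would be (frame + 1) * strand, which matches the "should be" numbers noted above. The helper name blast_frame is hypothetical and is not part of Bioperl; this is only a minimal sketch:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): rebuild a BLAST-style signed
# frame (-3..-1, +1..+3) from a 0-based frame (0, 1, 2) and a strand (-1, 0, 1).
# Assumption: the sign is carried by the strand. Strand 0 (for example a
# protein sequence) is treated as "no frame" and yields 0.
sub blast_frame {
    my ($frame, $strand) = @_;
    return 0 unless defined $frame && $strand;
    return ($frame + 1) * $strand;
}

# Values taken from the output above
printf "%+d\n", blast_frame(1, 1);     # prints +2
printf "%+d\n", blast_frame(0, -1);    # prints -1
```

Inside the loop it might be called as blast_frame($hsp->query->frame, $hsp->query->strand), and likewise with $hsp->hit, assuming strand is populated for the sequence in question.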
 
 


