[Bioperl-l] Query for parsing BLAST with BioGraphics

Soumyadeep nandi soumyadeep_nandi@yahoo.com
Tue, 17 Dec 2002 00:40:11 -0800 (PST)


Hi everybody,

I am trying to parse a real BLAST report and render it with
Bio::Graphics, using the script below:

##################################################################################
#! /usr/bin/perl

use strict;
use lib "$ENV{HOME}/Download/bioperl/bioperl-1.1.1";
use Bio::Graphics;
use Bio::SearchIO;
use Bio::SeqFeature::Generic;   # used below for the query and hit features

my $file = shift or die "usage: render2.pl <blast file>\n";

my $searchio = Bio::SearchIO->new(-file   => $file,
                                  -format => 'blast')
    or die "parse failed";

my $result = $searchio->next_result() or die "no result";

my $panel = Bio::Graphics::Panel->new(-length    => $result->query_length,
                                      -width     => 800,
                                      -pad_left  => 10,
                                      -pad_right => 10,
                                      );

my $full_length = Bio::SeqFeature::Generic->new(-start   => 1,
                                                -end     => $result->query_length,
                                                -seqname => $result->query_name);

$panel->add_track($full_length,
                  -glyph => 'arrow',
                  -tick => 2, 
                  -fgcolor => 'black',
                  -double => 1,
                  -label => 1,
                  );

my $track = $panel->add_track(-glyph       => 'graded_segments',
                              -label       => 1,
                              -connector   => 'dashed',
                              -bgcolor     => 'blue',
                              -font2color  => 'red',
                              -sort_order  => 'high_score',
                              -description => sub {
                                  my $feature = shift;
                                  return unless $feature->has_tag('description');
                                  my ($description) = $feature->each_tag_value('description');
                                  my $score = $feature->score;
                                  "$description, score=$score";
                              });

while (my $hit = $result->next_hit) {
    next unless $hit->significance < 1E-20;
    my $feature = Bio::SeqFeature::Generic->new(-score   => $hit->raw_score,
                                                -seqname => $hit->name,
                                                -tag     => {
                                                    description => $hit->description
                                                },
                                                );

    while (my $hsp = $hit->next_hsp) {
        $feature->add_sub_SeqFeature($hsp, 'EXPAND');
    }

    $track->add_feature($feature);
}

print $panel->png;
##################################################################################


My BLAST output file is as follows :-

##################################################################################

BLASTN 2.2.3 [Apr-24-2002]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= test query
         (178 letters)

Database: /home/soumya/Application/BLAST/data/ecoli.nt
           400 sequences; 4,662,239 total letters

Searching.done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

gb|AE000180.1|AE000180 Escherichia coli K-12 MG1655 section 70 o...    28   2.4
gb|AE000161.1|AE000161 Escherichia coli K-12 MG1655 section 51 o...    28   2.4
gb|AE000454.1|AE000454 Escherichia coli K-12 MG1655 section 344 ...    26   9.4
gb|AE000418.1|AE000418 Escherichia coli K-12 MG1655 section 308 ...    26   9.4
gb|AE000241.1|AE000241 Escherichia coli K-12 MG1655 section 131 ...    26   9.4
gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section 1 of...    26   9.4

>gb|AE000180.1|AE000180 Escherichia coli K-12 MG1655 section 70 of 400 of the complete genome
          Length = 11022

 Score = 28.2 bits (14), Expect = 2.4
 Identities = 14/14 (100%)
 Strand = Plus / Minus

                          
Query: 94   tcatctgctcgcgt 107
            ||||||||||||||
Sbjct: 4297 tcatctgctcgcgt 4284


>gb|AE000161.1|AE000161 Escherichia coli K-12 MG1655 section 51 of 400 of the complete genome
          Length = 16170

 Score = 28.2 bits (14), Expect = 2.4
 Identities = 14/14 (100%)
 Strand = Plus / Plus


Query: 126   tagctacgatagct 139
             ||||||||||||||
Sbjct: 11520 tagctacgatagct 11533


>gb|AE000454.1|AE000454 Escherichia coli K-12 MG1655 section 344 of 400 of the complete genome
          Length = 12175

 Score = 26.3 bits (13), Expect = 9.4
 Identities = 13/13 (100%)
 Strand = Plus / Minus

                          
Query: 153   catatccattagc 165
             |||||||||||||
Sbjct: 10774 catatccattagc 10762


>gb|AE000418.1|AE000418 Escherichia coli K-12 MG1655 section 308 of 400 of the complete
           genome
          Length = 10776

 Score = 26.3 bits (13), Expect = 9.4
 Identities = 13/13 (100%)
 Strand = Plus / Minus

                        
Query: 74  tgatcagatgata 86
           |||||||||||||
Sbjct: 611 tgatcagatgata 599


>gb|AE000241.1|AE000241 Escherichia coli K-12 MG1655 section 131 of 400 of the complete
            genome
          Length = 10160

 Score = 26.3 bits (13), Expect = 9.4
 Identities = 13/13 (100%)
 Strand = Plus / Minus

                         
Query: 96   atctgctcgcgta 108
            |||||||||||||
Sbjct: 1288 atctgctcgcgta 1276


>gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section 1 of 400 of the complete genome
          Length = 10596

 Score = 26.3 bits (13), Expect = 9.4
 Identities = 13/13 (100%)
 Strand = Plus / Minus

                         
Query: 78   cagatgatattct 90
            |||||||||||||
Sbjct: 1974 cagatgatattct 1962


  Database: /home/soumya/Application/BLAST/data/ecoli.nt
    Posted date:  Oct 21, 2002  3:48 PM
  Number of letters in database: 4,662,239
  Number of sequences in database:  400
  
Lambda     K      H
    1.37    0.711     1.31 

Gapped
Lambda     K      H
    1.37    0.711     1.31 


Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 5, Extension: 2
Number of Hits to DB: 169
Number of Sequences: 400
Number of extensions: 169
Number of successful extensions: 6
Number of sequences better than 10.0: 6
length of query: 178
length of database: 4,662,239
effective HSP length: 15
effective length of query: 163
effective length of database: 4,656,239
effective search space: 758966957
effective search space used: 758966957
T: 0
A: 40
X1: 6 (11.9 bits)
X2: 15 (29.7 bits)
S1: 12 (24.3 bits)
S2: 13 (26.3 bits)

##################################################################################

But the PNG file I get shows only the scale (running from 1 to 200)
and no hit track.
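
One thing that may be worth checking is the E-value filter in the hit
loop of the script. The sketch below is plain Perl and does not need
BioPerl. The E-values are typed in by hand from the hit table in the
report above, and the helper name hits_passing is made up for this
illustration. It applies the same "< 1E-20" test as the script and
reports how many hits would survive it.

```perl
use strict;
use warnings;

# E-values copied by hand from the "Sequences producing significant
# alignments" table in the BLAST report above.
my %evalue = (
    'gb|AE000180.1|AE000180' => 2.4,
    'gb|AE000161.1|AE000161' => 2.4,
    'gb|AE000454.1|AE000454' => 9.4,
    'gb|AE000418.1|AE000418' => 9.4,
    'gb|AE000241.1|AE000241' => 9.4,
    'gb|AE000111.1|AE000111' => 9.4,
);

# Hypothetical helper: return the names of the hits whose E-value is
# strictly below the cutoff, mirroring
#   next unless $hit->significance < 1E-20;
sub hits_passing {
    my ($cutoff, $evalues) = @_;
    my @names = grep { $evalues->{$_} < $cutoff } sort keys %$evalues;
    return @names;
}

my $cutoff = 1E-20;    # same threshold as in the script
my @kept   = hits_passing($cutoff, \%evalue);

printf "%d of %d hits pass the E-value filter\n",
    scalar(@kept), scalar(keys %evalue);
print "  kept: $_\n" for @kept;
```

Since every E-value in this report is 2.4 or 9.4, none of them is
below 1E-20, so with this input no feature would ever reach
$track->add_feature. Whether that fully explains the empty track is an
assumption that would need to be confirmed by rerunning the script
with a looser cutoff.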

Packages I am using:    Stored on my computer in:
bioperl-1.1.1           ~/Download/bioperl/bioperl-1.1.1
gd-2.0.8                ~/Download/bioperl/externel/gd-2.0.8
GD-2.041                ~/Download/bioperl/externel/GD-2.041
freetype-2.1.2          ~/Download/bioperl/externel/freetype-2.1.2

I would be very grateful if you could suggest what I should do to get
the hit tracks.

with regards
Soumyadeep
