[Bioperl-l] Query for parsing BLAST with BioGraphics
Soumyadeep nandi
soumyadeep_nandi@yahoo.com
Tue, 17 Dec 2002 00:40:11 -0800 (PST)
Hi everybody,
I am trying to parse a real BLAST report and render it, using
the script below:
##################################################################################
#!/usr/bin/perl
use strict;
use lib "$ENV{HOME}/Download/bioperl/bioperl-1.1.1";
use Bio::Graphics;
use Bio::SearchIO;
use Bio::SeqFeature::Generic;

my $file = shift or die "usage: render2.pl <blast file>\n";

# Parse the BLAST report and take the first result.
my $searchio = Bio::SearchIO->new(-file   => $file,
                                  -format => 'blast')
    or die "parse failed";
my $result = $searchio->next_result() or die "no result";

# Panel covering the full query length, 800 pixels wide.
my $panel = Bio::Graphics::Panel->new(-length    => $result->query_length,
                                      -width     => 800,
                                      -pad_left  => 10,
                                      -pad_right => 10,
                                     );

# Scale track: an arrow spanning the whole query.
my $full_length = Bio::SeqFeature::Generic->new(-start   => 1,
                                                -end     => $result->query_length,
                                                -seqname => $result->query_name);
$panel->add_track($full_length,
                  -glyph   => 'arrow',
                  -tick    => 2,
                  -fgcolor => 'black',
                  -double  => 1,
                  -label   => 1,
                 );

# Hit track: one feature per hit, labelled with description and score.
my $track = $panel->add_track(-glyph       => 'graded_segments',
                              -label       => 1,
                              -connector   => 'dashed',
                              -bgcolor     => 'blue',
                              -font2color  => 'red',
                              -sort_order  => 'high_score',
                              -description => sub {
                                  my $feature = shift;
                                  return unless $feature->has_tag('description');
                                  my ($description) = $feature->each_tag_value('description');
                                  my $score = $feature->score;
                                  "$description, score=$score";
                              });

while (my $hit = $result->next_hit) {
    # Skip hits whose E-value is not below 1E-20.
    next unless $hit->significance < 1E-20;
    my $feature = Bio::SeqFeature::Generic->new(-score   => $hit->raw_score,
                                                -seqname => $hit->name,
                                                -tag     => {
                                                    description => $hit->description
                                                },
                                               );
    # Attach each HSP as a sub-feature of the hit.
    while (my $hsp = $hit->next_hsp) {
        $feature->add_sub_SeqFeature($hsp, 'EXPAND');
    }
    $track->add_feature($feature);
}

print $panel->png;
##################################################################################
My BLAST output file is as follows :-
##################################################################################
BLASTN 2.2.3 [Apr-24-2002]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Query= test query
(178 letters)
Database: /home/soumya/Application/BLAST/data/ecoli.nt
400 sequences; 4,662,239 total letters
Searching.done
                                                                     Score    E
Sequences producing significant alignments:                          (bits) Value
gb|AE000180.1|AE000180 Escherichia coli K-12 MG1655 section 70 o...    28   2.4
gb|AE000161.1|AE000161 Escherichia coli K-12 MG1655 section 51 o...    28   2.4
gb|AE000454.1|AE000454 Escherichia coli K-12 MG1655 section 344 ...    26   9.4
gb|AE000418.1|AE000418 Escherichia coli K-12 MG1655 section 308 ...    26   9.4
gb|AE000241.1|AE000241 Escherichia coli K-12 MG1655 section 131 ...    26   9.4
gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section 1 of...    26   9.4
>gb|AE000180.1|AE000180 Escherichia coli K-12 MG1655
section 70 of 400 of the complete genome
Length = 11022
Score = 28.2 bits (14), Expect = 2.4
Identities = 14/14 (100%)
Strand = Plus / Minus
Query: 94 tcatctgctcgcgt 107
||||||||||||||
Sbjct: 4297 tcatctgctcgcgt 4284
>gb|AE000161.1|AE000161 Escherichia coli K-12 MG1655
section 51 of 400 of the complete genome
Length = 16170
Score = 28.2 bits (14), Expect = 2.4
Identities = 14/14 (100%)
Strand = Plus / Plus
Query: 126 tagctacgatagct 139
||||||||||||||
Sbjct: 11520 tagctacgatagct 11533
>gb|AE000454.1|AE000454 Escherichia coli K-12 MG1655
section 344 of 400 of the complete genome
Length = 12175
Score = 26.3 bits (13), Expect = 9.4
Identities = 13/13 (100%)
Strand = Plus / Minus
Query: 153 catatccattagc 165
|||||||||||||
Sbjct: 10774 catatccattagc 10762
>gb|AE000418.1|AE000418 Escherichia coli K-12 MG1655
section 308 of 400 of the complete genome
Length = 10776
Score = 26.3 bits (13), Expect = 9.4
Identities = 13/13 (100%)
Strand = Plus / Minus
Query: 74 tgatcagatgata 86
|||||||||||||
Sbjct: 611 tgatcagatgata 599
>gb|AE000241.1|AE000241 Escherichia coli K-12 MG1655
section 131 of 400 of the complete genome
Length = 10160
Score = 26.3 bits (13), Expect = 9.4
Identities = 13/13 (100%)
Strand = Plus / Minus
Query: 96 atctgctcgcgta 108
|||||||||||||
Sbjct: 1288 atctgctcgcgta 1276
>gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655
section 1 of 400 of the complete genome
Length = 10596
Score = 26.3 bits (13), Expect = 9.4
Identities = 13/13 (100%)
Strand = Plus / Minus
Query: 78 cagatgatattct 90
|||||||||||||
Sbjct: 1974 cagatgatattct 1962
Database: /home/soumya/Application/BLAST/data/ecoli.nt
Posted date: Oct 21, 2002 3:48 PM
Number of letters in database: 4,662,239
Number of sequences in database: 400
Lambda K H
1.37 0.711 1.31
Gapped
Lambda K H
1.37 0.711 1.31
Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 5, Extension: 2
Number of Hits to DB: 169
Number of Sequences: 400
Number of extensions: 169
Number of successful extensions: 6
Number of sequences better than 10.0: 6
length of query: 178
length of database: 4,662,239
effective HSP length: 15
effective length of query: 163
effective length of database: 4,656,239
effective search space: 758966957
effective search space used: 758966957
T: 0
A: 40
X1: 6 (11.9 bits)
X2: 15 (29.7 bits)
S1: 12 (24.3 bits)
S2: 13 (26.3 bits)
##################################################################################
But the output PNG file shows only the scale, running from
1 to 200, and not the hit track.
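One thing that may be worth checking is the filter line in the script. The sketch below is plain Perl with no BioPerl calls: it compares the six E-values listed in the report above against the 1E-20 cutoff used in the `next unless $hit->significance < 1E-20` line. The names @evalues, $threshold and @kept are only illustrative and are not part of render2.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# E-values copied by hand from the six hits in the BLAST report above
# (illustrative data, not read from the file).
my @evalues = (2.4, 2.4, 9.4, 9.4, 9.4, 9.4);

# Same cutoff as the filter line in the script.
my $threshold = 1E-20;

# Keep only the E-values that would survive the filter.
my @kept = grep { $_ < $threshold } @evalues;

printf "%d of %d hits pass the cutoff\n", scalar(@kept), scalar(@evalues);
```

With these values it reports that 0 of 6 hits pass, so no feature would be added to the hit track. If that is what is happening, a looser cutoff (or no cutoff) may be worth trying on a report like this one.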
Packages I am using:    Stored on my computer in:
bioperl-1.1.1           ~/Download/bioperl/bioperl-1.1.1
gd-2.0.8                ~/Download/bioperl/externel/gd-2.0.8
GD-2.041                ~/Download/bioperl/externel/GD-2.041
freetype-2.1.2          ~/Download/bioperl/externel/freetype-2.1.2
I would be very grateful if you could suggest what I should
do to get the hit tracks.
With regards,
Soumyadeep