[Bioperl-l] parsing an html blast result file
Wes Barris
wes.barris at csiro.au
Wed Jul 23 17:14:11 EDT 2003
Hi,
I have installed bioperl et al. on a Sun (Solaris 8). I would now like
to try parsing an HTML BLAST results file. I saved example 4 from this
page into a file:
http://www.bioperl.org/HOWTOs/html/Graphics-HOWTO.html
The only thing I changed in the file is the -format argument, which names
the format of the input file, from this:
-format => 'blast') or die "parse failed";
to this:
-format => 'blastxml') or die "parse failed";
I am assuming that the format name for an HTML BLAST result file is "blastxml",
but I could be wrong. I could not find a list of valid formats that can
be used with the Bio::SearchIO->new constructor.
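One way to check that assumption before handing the file to the parser is
to look at how the report begins. The sketch below is only an illustration:
guess_blast_format is a hypothetical helper, not part of BioPerl, and the
label 'html' it returns is not a Bio::SearchIO format name, just a marker
that the file is an HTML page rather than XML or plain text. The only
format names taken from the script are 'blast' and 'blastxml'.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): guess what kind of BLAST
# report a string holds by inspecting its first non-blank line.
sub guess_blast_format {
    my ($text) = @_;

    # Capture the first line that contains a non-whitespace character.
    my ($first) = $text =~ /^\s*(\S[^\n]*)/;
    return 'unknown' unless defined $first;

    # XML output begins with an XML declaration.
    return 'blastxml' if $first =~ /^<\?xml\b/;

    # An HTML page begins with an HTML tag or doctype; it is not XML.
    return 'html' if $first =~ /^<(?:!DOCTYPE\s+html|html|head|pre|title)\b/i;

    # A plain-text report begins with a program banner such as "BLASTN 2.2.6".
    return 'blast' if $first =~ /^T?BLAST[NPX]?\b/;

    return 'unknown';
}
```

If this returned 'html' for junk.html, that would suggest the file is not
something an XML parser can read, whatever -format value is passed.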
When I run the example 4 script, I get this error:
wes@sequence> blasttoimg.pl junk.html >junk.png
-------------------- WARNING ---------------------
MSG: error in parsing a report:
not well-formed (invalid token) at line 9, column 34, byte 238 at
/usr/local/lib/perl5/site_perl/5.6.1/sun4-solaris/XML/Parser.pm line 185
---------------------------------------------------
no result at /home/wes/proj/blast/blasttoimg.pl line 15, <GEN1> line 669.
Could anyone suggest what I might try to make this work?
#!/usr/local/bin/perl
# This is code example 4 in the Graphics-HOWTO
use strict;
#use lib "$ENV{HOME}/projects/bioperl-live";
use Bio::Graphics;
use Bio::SearchIO;
use Bio::SeqFeature::Generic;  # the script constructs Bio::SeqFeature::Generic objects below
my $file = shift or die "Usage: blasttoimg.pl <blast file>\n";
my $searchio = Bio::SearchIO->new(-file   => $file,
                                  -format => 'blastxml') or die "parse failed";
my $result = $searchio->next_result() or die "no result";
my $panel = Bio::Graphics::Panel->new(-length    => $result->query_length,
                                      -width     => 800,
                                      -pad_left  => 10,
                                      -pad_right => 10,
                                     );
my $full_length = Bio::SeqFeature::Generic->new(-start  => 1,
                                                -end    => $result->query_length,
                                                -seq_id => $result->query_name);
$panel->add_track($full_length,
                  -glyph   => 'arrow',
                  -tick    => 2,
                  -fgcolor => 'black',
                  -double  => 1,
                  -label   => 1,
                 );
my $track = $panel->add_track(-glyph       => 'graded_segments',
                              -label       => 1,
                              -connector   => 'dashed',
                              -bgcolor     => 'blue',
                              -font2color  => 'red',
                              -sort_order  => 'high_score',
                              -description => sub {
                                  my $feature = shift;
                                  return unless $feature->has_tag('description');
                                  my ($description) = $feature->each_tag_value('description');
                                  my $score = $feature->score;
                                  "$description, score=$score";
                              });
while( my $hit = $result->next_hit ) {
    next unless $hit->significance < 1E-20;
    my $feature = Bio::SeqFeature::Generic->new(-score  => $hit->raw_score,
                                                -seq_id => $hit->name,
                                                -tag    => {
                                                    description => $hit->description
                                                },
                                               );
    while( my $hsp = $hit->next_hsp ) {
        $feature->add_sub_SeqFeature($hsp,'EXPAND');
    }
    $track->add_feature($feature);
}
print $panel->png;
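
One thing that might be worth trying, offered as an untested sketch rather
than a known fix: reduce the HTML page to plain text and parse that with
-format => 'blast', as the original example does. strip_blast_html below is
a hypothetical helper, not a BioPerl function, and it assumes the report
contains no literal "<" characters outside of tags.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an HTML-wrapped BLAST report into plain text
# by removing tags and decoding four common character entities.
sub strip_blast_html {
    my ($html) = @_;

    # Remove anything that looks like a tag, e.g. <PRE>, <b>, <a href=...>.
    (my $text = $html) =~ s/<[^>]*>//g;

    # Decode entities in a single pass, so "&amp;lt;" becomes "&lt;"
    # rather than being decoded twice.
    my %entity = (lt => '<', gt => '>', quot => '"', amp => '&');
    $text =~ s/&(lt|gt|quot|amp);/$entity{$1}/g;

    return $text;
}
```

The stripped text could then be written to a new file and passed to
Bio::SearchIO->new with -format => 'blast'. Whether the text parser accepts
the result will depend on how closely it matches a genuine plain-text report.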
--
Wes Barris
E-Mail: Wes.Barris at csiro.au