[Bioperl-l] Bio::SearchIO::hmmer
Sean O'Keeffe
limericksean at gmail.com
Wed Jun 1 09:08:02 EDT 2005
Hi,
I was wondering how Bio::SearchIO::hmmer parses hmmpfam/hmmsearch
result files. I have a set of hmmpam result hits without alignments in
a file which I generated using hmmpfam locally (-A0 option on the
command line). Is this considered a valid result by
Bio::SearchIO::hmmer? If so, what might be wrong with the code below
(which gets as far as the first while loop and doesn't enter any
other):
use strict;
use Bio::SearchIO;
my $inhmmfile = "test-hmm.smart";
my $outputfilename = "HMM-test.hmmer.parsed";
my $fastafilename = "$outputfilename".".fasta";
my $inevalue =1;
my $inlength =20;
my ($myresult,$myhit,$myhsp,$mysignificance,$mylength,
$mynohit,$mylasthit,$mylastresult,$mypercent) = 0;
unless (open(PARSEDFILE, ">$outputfilename")) {
print "Could not open file $outputfilename !\n";
exit;
}
unless (open(FASTAFILE, ">$fastafilename")) {
print "Could not open file $fastafilename !\n";
exit;
}
my $in = new Bio::SearchIO(-format => 'hmmer', -file => $inhmmfile);
while(my $result = $in->next_result ) {
$myresult++;
while (my $hit = $result->next_hit ) {
$myhit++;
while (my $hsp = $hit->next_hsp ) {
$myhsp++;
if( $hsp->length('total') >= $inlength ) {
$mylength++;
if ( $hit->significance <= $inevalue ) {
$mysignificance++;
print PARSEDFILE
$result->query_name,"\t",
$result->query_description,"\t",
$result->query_length, "\t",
$hit->description, "\t",
$hit->accession, "\t",
$hit->bits, "\t",
$hit->significance, "\t",
$hsp->num_identical, "\t",
$hsp->num_conserved,"\t",
$hsp->start('query'),"\t",
$hsp->end('query'),"\t",
$hsp->start('hit'),"\t",
$hsp->end('hit'),"\n";
print FASTAFILE "> ",
$hit->description,"\n",
$hsp->hit_string,"\n";
}
}
}
}
if ($myhit == 0) {
$mynohit++;
}
$myhit = 0;
}
$mypercent = $mynohit*100 / $myresult;
print "\n\n", $myresult, " query sequence(s)\n";
print "\n", $myhsp, " HSP sequence(s) \n";
print "\n", $mylength, " hit(s) presenting the minimum requested length\n";
print "\n", $mysignificance, " hit(s) presenting the minimum requested
E-value\n";
print "\n", $mynohit, " query sequence(s) presenting < NO HITS > ";
close (PARSEDFILE);
close (FASTAFILE);
exit;
Output is:
Use of uninitialized value in numeric eq (==) at ../hmm-test-parse.pl
line 58, <GEN1> line 40126.
1 query sequence(s)
Use of uninitialized value in print at ../hmm-test-parse.pl line 68,
<GEN1> line 40126.
HSP sequence(s)
Use of uninitialized value in print at ../hmm-test-parse.pl line 69,
<GEN1> line 40126.
hit(s) presenting the minimum requested length
Use of uninitialized value in print at ../hmm-test-parse.pl line 70,
<GEN1> line 40126.
hit(s) presenting the minimum requested E-value
1 query sequence(s) presenting < NO HITS > (100.00 %)
Thanks very much,
Sean.
More information about the Bioperl-l
mailing list