[Bioperl-l] best Hit

Rondon Neto rondonbio at yahoo.com.br
Wed Sep 28 19:47:45 UTC 2011


Hi guys. 

I have this subroutine that returns a hash with nucleotide's coverage of each query from a blast alignment.
So, I want to compute uniq hits. If a hit has already been aligned with a query, it must be eliminated from my experiment.
Can anyone check if it's right or can fix it to me? Is there a way to do that directly in blast?

Thank you

Rondon Neto


sub nucleotide_coverage{
#Bio::SearchIO dependent
#This subroutine return a Hash and a file with nucleotide coverage 
#for each query in an blast alignment xlm file. The input is the
#alignment file.

my ($alignment_file) = @_;

my $alignment = new Bio::SearchIO ( -format => 'blastXML',
    -file   => $alignment_file );my %positions;my @used_reads;
while (my $result = $alignment->next_result) {
my $query_name = $result->query_name();
my $tam = $result -> query_length();
for (0..$tam-1){ ${$positions{$query_name}}[$_] = 0 } 
while (my $hit = $result->next_hit) {
my $hit_name = $hit->name;
# Here is my best hit parser. Is it ok?
foreach my $read (@used_reads) {
if ( $read eq $hit_name ) {
next;
}
}
while (my $hsp = $hit->next_hsp) {
my $query_name = $result->query_name();
my @pos = $hsp->seq_inds('query','identical');
foreach my $num (@pos) {
${$positions{$query_name}}[$num-1]++;
}
}
push (@used_reads, $hit_name);
}
}

my $outfile = "nucleotide_coverage.txt";
open OUT, ">$outfile" or die $!;foreach my $key (keys %positions){print OUT "$key\t@{$positions{$key}}\n";
}

close OUT;

return \%positions;

}



More information about the Bioperl-l mailing list