[Bioperl-l] best Hit
Rondon Neto
rondonbio at yahoo.com.br
Wed Sep 28 19:47:45 UTC 2011
Hi guys.
I have this subroutine that returns a hash with nucleotide's coverage of each query from a blast alignment.
So, I want to compute uniq hits. If a hit has already been aligned with a query, it must be eliminated from my experiment.
Can anyone check if it's right or can fix it to me? Is there a way to do that directly in blast?
Thank you
Rondon Neto
sub nucleotide_coverage{
#Bio::SearchIO dependent
#This subroutine return a Hash and a file with nucleotide coverage
#for each query in an blast alignment xlm file. The input is the
#alignment file.
my ($alignment_file) = @_;
my $alignment = new Bio::SearchIO ( -format => 'blastXML',
-file => $alignment_file );my %positions;my @used_reads;
while (my $result = $alignment->next_result) {
my $query_name = $result->query_name();
my $tam = $result -> query_length();
for (0..$tam-1){ ${$positions{$query_name}}[$_] = 0 }
while (my $hit = $result->next_hit) {
my $hit_name = $hit->name;
# Here is my best hit parser. Is it ok?
foreach my $read (@used_reads) {
if ( $read eq $hit_name ) {
next;
}
}
while (my $hsp = $hit->next_hsp) {
my $query_name = $result->query_name();
my @pos = $hsp->seq_inds('query','identical');
foreach my $num (@pos) {
${$positions{$query_name}}[$num-1]++;
}
}
push (@used_reads, $hit_name);
}
}
my $outfile = "nucleotide_coverage.txt";
open OUT, ">$outfile" or die $!;foreach my $key (keys %positions){print OUT "$key\t@{$positions{$key}}\n";
}
close OUT;
return \%positions;
}
More information about the Bioperl-l
mailing list