[Bioperl-l] blast and length adjustment
dimitark at bii.a-star.edu.sg
Fri Aug 2 04:01:06 UTC 2013
Hi guys,
I have a question about BLAST.
I am working on a project where I use BioPerl to BLAST sequences against
human RNA. I found two sequences that hit completely different RNAs,
yet cd-hit-est clusters them together. I also aligned them and they are
almost identical. From the NCBI aligner:
Score: 2658 bits(1439), Expect: 0.0, Identities: 1441/1442(99%), Gaps: 0/1442(0%), Strand: Plus/Plus
I then ran both through BLAST on the NCBI site, and they again hit
different sequences.
When I checked the parameters of each search, I found that both
queries were length-adjusted, i.e. some length was subtracted, namely
around 30 nucleotides.
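For reference, the length adjustment BLAST reports is an edge-effect correction from Karlin-Altschul statistics: the effective query length is m - l and the effective database length is n - N*l, where l roughly satisfies l = ln(K * (m - l) * (n - N*l)) / H. The Perl below is only a minimal sketch of that relation solved by fixed-point iteration. The function name length_adjustment, the database size, the sequence count and the K and H values are all assumptions chosen for illustration, and NCBI's actual routine is more elaborate (especially for gapped searches), so the real values should be read from the statistics footer of the BLAST report.

```perl
use strict;
use warnings;

# Simplified sketch (not NCBI's exact routine): solve
#   l = ln(K * (m - l) * (n - N*l)) / H
# by fixed-point iteration.
#   $m        query length
#   $n        total database length in residues
#   $num_seqs number of database sequences (N)
#   $K, $H    Karlin-Altschul parameters as reported by BLAST
sub length_adjustment {
    my ($m, $n, $num_seqs, $K, $H) = @_;
    my $l = 0;
    for (1 .. 50) {
        my $eff_m = $m - $l;
        my $eff_n = $n - $num_seqs * $l;
        last if $eff_m < 1 || $eff_n < 1;    # search space too small to adjust further
        my $next = log($K * $eff_m * $eff_n) / $H;
        $next = 0 if $next < 0;
        last if abs($next - $l) < 1e-6;      # converged
        $l = $next;
    }
    return int($l);
}

# Illustrative inputs only: a 1442 nt query against a hypothetical database
# of 1e8 nt in 50,000 sequences, with placeholder K and H values.
my $adj = length_adjustment(1442, 1e8, 50_000, 0.46, 0.85);
print "length adjustment: $adj\n";    # prints 29 for these illustrative inputs
```

Under this relation the adjustment depends only on the query length, the database and the scoring parameters, so two queries of nearly equal length searched against the same database would be expected to get nearly the same adjustment.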
I wanted to see what BioPerl does about this, and I found
the following in BlastUtils.pm:
    # Adjust length based on BLAST flavor.
    my $prog = $sbjct->algorithm;
    if($prog eq 'TBLASTN') {
        $sbjct->{'_length_aln_sbjct'} /= 3;
    } elsif($prog eq 'BLASTX' ) {
        $sbjct->{'_length_aln_query'} /= 3;
    } elsif($prog eq 'TBLASTX') {
        $sbjct->{'_length_aln_query'} /= 3;
        $sbjct->{'_length_aln_sbjct'} /= 3;
    }
But there seems to be no length adjustment for blastn in BioPerl, whereas
one seems to exist on NCBI.
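To make the branch logic above easier to test in isolation, here is a small standalone sketch. The helper name scale_aligned_lengths is hypothetical and is not part of BioPerl; it only mirrors the quoted conditions. Read this way, the division by 3 appears to convert aligned lengths of translated sequences from nucleotide to amino-acid units, which would be a different thing from the statistical length adjustment NCBI reports, and BLASTN lengths simply pass through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the branch logic quoted from BlastUtils.pm.
# Takes the program name and the aligned query/subject lengths, and returns
# the lengths after the per-flavor scaling.
sub scale_aligned_lengths {
    my ($prog, $len_query, $len_sbjct) = @_;
    $prog = uc $prog;
    if ($prog eq 'TBLASTN') {
        $len_sbjct /= 3;    # subject side is translated nucleotide
    } elsif ($prog eq 'BLASTX') {
        $len_query /= 3;    # query side is translated nucleotide
    } elsif ($prog eq 'TBLASTX') {
        $len_query /= 3;    # both sides are translated
        $len_sbjct /= 3;
    }
    # Any other program, including BLASTN, falls through untouched.
    return ($len_query, $len_sbjct);
}

my ($q, $s) = scale_aligned_lengths('BLASTN', 1442, 1442);
print "BLASTN: query=$q sbjct=$s\n";    # lengths are returned unchanged
```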
It's kind of frustrating, as I am trying to do a differential
expression analysis with my own scripts. If these two sequences are
nearly identical, they should get the same annotation, but they do not
because of these strange BLAST results.
I am really sorry if my post is a bit messy. If you have any questions
about what I meant, please ask.
Any comments would be greatly appreciated!
Cheers
D.