[Bioperl-l] blast and length adjustment
dimitark at bii.a-star.edu.sg
Mon Aug 5 01:21:54 UTC 2013
Hi Jason,
no, I was not interpreting it wrongly. I only found length correction
for those translated BLAST methods. I did not find any length
correction for the blastn method, even though on the NCBI site I can
see that they apply some length correction.
So when I run blastn locally with Bioperl I get one result, and
when I run blastn on NCBI I get a different result.
So I was wondering whether there is such a length correction in Bioperl
for blastn. I could not find one. I was also wondering whether such a
correction should be implemented for blastn.
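
To make concrete what kind of length correction is meant here: as far as I understand it, the classic Karlin-Altschul edge correction subtracts an expected HSP length l = ln(K*m*n)/H from the query length m and, once per database sequence, from the database length n. Below is a minimal sketch under that assumption. The sub names are hypothetical and are not part of Bioperl, K and H have to be taken from the statistics footer of a real BLAST report, and current NCBI BLAST uses a more elaborate calculation, so the numbers need not match NCBI exactly.

```perl
use strict;
use warnings;

# Hypothetical helpers, NOT part of Bioperl or NCBI BLAST.
# Textbook edge correction: expected HSP length l = ln(K*m*n)/H,
# truncated to an integer.
sub length_adjustment {
    my ($K, $H, $query_len, $db_len) = @_;
    return int( log($K * $query_len * $db_len) / $H );
}

# Effective lengths: the query loses l once, the database loses l
# once per database sequence. Both are clamped to at least 1.
sub effective_lengths {
    my ($K, $H, $query_len, $db_len, $num_seqs) = @_;
    my $l         = length_adjustment($K, $H, $query_len, $db_len);
    my $eff_query = $query_len - $l;
    my $eff_db    = $db_len - $num_seqs * $l;
    $eff_query = 1 if $eff_query < 1;
    $eff_db    = 1 if $eff_db < 1;
    return ($l, $eff_query, $eff_db);
}

# Illustrative inputs only: K, H and the database size are made up.
my ($l, $eff_q, $eff_db) =
    effective_lengths(0.46, 0.85, 1442, 100_000_000, 50_000);
print "adjustment=$l effective_query=$eff_q effective_db=$eff_db\n";
```

Because the adjustment depends on the database size, the same query would get a different correction against a local database than against the NCBI one.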
Well, thank you for your reply!
Cheers
Dimitar
Quoting Jason Stajich <jason.stajich at gmail.com>:
> On Aug 1, 2013, at 9:01 PM, dimitark at bii.a-star.edu.sg wrote:
>
>> Hi guys,
>> I have a question about BLAST.
>>
>> I was working on a project where I BLAST against human RNA using
>> Bioperl. I found 2 sequences which hit totally different RNAs, but
>> when I used cd-hit-est they clustered together. I even aligned
>> them and they were almost identical. From the NCBI aligner:
>>
>> Score = 2658 bits (1439), Expect = 0.0, Identities = 1441/1442 (99%),
>> Gaps = 0/1442 (0%), Strand = Plus/Plus
>>
>> Then I decided to BLAST them on NCBI, and they again hit
>> different sequences.
>> Then I checked the parameters of each search and found that both
>> queries were length adjusted, i.e. some length was removed, namely
>> around 30 nucleotides.
>>
>> I wanted to see what Bioperl does about that, and I found the
>> following in BlastUtils.pm:
>>
>> # Adjust length based on BLAST flavor.
>> my $prog = $sbjct->algorithm;
>> if($prog eq 'TBLASTN') {
>>     $sbjct->{'_length_aln_sbjct'} /= 3;
>> } elsif($prog eq 'BLASTX' ) {
>>     $sbjct->{'_length_aln_query'} /= 3;
>> } elsif($prog eq 'TBLASTX') {
>>     $sbjct->{'_length_aln_query'} /= 3;
>>     $sbjct->{'_length_aln_sbjct'} /= 3;
>> }
>
> You are confusing the length adjustment that happens at NCBI with
> this length adjustment. The code above deals with translated
> searches. Notice they are all divisions by 3, because the
> coordinates presented in the BLAST results for a translated search
> are the original DNA/RNA coordinates, but when one wants to know the
> length in alignment space it is really at the protein scale.
>
> So this is not the adjustment you seem to be looking for.
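
A small standalone sketch of the division-by-3 logic described above may help. It only mirrors the quoted BlastUtils.pm snippet; the sub name `aln_lengths` and the lookup table are hypothetical and are not part of Bioperl.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): which side of the
# alignment is reported in nucleotide coordinates for each
# translated BLAST flavor.
my %TRANSLATED = (
    TBLASTN => { query => 0, sbjct => 1 },
    BLASTX  => { query => 1, sbjct => 0 },
    TBLASTX => { query => 1, sbjct => 1 },
);

# Convert nucleotide-scale lengths to alignment (protein) scale.
# Untranslated searches such as BLASTN are returned unchanged.
sub aln_lengths {
    my ($prog, $query_len, $sbjct_len) = @_;
    my $flags = $TRANSLATED{ uc $prog } || { query => 0, sbjct => 0 };
    $query_len /= 3 if $flags->{query};
    $sbjct_len /= 3 if $flags->{sbjct};
    return ($query_len, $sbjct_len);
}

# BLASTX: a 300 nt query span aligned to 100 residues of a protein.
my ($q, $s) = aln_lengths('BLASTX', 300, 100);
print "BLASTX query=$q sbjct=$s\n";

# BLASTN: nothing is translated, so nothing is divided.
($q, $s) = aln_lengths('BLASTN', 1442, 1442);
print "BLASTN query=$q sbjct=$s\n";
```

For BLASTN both lengths come back untouched, which is why this code path has nothing to do with the roughly 30 nucleotides removed at NCBI.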
>>
>> But it seems there is no length adjustment for blastn in Bioperl,
>> whereas there seems to be one on NCBI.
>>
>> It's kind of frustrating, as I am trying to do some differential
>> expression analysis with my own scripts. If these 2 seqs are so
>> nearly identical they should have the same annotation, but they do
>> not because of those strange BLAST results.
>
> No idea what you mean by the rest of this when it comes to your
> candidate RNA sequences or what you are seeking to find from the
> BLAST searches to help you on that front.
>>
>> I am really sorry if my post is a bit messy. If you have any
>> questions about what I meant, please ask.
>>
>> Any comments would be greatly appreciated!
>>
>> Cheers
>> D.
>>
>
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org