[Bioperl-l] standalone blast outformat

dimitark at bii.a-star.edu.sg dimitark at bii.a-star.edu.sg
Thu Oct 31 05:54:00 UTC 2013


Hi guys,
thank you all for your help.
I think the GenericHit methods frac_XXXX will do the job for me. I  
somehow overlooked these ones.Sorry for that.

Now time to rewrite my program n see how it goes :)
Thank you again
D.


Quoting "Fields, Christopher J" <cjfields at illinois.edu>:

> On Oct 30, 2013, at 4:32 AM, <dimitark at bii.a-star.edu.sg>  
> <dimitark at bii.a-star.edu.sg> wrote:
>
>> Hi Florent and Jason,
>> Thank you for your replies.
>>
>> well i spent quite some time to figure that out. But still no results.
>> Yes i use BLAST+. So I tried BLASTN as it is, without bioperl. Just  
>> to see if i can do: blastn -outfmt "7 qcovs other_things". Well it  
>> works like this. But in bioperl can not pass do that. I dont know  
>> why tho.
>
> You are correct in your assumption, that you have to specify the following:
>
>     -outfmt "7 qcovs ??
>
> I?ve run into this myself.  Based on the error you are getting when  
> trying this option, I?m pretty sure the ?custom? tab-delimited  
> option isn't available in the BLAST+ wrapper (e.g. it?s a bug).
>
>> After i tried to understand how tile_hsps() works. But no results.  
>> Somehow the documentation on that matter is not clear and not a  
>> simple example is present. So i could not figure out how to use it.
>
> The tiling code in bioperl is in a much better state thanks to Mark  
> Jensen.  The hit-level methods do some of the work for you; if you  
> use any of the frac_* (frac_identical, frac_conserved,  
> frac_aligned_query) methods in GenericHit you are using it w/o  
> knowing.  The last one seems like it *might* be what you want:
>
>  Usage     : $hit_object->frac_aligned_query();
>  Purpose   : Get the fraction of the query sequence which has been aligned
>            : across all HSPs (not including intervals between non-overlapping
>            : HSPs).
>  Example   : $frac_alnq = $hit_object->frac_aligned_query();
>  Returns   : Float (2-decimal precision, e.g., 0.75).
>  Argument  : n/a
>  Throws    : n/a
>  Comments  : If you need data for each HSP, use hsps() and then interate
>            : through the HSP objects.
>            : This method requires that all HSPs be tiled. If they have not
>            : already been tiled, they will be tiled first automatically.
>
>> Now i am trying to figure out how to use FASTA or SSEARCH to solve  
>> my problem. I suppose i have to get the sequences of the HSPs  
>> representing my hit and then run ssearch like this:
>>
>>   ssearch my_query_seq.fa my_hit_hsps.fa
>>
>> Then parse the output. Just have to figure out what exactly do i  
>> need from the output :)
>
> True.  Let us know if you run into problems with that.
>
> chris
>
>> Thank you again
>> D.
>>
>> Quoting Jason Stajich <jason.stajich at gmail.com>:
>>
>>> Assuming 1 HSP you can do:
>>>
>>> my $qcov = $hsp->query->length / $query->length;
>>>
>>> Otherwise you have to add up the HSPs and their coverage (and  
>>> remove redundancy of overlapping hits) to determine what parts of  
>>> the query were aligned across all the HSPs that were found. You  
>>> can try the tile_hsps function to also get this. Or if you are  
>>> using WUBLAST you can get the -links option to grab the logical  
>>> groups of non-overlapping HSPs to tile across.
>>>
>>> Or use FASTA or SSEARCH which will provide a single longest path  
>>> through the alignment.
>>>
>>>
>>> On Oct 29, 2013, at 10:12 PM, Florent Angly  
>>> <florent.angly at gmail.com> wrote:
>>>
>>>> Hi Dimitar,
>>>>
>>>> I suppose you are using BLAST+, not legacy BLAST. There is a HOWTO
>>>> here that can get you started:
>>>> http://www.bioperl.org/wiki/HOWTO:BlastPlus
>>>> Regarding your issue, I had a look at the blastn manual, and I doesn't
>>>> look like you can write '-outfmt "7 qcovs ..."' to me. If anything, it
>>>> would be '-outfmt "qcovs ..."'.
>>>> So, maybe try to get your command to work as intended in a terminal
>>>> before putting it in BioPerl.
>>>> Best,
>>>>
>>>> Florent
>>>>
>>>>
>>>> On Wed, Oct 30, 2013 at 1:45 PM,  <dimitark at bii.a-star.edu.sg> wrote:
>>>>> Hi guys,
>>>>> i was wondering how can i get the query coverage when blasting  
>>>>> with Bioperl.
>>>>> So i figure out one way is to specify the outformat to tabular(7) and the
>>>>> specify that i want the qcovs(query coverage per subject). From  
>>>>> BLAST manual
>>>>> i see one can do: -outfmt "7 qcovs ..." but in Bioperl i can only say:
>>>>> -outformat => 7. If i do: outformat => "7 qcovs" i get an error.
>>>>>
>>>>> Well another option i suppose is to get the lengths of all hsps  
>>>>> from a hit
>>>>> and estimate the query coverage that way but im not exactly sure  
>>>>> if this is
>>>>> correct way.
>>>>>
>>>>> Any idea how can i get the query coverage?
>>>>>
>>>>> Thank you very much for your time and any hints
>>>>>
>>>>> Regards
>>>>> Dimitar
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> Jason Stajich
>>> jason.stajich at gmail.com
>>> jason at bioperl.org
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Chris Fields
> Technical Lead in Genome Informatics
> High Performance Computing in Biology
> W.M. Keck Center for Comparative and Functional Genomics
> Institute for Genomic Biology
> 1206 W. Gregory Dr., MC-195
> Urbana, IL 61801
> Phone: (217) 244-1890
> http://www.hpcbio.illinois.edu






More information about the Bioperl-l mailing list