[Biopython-dev] [Biopython - Bug #3419] Bio.SearchIO.FastaIO

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Tue Jul 2 22:14:21 UTC 2013


Issue #3419 has been updated by Wibowo Arindrarto.


Hi Jason,

Apologies for the very late reply. Apparently the notification of your reply never reached my inbox, and I forgot to check the page manually :(. Fortunately, I met Peter and he pointed this out :).

IIRC, the parser does store the name of the program that created the results (the QueryResult.program attribute), so we can deal with strand/frame accordingly. There is, however, no standard way to store the strand information of the 'parent' sequence (in this case, the DNA that was the template of the protein). I'll poke around to see if this has been dealt with elsewhere.
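
As a rough illustration of dealing with strand/frame based on the program name, here is a minimal sketch. The helper names (translated_side, strand_from_frame) are hypothetical and not part of Bio.SearchIO. It also assumes that the program name contains "fastx", "fasty", "tfastx" or "tfasty", and that the frame is available as a signed integer in the BLAST style (negative meaning the reverse strand); the actual FASTA parser may expose these differently.

```python
# Hypothetical helpers for illustration only; not part of Bio.SearchIO.

# fastx/fasty translate the DNA query; tfastx/tfasty translate the DNA subject.
# The exact string stored in QueryResult.program is an assumption here.
QUERY_TRANSLATED = ("fastx", "fasty")
HIT_TRANSLATED = ("tfastx", "tfasty")


def translated_side(program):
    """Return 'query', 'hit', or None for the side that is translated DNA."""
    name = program.lower()
    # Check the tfast* names first, because "tfastx" also contains "fastx".
    if any(p in name for p in HIT_TRANSLATED):
        return "hit"
    if any(p in name for p in QUERY_TRANSLATED):
        return "query"
    return None


def strand_from_frame(frame):
    """Map a signed reading frame to a strand: 1, -1, 0, or None if unknown."""
    if frame is None:
        return None
    if frame > 0:
        return 1
    if frame < 0:
        return -1
    return 0
```

With these, translated_side(qresult.program) would say which side of an HSP should carry the strand, and strand_from_frame would give the value to store there.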

Anyway, your patch does look OK for the time being. The BLASTX parser handles this information the same way (storing the reading frame in the protein Seq object). Would you like to submit it through GitHub? I'd be happy to commit it on your behalf as well :).
----------------------------------------
Bug #3419: Bio.SearchIO.FastaIO
https://redmine.open-bio.org/issues/3419

Author: Jason Stajich
Status: New
Priority: Low
Assignee: Biopython Dev Mailing List
Category: Main Distribution
Target version: 
URL: 


The strand of the translated sequence (query or subject, depending on the analysis) is lost for tfastx/y and fastx/y reports.

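from __future__ import print_function  # same print behaviour on Python 2 and 3

from Bio import SearchIO

# Parse a FASTA -m 10 report and print the coordinates and strand of each HSP.
qresults = SearchIO.parse('test.FASTY.out', 'fasta-m10')
for qresult in qresults:
    for hit in qresult:
        for hsp in hit.hsps:
            print(qresult.id, " ", hit.id, " ",
                  hsp.query_start, "..", hsp.query_end, " ", hsp.query_strand, " ",
                  hsp.hit_start, "..", hsp.hit_end, " ", hsp.hit_strand)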

