[Biopython-dev] [Bug 2143] New: Error parsing BLAT output (using out=blast format)

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Tue Nov 14 20:48:49 UTC 2006


http://bugzilla.open-bio.org/show_bug.cgi?id=2143

           Summary: Error parsing BLAT output (using out=blast format)
           Product: Biopython
           Version: Not Applicable
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: fgibbons at hms.harvard.edu


Attempting to parse the BLAT output below raises an "I couldn't find the
sbjct in" exception.

After looking at the code, it seems to me that the problem is an overly strict
regexp that relies on a single space between "Sbjct:" and the integer that
follows it. Replacing the literal space with '\s*' makes the error go away, and
brings the pattern in line with the regexp used to match "Query:". I can't
imagine that it would hurt anything, even in the main NCBIBlastParser, but you
never know.

(All of the above refers to the method sbjct in class _HSPConsumer, file
NCBIStandalone.py)
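
To illustrate the proposed change, here is a minimal sketch in Python. The two
patterns and the helper parse_sbjct are hypothetical stand-ins written for this
example, not the actual code in NCBIStandalone.py, and the sample lines are
shortened. The sketch only assumes what is described above: one pattern demands
exactly one space after "Sbjct:", the other allows any amount of whitespace.

```python
import re

# Hypothetical patterns for illustration only; not copied from NCBIStandalone.py.
# Strict: exactly one literal space between "Sbjct:" and the start coordinate.
STRICT_SBJCT = re.compile(r"Sbjct: (\d+)\s*(.+) (\d+)")
# Relaxed: the literal space is replaced with \s*, as proposed in this report.
RELAXED_SBJCT = re.compile(r"Sbjct:\s*(\d+)\s*(.+) (\d+)")


def parse_sbjct(line, pattern):
    """Return (start, sequence, end), or None if the line does not match."""
    m = pattern.search(line)
    if m is None:
        return None
    start, seq, end = m.groups()
    return int(start), seq.strip(), int(end)


# Shortened sample lines: one with a single space, one padded like a Query line.
single_space = "Sbjct: 192 MAINSGTRRL 245"
padded = "Sbjct:   1 MAINSGTRRL 54"

for line in (single_space, padded):
    print(repr(line))
    print("  strict :", parse_sbjct(line, STRICT_SBJCT))
    print("  relaxed:", parse_sbjct(line, RELAXED_SBJCT))
```

With these assumed patterns, the strict version returns None for the padded
line, while the relaxed version parses both lines to the same fields.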

-Frank Gibbons (fgibbons at hms.harvard.edu)
-------------------------------------

Reference:  Kent, WJ. (2002) BLAT - The BLAST-like alignment tool

Query=  NCU00001
        (54 letters)

Database:  all_proteins.fasta
           293697 sequences; 128,064,135 total letters

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

MGG_10872.5                                                           101  1e-21



>MGG_10872.5
          Length = 245

 Score = 101 bits (260), Expect = 1e-21
 Identities = 54/54 (100%), Positives = 54/54 (100%), Gaps = 0/54 (0%)

Query:   1 MAINSGTRRLKNSVYNPLAEISVYVGKIKISLIEVISNIVKEKNPEVFIIRIRL 54
           MAINSGTRRLKNSVYNPLAEISVYVGKIKISLIEVISNIVKEKNPEVFIIRIRL
Sbjct: 192 MAINSGTRRLKNSVYNPLAEISVYVGKIKISLIEVISNIVKEKNPEVFIIRIRL 245

  Database: all_proteins.fasta
BLASTP 2.2.4 [blat]

