[Biopython-dev] [Bug 1704] New: problem with Bio.Blast.NCBIStandalone
bugzilla-daemon at portal.open-bio.org
Mon Oct 25 06:56:56 EDT 2004

http://bugzilla.open-bio.org/show_bug.cgi?id=1704
           Summary: problem with Bio.Blast.NCBIStandalone
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: gebauer-jung at ice.mpg.de
If the BLAST database is defined through a *.nal alias file that restricts it with a GI list, like this:
#
TITLE insects
#
DBLIST ./nr
#
GILIST insects.list
#
the database report at the end of the BLAST output file looks like this:
...
Query: 2656 accaacaaaaccaacatca 2674
            |||||||||||||||||||
Sbjct: 85   accaacaaaaccaacatca 67
  Subset of the database(s) listed below
     Number of letters searched: 562,618,960
     Number of sequences searched:  228,924
  Database: insects
    Posted date:  Oct 17, 2004 10:00 PM
  Number of letters in database: 3,987,564,307
  Number of sequences in database:  991,337
  Database: /bio/blast/./nt.01
    Posted date:  Oct 17, 2004 11:04 PM
  Number of letters in database: 3,989,920,418
  Number of sequences in database:  760,163
  
  Database: /bio/blast/./nt.02
    Posted date:  Oct 18, 2004  2:00 AM
  Number of letters in database: 3,989,747,597
  Number of sequences in database:  888,596
  Database: /bio/blast/./nt.03
    Posted date:  Oct 15, 2004  1:00 AM
  Number of letters in database: 14,716,213
  Number of sequences in database:  1558
Lambda     K      H
    1.37    0.711     1.31
Gapped
Lambda     K      H
    1.37    0.711     1.31
...
The 'Subset of the database(s) ...' line causes _Scanner._scan_database_report()
to crash, because the scanner expects the report to begin with a '  Database' line.
Bio.Blast.Record also has no field to hold this data (if there were any need to keep it).
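If the subset counts were ever worth keeping, they could be pulled out with a small regular expression. The following is only a sketch: parse_subset_block is a hypothetical helper, not part of Biopython, and the returned dictionary layout is an assumption made for illustration.

```python
import re

# Hypothetical helper, not part of Biopython. It matches the two
# "searched" lines of the subset block shown in the report above.
_SUBSET_FIELD = re.compile(
    r"^\s*Number of (letters|sequences) searched:\s*([\d,]+)\s*$"
)


def parse_subset_block(lines):
    """Return a dict such as {'letters': int, 'sequences': int}.

    Lines that do not match (for example the 'Subset of the
    database(s) listed below' header) are ignored.
    """
    counts = {}
    for line in lines:
        match = _SUBSET_FIELD.match(line)
        if match:
            # Strip the thousands separators before converting.
            counts[match.group(1)] = int(match.group(2).replace(",", ""))
    return counts


subset_lines = [
    "  Subset of the database(s) listed below",
    "     Number of letters searched: 562,618,960",
    "     Number of sequences searched:  228,924",
]
subset_counts = parse_subset_block(subset_lines)
```
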
As a work-around I suggest the following change in Bio/Blast/NCBIStandalone.py:
422,423c422,432
<       while 1:
<             read_and_call(uhandle, consumer.database, start='  Database')
---
>
>         while 1:
>         #      read_and_call(uhandle, consumer.database, start='  Database')
>         # work-around to skip:
>         #  Subset of the database(s) listed below
>         #  Number of letters searched: 562,618,960
>         #  Number of sequences searched:  228,924
>         #
>         # even Record.DatabaseRecord does not contain any structure to keep this stuff
>             read_and_call_until(uhandle, consumer.database, start='  Database')
>
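
The difference between a strict read and a tolerant one can be shown without Biopython. This is a simplified stand-in under stated assumptions: read_and_expect and skip_until are hypothetical names, not the Biopython helpers, and skip_until simply discards the lines it passes over.

```python
import io


def read_and_expect(handle, on_line, start):
    """Strict reader: the very next line must begin with `start`."""
    line = handle.readline()
    if not line.startswith(start):
        raise ValueError("unexpected line: %r" % line)
    on_line(line)


def skip_until(handle, start):
    """Tolerant reader: discard lines until the next one begins with `start`.

    Returns the skipped lines. The matching line is left unread so that
    a following strict read can consume it.
    """
    skipped = []
    while True:
        position = handle.tell()
        line = handle.readline()
        if not line or line.startswith(start):
            handle.seek(position)  # push the line back
            return skipped
        skipped.append(line)


REPORT = (
    "  Subset of the database(s) listed below\n"
    "     Number of letters searched: 562,618,960\n"
    "     Number of sequences searched:  228,924\n"
    "  Database: insects\n"
)

# The strict reader fails on the first line of the subset block.
strict_failed = False
try:
    read_and_expect(io.StringIO(REPORT), lambda line: None, "  Database")
except ValueError:
    strict_failed = True

# Skipping first lets the strict read succeed on the Database line.
handle = io.StringIO(REPORT)
skipped = skip_until(handle, "  Database")
seen = []
read_and_expect(handle, seen.append, "  Database")
```

The sketch only reproduces the control flow; it makes no claim about how the consumer in NCBIStandalone treats the extra lines.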