[Biopython-dev] Fwd: [blast-announce] Correction: BLAST 2.2.24 release announcement

Michiel de Hoon mjldehoon at yahoo.com
Sat Sep 4 15:23:16 UTC 2010


Being able to convert Blast ASN.1 output into any of the other formats will make a big difference to us. If we had a parser for ASN.1 Blast output, then strictly speaking there would be no reason to have a parser for any of the other formats (in practice, of course, we can be more flexible).

I looked some more into the Blast parser issues we discussed earlier (starting here: http://lists.open-bio.org/pipermail/biopython-dev/2010-May/007762.html). Unfortunately, things are not as easy as I had hoped. Except for the new ASN.1 output format, none of the output formats (plain text, XML, tabular) contains all of the output generated by the Blast run. Some results are found only in the XML, some only in the plain text output, and the contents of the tabular output depend on the exact options that were used. As a consequence, it's hard to design a generic Blast record class; having a specialized Record class for each of plain text, XML, and tabular seems more appropriate, and these record classes may not be fully consistent with each other (some elements may exist in one class but not in another). Also, we cannot read in the Blast output in one format and write it out in a different format (at least not reliably).
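
To illustrate the tabular case, here is a minimal Python sketch, assuming tab-separated output as produced by BLAST+ with -outfmt 6. The helper name parse_tabular_line and the sample hit lines are hypothetical and made up for illustration; only the default column list is taken from BLAST+ itself. The point is that the set of fields in a record depends on the columns the user requested, so a record parsed from one run need not have the same fields as one parsed from another.

```python
# Default column names for BLAST+ tabular output (-outfmt 6) when the user
# does not supply a custom column list.
DEFAULT_COLUMNS = (
    "qseqid sseqid pident length mismatch gapopen "
    "qstart qend sstart send evalue bitscore"
).split()


def parse_tabular_line(line, columns=DEFAULT_COLUMNS):
    """Map one tab-separated hit line onto column names (hypothetical helper).

    Values are kept as strings; no type conversion is attempted.
    """
    values = line.rstrip("\n").split("\t")
    if len(values) != len(columns):
        raise ValueError(
            "expected %d fields, found %d" % (len(columns), len(values))
        )
    return dict(zip(columns, values))


# Made-up hit line using the default twelve columns.
default_hit = parse_tabular_line(
    "query1\tsubject1\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-100\t365"
)

# Made-up hit line from a run that requested only three columns,
# e.g. -outfmt "6 qseqid sseqid evalue".
custom_hit = parse_tabular_line(
    "query1\tsubject1\t1e-100", ["qseqid", "sseqid", "evalue"]
)

print(sorted(default_hit))
print(sorted(custom_hit))
```

The second record has no bitscore or pident field at all, which is the kind of inconsistency a single generic record class would have to paper over.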

With the format converter in Blast 2.2.24, luckily there is no longer a need for such converters in Biopython. If we had an ASN.1 parser, we could run Blast, save its output in ASN.1, load the Blast output into Python, filter the Blast output or otherwise modify it, write out the modified output in ASN.1 format, and then use the Blast 2.2.24 format converter to convert the modified output to plain text or some other format. That would be really useful.
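
The first and last steps of that workflow can be sketched in Python as below. This is only a sketch under assumptions: the function names, the file names (query.fasta, result.asn, result.xml) and the database name (nt) are placeholders, and the commands are only built here, not executed. The middle step (loading and modifying the ASN.1 in Python) is left out because it needs the ASN.1 parser that does not exist yet.

```python
import subprocess


def blast_archive_command(query, db, archive, program="blastn"):
    """Build a BLAST+ command that saves results as ASN.1 (-outfmt 11)."""
    return [program, "-query", query, "-db", db,
            "-outfmt", "11", "-out", archive]


def format_archive_command(archive, outfmt, out):
    """Build a blast_formatter command converting an ASN.1 archive.

    outfmt is the usual BLAST+ format number, e.g. 0 for pairwise plain
    text, 5 for XML, 6 for tabular.
    """
    return ["blast_formatter", "-archive", archive,
            "-outfmt", str(outfmt), "-out", out]


def run(cmd):
    """Run a command; requires the BLAST+ 2.2.24 tools on the PATH."""
    subprocess.check_call(cmd)


# Placeholder file and database names, for illustration only.
search_cmd = blast_archive_command("query.fasta", "nt", "result.asn")
xml_cmd = format_archive_command("result.asn", 5, "result.xml")

print(" ".join(search_cmd))
print(" ".join(xml_cmd))
```

Calling run(search_cmd) followed by run(xml_cmd) would perform the search once and then produce XML from the saved archive; the same archive could be passed to format_archive_command again with a different outfmt value.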

Unfortunately, making a parser for ASN.1 will not be so easy. As far as I know, there is nothing for ASN.1 comparable to what expat or DOM provide for XML. Maybe this is something for a Google Summer of Code project?

--Michiel.

--- On Tue, 8/24/10, Peter <biopython at maubp.freeserve.co.uk> wrote:

> From: Peter <biopython at maubp.freeserve.co.uk>
> Subject: [Biopython-dev] Fwd: [blast-announce] Correction: BLAST 2.2.24 release announcement
> To: "Biopython-Dev Mailing List" <biopython-dev at biopython.org>
> Date: Tuesday, August 24, 2010, 12:30 PM
> Hi all,
> 
> The NCBI have just released a new version of BLAST+ (see below).
>
> I've just updated the existing BLAST+ application wrappers for the minor
> changes made in BLAST 2.2.24+.
> 
> Something potentially quite useful in this release is the blast_formatter
> command for turning ASN.1 BLAST+ output (using -outfmt 11) into any of
> the other output formats. i.e. If you are not sure what output format
> will be most useful (e.g. plain text, XML, tabular) and rerunning the
> BLAST is slow, the NCBI now let you run the BLAST once and save it as
> ASN.1, then convert this to any other format on demand using
> blast_formatter (which should be fast).
> 
> We should write a command line wrapper for this new tool...
> 
> Peter
> 
> ---------- Forwarded message ----------
> From: mcginnis <mcginnis at ncbi.nlm.nih.gov>
> Date: Tue, Aug 24, 2010 at 4:46 PM
> Subject: [blast-announce] Correction: BLAST 2.2.24 release announcement
> To: NLM/NCBI List blast-announce <blast-announce at ncbi.nlm.nih.gov>
> 
> 
> A new version of the stand-alone applications is available.
>
> Users are encouraged to use the BLAST+ applications available at
> ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
> This release includes a number of bug fixes as well as new features
> for the BLAST+ applications:
> 
> * Introduce BLAST Archive format to permit reformatting of stand-alone
>   BLAST searches with the blast_formatter (see BLAST+ user manual)
> * Added the blast_formatter application (see BLAST+ user manual)
> * Added support for translated subject soft masking in the BLAST
>   databases
> * Added support for the BLAST Trace-back operations (btop) output format
> * Added command line options to blastdbcmd for listing available BLAST
>   databases
> * Improved performance of formatting of remote BLAST searches
> * Use a consistent exit code for out of memory conditions
> * Fixed bug in indexed megablast with multiple space-separated BLAST
>   databases
> * Fixed bugs in legacy_blast.pl, blastdbcmd, rpsblast, and makeblastdb
> * Fixed Windows installer for 64-bit installations
> 
> BLAST+ applications, as well as the legacy C applications (e.g.
> blastall), may be downloaded from
> http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download
> 


      



