[Bioperl-l] Fwd: FASTA version numbers
Nathan Haigh
N.Haigh at sheffield.ac.uk
Sat Jan 13 10:05:36 UTC 2007
Before the 1.5.2 release there was some talk about being able to obtain version numbers from FASTA that could be easily compared computationally.
Unfortunately, the FASTA version strings contained non-numeric characters, and the programs didn't output the full version number. I mentioned the problems and made a couple
of suggestions. I have just received this reply from Bill Pearson (the author of the FASTA programs) and thought I'd post it to the list FYI.
Nath
----- Forwarded message from "William R. Pearson" <wrp at virginia.edu> -----
Date: Fri, 12 Jan 2007 15:59:33 -0500
From: "William R. Pearson" <wrp at virginia.edu>
Reply-To: "William R. Pearson" <wrp at virginia.edu>
Subject: FASTA version numbers
To: n.haigh at sheffield.ac.uk
Several people have asked me to simplify (or perhaps just
rationalize) the version numbers used by FASTA (see below). The
version string makes some sense to me (and should be logged in the
readme.v34t0 file), but I can see why it causes problems.
With the next release, I will go to a new system - my CVS tags will
be of the form fasta-34_26_x (since CVS does not allow decimal points
in a version tag) and, within the program output, you will see
fasta-34.26 (typically without the last number, but with a date).
The actual filenames on the FTP site will be fasta-34.26.2.shar.Z or
fasta-34.26.2.tgz.
This should address most of the problems.
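As a rough illustration of how the three forms above (CVS tag, program output, FTP filename) could be reduced to one comparable value, here is a minimal Python sketch. The helper name parse_fasta_version and the choice to treat a missing third number as 0 are assumptions made for this example; nothing here is part of FASTA or Bioperl.

```python
import re

# Hypothetical helper for illustration only; not part of FASTA or Bioperl.
# Accepts "." or "_" as the separator so that CVS tags (fasta-34_26_x),
# program output (fasta-34.26) and filenames (fasta-34.26.2.tgz) all match.
_VERSION_RE = re.compile(r"fasta-(\d+)[._](\d+)(?:[._](\d+))?")


def parse_fasta_version(text):
    """Return (major, minor, patch) as integers from a new-style version string."""
    match = _VERSION_RE.search(text)
    if match is None:
        raise ValueError(f"no FASTA version found in {text!r}")
    major, minor, patch = match.groups()
    # Assumption: a missing or non-numeric third component counts as 0.
    return (int(major), int(minor), int(patch) if patch is not None else 0)


if __name__ == "__main__":
    for name in ("fasta-34_26_2", "fasta-34.26", "fasta-34.26.2.shar.Z"):
        print(name, parse_fasta_version(name))
```

Comparing tuples of integers rather than floats avoids misordering: as floats, 34.9 sorts above 34.26, whereas (34, 26, 0) correctly sorts above (34, 9, 0).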
However, part of the problem is that there are several versions
associated with the program - in particular the version printed at
the beginning of the output:
=====================================
SSEARCH searches a sequence data bank
version 34.26 January 12, 2007
=====================================
and at the end:
=====================================
218 residues in 1 query sequences
83566858 residues in 223447 library sequences
Tcomplib [34.26] (2 proc)
start: Fri Jan 12 15:22:33 2007 done: Fri Jan 12 15:22:39 2007
Total Scan time: 11.490 Total Display time: 0.130
Function used was SSEARCH [version 34.26 January 12, 2007]
=====================================
which looks different from the one printed in the algorithm description:
=====================================
Smith-Waterman (SSE2, Michael Farrar 2006) (5.5 Sept 2006) function
[BL50 matrix (15:-5)xS], open/ext: -10/-2
=====================================
or
=====================================
FASTX (3.5 Sept 2006) function [optimized, BL50 matrix (o=15:-5:-1:-1)xS] ktup:2
=====================================
The algorithm version strings and dates have little to do with each
other because the algorithms are revised much more rarely than the
main wrapper programs.
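Since only the wrapper version in the banner and footer is useful for comparing installations, a parser can key on the word "version" and ignore the parenthesised algorithm strings. The following is a minimal Python sketch based solely on the sample output shown above; the function name parse_banner is hypothetical, and the pattern only covers the new numeric scheme (it does not attempt older strings such as 3.4t26).

```python
import re
from datetime import datetime

# Matches e.g. "version 34.26 January 12, 2007" in the banner or the footer.
# Assumption: the date is always "<full month name> <day>, <year>".
_BANNER_RE = re.compile(r"version\s+(\d+)\.(\d+)\s+([A-Z][a-z]+ \d{1,2}, \d{4})")


def parse_banner(output):
    """Return ((major, minor), release_date), or None if no banner is found."""
    match = _BANNER_RE.search(output)
    if match is None:
        return None
    version = (int(match.group(1)), int(match.group(2)))
    # %B expects an English month name under the default C locale.
    released = datetime.strptime(match.group(3), "%B %d, %Y").date()
    return version, released


if __name__ == "__main__":
    print(parse_banner("Function used was SSEARCH [version 34.26 January 12, 2007]"))
    # Algorithm description lines carry no "version" keyword, so they are skipped.
    print(parse_banner("FASTX (3.5 Sept 2006) function"))
```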
Hopefully, the new system will make it easier to keep track of things.
Bill Pearson
Begin forwarded message:
> From: "Nathan S. Haigh" <n.haigh at sheffield.ac.uk>
> Date: November 6, 2006 12:52:25 PM EST
> Subject: FASTA versioning
>
> Dear Prof. Pearson,
>
> I am trying to extend a Bioperl module that works as a wrapper for the
> FASTA programs. I am currently trying to build a subroutine that
> extracts the version number of the installed FASTA program for later
> comparison. However, the versioning system you have employed makes it
> difficult to compare the extracted version strings computationally,
> as they are not pure floating point numbers.
>
> Please could you let me know what your current versioning scheme for
> FASTA releases is? I.e. what does t24 mean, and what does b2 mean? This
> might then allow me to attempt version comparisons successfully. In
> addition, it appears that the version printed when you start one of the
> programs does not display the full version information (as indicated by
> the downloaded file) and also shows what I assume is a release date,
> e.g. version 3.4t26 July 7, 2006 rather than 3.4t26b2 as indicated by
> the downloaded file (I'm not even sure if this version matches the
> filename - just an example).
>
> A more standard floating point number (3.4262) or 3-4 numbers separated
> by decimal points (3.4.26.2) would make computational comparisons far
> easier.
>
> Kind regards
> Nathan
>
----- End forwarded message -----