[Bioperl-l] Changes in FASTA output format
Chris Fields
cjfields at uiuc.edu
Sat Mar 31 00:51:38 UTC 2007
On Mar 30, 2007, at 3:24 PM, David Messina wrote:
>> I could even imagine tagging the lines:
>>
>> Algorithm: Smith-Waterman (SSE2, Michael Farrar 2006) (6.0 Mar 2007)
>> Parameters: BL50 matrix (15:-5), open/ext: -12/-2
>> Scan time: 2.140
>
> IMO, tagged lines would be great and make parsing very easy.
>
>
>> (2) I am also thinking about displaying multiple E()-values,
>> depending on whether they are calculated from the similarity search
>> or the shuffled high scores, e.g., going from:
>>
>> [...]
>>
>> I think this output would break many more FASTA parsers, and one
>> option would be (initially) to add it only to the alignment output.
>
> Agreed, but...
>
>
>> Naturally, initially it will be easy to revert to the classic format.
>
> I think the backwards compatibility you describe here would take care
> of those cases.
>
>
> My two cents (and thanks for asking :),
> Dave
>
> --
> Dave Messina
> Senior Analyst, Assembly Group
> Genome Sequencing Center
> Washington University
> St. Louis, MO
If it ever becomes a problem, we can hand parsing off to
version-specific parser methods (one for the old format, one for the
new) or just try to evaluate them separately (a la SearchIO::blast). If
there are tags that make the new format distinguishable from the old,
such as the "Algorithm:" or "Parameters:" lines above, then that would
be a good point to catch the difference and dispatch to the
appropriate method.
We'll need to add this to the Project Priority List...
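
The hand-off described above could look roughly like the sketch below. It
is written in Python purely for illustration and is not BioPerl or
SearchIO code. The function names (is_new_format, parse_new, parse_old,
parse_report) are hypothetical, and the tag names are assumed from the
example lines quoted earlier in this message, since the final format has
not been settled.

```python
import re

# Assumed tag names, taken from the example lines quoted in this thread.
NEW_FORMAT_TAGS = ("Algorithm:", "Parameters:")

# A tagged line is "Tag name: value", where the tag is letters and spaces.
TAGGED_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?):\s+(.*\S)\s*$")


def is_new_format(lines):
    """Return True if any line starts with one of the new-style tags."""
    return any(line.lstrip().startswith(NEW_FORMAT_TAGS) for line in lines)


def parse_new(lines):
    """Collect 'Tag: value' lines into a dict (new-style report)."""
    header = {}
    for line in lines:
        match = TAGGED_LINE.match(line)
        if match:
            header[match.group(1)] = match.group(2)
    return {"format": "new", "header": header}


def parse_old(lines):
    """Stand-in for the classic parser; it only keeps the raw lines."""
    return {"format": "old", "header": {}, "raw": list(lines)}


def parse_report(text):
    """Check for the distinguishing tags once, then hand off to one parser."""
    lines = text.splitlines()
    parser = parse_new if is_new_format(lines) else parse_old
    return parser(lines)
```

Calling parse_report() on text containing the three tagged lines quoted
above would take the parse_new() path and return the tag values keyed by
"Algorithm", "Parameters" and "Scan time"; a report without those tags
falls through to parse_old(). Checking once up front keeps the two
parsers independent, so the classic one does not have to change.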
chris