[Bioperl-l] Changes in FASTA output format

Chris Fields cjfields at uiuc.edu
Sat Mar 31 00:51:38 UTC 2007


On Mar 30, 2007, at 3:24 PM, David Messina wrote:

>> I could even imagine tagging the lines:
>>
>>   Algorithm:  Smith-Waterman (SSE2, Michael Farrar 2006) (6.0 Mar 2007)
>>   Parameters:  BL50 matrix (15:-5), open/ext: -12/-2
>>   Scan time:  2.140
>
> IMO, tagged lines would be great and make parsing very easy.
>
>
>> (2)  I am also thinking about displaying multiple E()-values,
>> depending on whether they are calculated from the similarity search
>> or the shuffled high scores, e.g., going from:
>>
>> [...]
>>
>> I think this output would break many more FASTA parsers, and one
>> option would be (initially) to add it only to the alignment output.
>
> Agreed, but...
>
>
>> Naturally, initially it will be easy to revert to the classic format.
>
> I think the backwards compatibility you describe here would take care
> of those cases.
>
>
> My two cents (and thanks for asking :),
> Dave
>
> --
> Dave Messina
> Senior Analyst, Assembly Group
> Genome Sequencing Center
> Washington University
> St. Louis, MO

If it ever becomes a problem, we can hand off parsing to
version-specific parser methods (one for the old format, one for the
new), or just try to evaluate the two formats separately (à la
SearchIO::blast).  If there are tags that make the new format
distinguishable from the old, such as the "Algorithm:" or "Parameters:"
above, then those would be a good point to catch the difference and
pass off to the appropriate method.
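
As a rough illustration of that dispatch idea (not BioPerl code, and not
how SearchIO::fasta is actually written), here is a minimal Python
sketch.  The tag names are taken from the proposed output quoted above;
the function names (detect_format, parse_tagged, parse_classic,
parse_header) are hypothetical, and the classic parser is only a
placeholder.

```python
import re

# Assumption: the new format starts header lines with one of these tags,
# as in the proposed output quoted earlier in this thread.
TAG_RE = re.compile(r"^\s*(Algorithm|Parameters|Scan time):\s+(.*)$")


def detect_format(lines):
    """Return 'tagged' if any line carries a known tag, else 'classic'."""
    for line in lines:
        if TAG_RE.match(line):
            return "tagged"
    return "classic"


def parse_tagged(lines):
    """Collect tag/value pairs from tagged header lines."""
    header = {}
    for line in lines:
        match = TAG_RE.match(line)
        if match:
            header[match.group(1)] = match.group(2).strip()
    return header


def parse_classic(lines):
    """Placeholder for the existing (untagged) header parser."""
    return {}


# One parser per format version; the detector picks which one to call.
PARSERS = {"tagged": parse_tagged, "classic": parse_classic}


def parse_header(lines):
    """Detect the format once, then hand off to the matching parser."""
    fmt = detect_format(lines)
    return fmt, PARSERS[fmt](lines)
```

Calling parse_header() on the tagged lines quoted above would select the
"tagged" parser and return the values keyed by tag name; a header with
no recognised tags falls through to the classic parser.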

We'll need to add this to the Project Priority List...

chris


