[Bioperl-l] Announcing Bio::SFF

Peter Cock p.j.a.cock at googlemail.com
Mon Dec 19 16:15:15 UTC 2011


On Mon, Dec 19, 2011 at 3:48 PM, Leon Timmermans
<l.m.timmermans at students.uu.nl> wrote:
> On Mon, Dec 19, 2011 at 3:31 PM, Peter Cock wrote:
>
>> Have you looked at the sample SFF data in Biopython? Please
>> use them for the BioPerl unit tests (we've been talking about a
>> cross project collection of test data files like this), the README
>> file should be self-explanatory:
>> https://github.com/biopython/biopython/tree/master/Tests/Roche
>
> Yeah, I'm using those now
> (https://github.com/Leont/bio-sff/blob/master/t/reader.t).

Could you add a link to your /corpus/README.txt file pointing
back to the Biopython original for acknowledgement and
future reference?

>
> I must say there were some interesting corner cases in it.
>

I'm glad you agree - and if you can think of any more special
cases to verify that would be great.

Are you doing just SFF parsing for now? Not writing?

Now, as to Bio::SeqIO integration, Biopython's SeqIO uses
format name "sff" to mean the full read sequence (with mixed
case, upper case for the good sequence, lower case for any
left/right clipping - as in the Roche tools), and "sff-trim" to mean
the trimmed sequences. I would encourage you to do the
same, as part of the general aim of having consistent
sequence format names between BioPerl, Biopython, and
EMBOSS, where possible.
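
To illustrate the difference between the two format names, here is a
minimal sketch in plain Python. It does not use Biopython or parse a
real SFF file; the helper names (full_read, trimmed_read) and the
slice-style clip coordinates are assumptions made up for this example.

```python
# Hypothetical helpers, not part of Biopython or BioPerl.
# clip_left and clip_right are assumed to be Python slice positions:
# the good sequence is seq[clip_left:clip_right].

def full_read(seq, clip_left, clip_right):
    """Return the whole read in the mixed-case style of format "sff"."""
    left = seq[:clip_left].lower()             # left clipping, lower case
    good = seq[clip_left:clip_right].upper()   # good sequence, upper case
    right = seq[clip_right:].lower()           # right clipping, lower case
    return left + good + right


def trimmed_read(seq, clip_left, clip_right):
    """Return only the good sequence, as format "sff-trim" would."""
    return seq[clip_left:clip_right].upper()


if __name__ == "__main__":
    read = "ACGTACGTAC"
    print(full_read(read, 2, 8))     # mixed case, full length
    print(trimmed_read(read, 2, 8))  # upper case, clipped ends removed
```

The full read keeps every base, so the clipping can still be recovered
from the letter case, while the trimmed read discards the clipped ends.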

Peter


