[Biojava-l] .sff support

Peter biopython at maubp.freeserve.co.uk
Tue Feb 9 00:59:37 UTC 2010


> 2010/2/8 Charles Imbusch <charles at imbusch.net>
>> Hello,
>>
>> I have been wondering whether Biojava is able to
>> handle sff files coming from 454 sequencing runs.
>>
>> I found something here:
>> http://lists.open-bio.org/pipermail/biojava-dev/2009-July/003907.html
>>
>> Does somebody know about the current status on Biojava and sff files?
>>
>>
>> Thanks in advance,
>>  Charles

On Mon, Feb 8, 2010 at 9:24 PM, Paolo Pavan <paolo.pavan at gmail.com> wrote:
>
> Unfortunately, after spending some time on it, I didn't produce anything, sorry.
> There is just one more post, which I sent to Andreas Prlic without including
> the list by mistake, in which I report a little more information from my
> reading on BioPerl's way of managing contigs and assembly information.
> Nothing more.
>
> Paolo

Hi,

I've CC'd the common OpenBio mailing list as this is probably of
interest beyond just BioJava.

Based on code from Jose Blanca (author of sff_extract), I
implemented support for SFF (Roche 454) sequencing read files
in Biopython last year, on a branch that I hope to merge for our
next release. It is currently here:

http://github.com/peterjc/biopython/tree/sff-seqio

In addition to the Roche Manuals (which may not be that easy to
get a copy of), the SFF format is described on this NCBI webpage:

http://www.ncbi.nlm.nih.gov/Traces/trace.cgi?cmd=show&f=formats&m=doc&s=formats#sff

I'm happy to answer questions on how the file format works
(including the undocumented index block which I had to reverse
engineer).
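
To give a rough idea of the layout, here is a minimal Python sketch of
reading the common header at the start of an SFF file, following the
field order given in the NCBI description (all integers big-endian,
header padded to a multiple of 8 bytes). The function name
parse_sff_common_header and the dictionary keys are just illustrative
names of my own, not the API of the Biopython branch, and the sketch
ignores the per-read records and the index block entirely.

```python
import struct

# Fixed-size part of the SFF common header (big-endian, 31 bytes):
# magic number, 4 version bytes, index offset, index length,
# number of reads, header length, key length, flows per read,
# flowgram format code.
_COMMON_HEADER = struct.Struct(">4s4BQIIHHHB")


def parse_sff_common_header(data):
    """Parse the common header from the first bytes of an SFF file.

    Illustrative helper only (hypothetical name and return keys).
    """
    if len(data) < _COMMON_HEADER.size:
        raise ValueError("Too short to hold an SFF common header")
    (magic, v0, v1, v2, v3, index_offset, index_length, number_of_reads,
     header_length, key_length, flows_per_read,
     flowgram_format) = _COMMON_HEADER.unpack_from(data)
    if magic != b".sff":
        raise ValueError("Missing the .sff magic number")
    if (v0, v1, v2, v3) != (0, 0, 0, 1):
        raise ValueError("Unsupported SFF version")
    # The variable-length part follows: flow characters, then the key.
    flow_end = _COMMON_HEADER.size + flows_per_read
    key_end = flow_end + key_length
    if header_length % 8 != 0 or header_length < key_end:
        raise ValueError("Inconsistent header length")
    if len(data) < header_length:
        raise ValueError("Header is truncated")
    return {
        "index_offset": index_offset,
        "index_length": index_length,
        "number_of_reads": number_of_reads,
        "header_length": header_length,
        "flowgram_format": flowgram_format,
        "flow_chars": data[_COMMON_HEADER.size:flow_end].decode("ascii"),
        "key_sequence": data[flow_end:key_end].decode("ascii"),
    }
```

In practice you would read the first few hundred bytes of the file in
binary mode and pass them in. The read records start at the
header_length offset.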

Peter

P.S. Just to clarify (from the old BioJava thread), the SFF file just
holds the raw reads - it is an input file for doing an assembly or
mapping.


