[Biojava-dev] sff files

Charles Imbusch charles at imbusch.net
Mon Nov 8 13:24:19 UTC 2010


Hi all,

For a project, I implemented rudimentary support for SFF files produced
by 454 sequencing machines. I have packed and uploaded the code to:

http://imbusch.net/tmp/sffParser.tar

It can extract the information for a read if the read ID is known.
An iterator over the reads is certainly still needed, and the parser
should take advantage of the mft index structure (thanks to Peter for
the information).
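
As a starting point for such an iterator, here is a minimal sketch of
reading the SFF common header, which holds the number of reads and the
offset of the index block. It is not taken from sffParser.tar: the class
SffHeaderSketch and its buildExample() helper are hypothetical, and the
field layout is my understanding of the published SFF format description
(big-endian fields, header padded to a multiple of 8 bytes), so please
check it against the specification before relying on it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not part of sffParser.tar or BioJava. */
public class SffHeaderSketch {
    static final int SFF_MAGIC = 0x2E736666; // the ASCII bytes ".sff"
    static final int FIXED_BYTES = 31;       // total size of the fixed-width fields

    final long indexOffset;   // file offset of the index block, 0 if there is none
    final long indexLength;
    final long numberOfReads;
    final String flowChars;   // nucleotide flowed in each cycle
    final String keySequence;

    SffHeaderSketch(long indexOffset, long indexLength, long numberOfReads,
                    String flowChars, String keySequence) {
        this.indexOffset = indexOffset;
        this.indexLength = indexLength;
        this.numberOfReads = numberOfReads;
        this.flowChars = flowChars;
        this.keySequence = keySequence;
    }

    /** Reads the common header; assumes the layout described in the lead-in. */
    static SffHeaderSketch read(DataInputStream in) throws IOException {
        if (in.readInt() != SFF_MAGIC) {
            throw new IOException("missing .sff magic number");
        }
        in.readInt();                                    // version, ignored here
        long indexOffset = in.readLong();
        long indexLength = in.readInt() & 0xFFFFFFFFL;   // unsigned 32-bit
        long numberOfReads = in.readInt() & 0xFFFFFFFFL; // unsigned 32-bit
        int headerLength = in.readUnsignedShort();
        int keyLength = in.readUnsignedShort();
        int flowsPerRead = in.readUnsignedShort();
        in.readUnsignedByte();                           // flowgram format code, ignored here
        int paddingLength = headerLength - FIXED_BYTES - flowsPerRead - keyLength;
        if (paddingLength < 0) {
            throw new IOException("header length is smaller than its contents");
        }
        byte[] flowChars = new byte[flowsPerRead];
        in.readFully(flowChars);
        byte[] key = new byte[keyLength];
        in.readFully(key);
        in.readFully(new byte[paddingLength]);           // skip the padding bytes
        return new SffHeaderSketch(indexOffset, indexLength, numberOfReads,
                new String(flowChars, StandardCharsets.US_ASCII),
                new String(key, StandardCharsets.US_ASCII));
    }

    /** Builds a tiny synthetic header in memory so the sketch runs without a real file. */
    static byte[] buildExample() throws IOException {
        String flows = "TACG";
        String key = "TCAG";
        int unpadded = FIXED_BYTES + flows.length() + key.length();
        int headerLength = (unpadded + 7) / 8 * 8;       // round up to a multiple of 8
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(SFF_MAGIC);
        out.writeInt(1);                                 // version
        out.writeLong(0L);                               // index offset: no index
        out.writeInt(0);                                 // index length
        out.writeInt(2);                                 // number of reads
        out.writeShort(headerLength);
        out.writeShort(key.length());
        out.writeShort(flows.length());
        out.writeByte(1);                                // flowgram format code
        out.write(flows.getBytes(StandardCharsets.US_ASCII));
        out.write(key.getBytes(StandardCharsets.US_ASCII));
        out.write(new byte[headerLength - unpadded]);    // zero padding
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buildExample()));
        SffHeaderSketch header = read(in);
        System.out.println("number of reads: " + header.numberOfReads);
    }
}
```

With the header parsed, an iterator could read the records one after
another up to numberOfReads, and a lookup by read ID could jump to
indexOffset instead of scanning the whole file.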

Example code to extract a sequence:

String sfffile = "/home/charlie/sff/Harmigera/EU97XD416.sff";
sffParser sffparser = new sffParser(sfffile);
System.out.println("number of reads: " + sffparser.get_number_of_reads());
Read read = sffparser.get_Read("EU97XD416JXTCU");
System.out.println("sequence for read EU97XD416JXTCU");
System.out.println(read.get_bases());

I would like to extend the code and integrate it into BioJava, but I'm
a bit unsure how to proceed. The Read class in particular was a quick
solution for me. Is there already something in BioJava to manage reads
and their quality scores?
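
To make the question concrete, this is roughly the shape I have in mind
for a read with quality scores. It is only a sketch under my own
assumptions: ReadSketch is a hypothetical class, not the Read class from
the tarball and not an existing BioJava type, and it assumes Phred
quality scores in the range 0 to 93 so that they can be written with the
usual FASTQ offset of 33.

```java
/** Hypothetical value class; not the Read class from sffParser.tar or a BioJava type. */
public final class ReadSketch {
    private final String id;
    private final String bases;
    private final int[] qualities; // one Phred score per base

    public ReadSketch(String id, String bases, int[] qualities) {
        if (bases.length() != qualities.length) {
            throw new IllegalArgumentException("need exactly one quality score per base");
        }
        for (int score : qualities) {
            // 0..93 keeps score + 33 inside printable ASCII ('!' to '~')
            if (score < 0 || score > 93) {
                throw new IllegalArgumentException("quality score out of range: " + score);
            }
        }
        this.id = id;
        this.bases = bases;
        this.qualities = qualities.clone(); // defensive copy keeps the object immutable
    }

    public String getId() {
        return id;
    }

    public String getBases() {
        return bases;
    }

    public int[] getQualities() {
        return qualities.clone();
    }

    /** Formats the read as a four-line FASTQ record with Phred+33 qualities. */
    public String toFastq() {
        StringBuilder encoded = new StringBuilder(qualities.length);
        for (int score : qualities) {
            encoded.append((char) (score + 33));
        }
        return "@" + id + "\n" + bases + "\n+\n" + encoded + "\n";
    }

    public static void main(String[] args) {
        ReadSketch read = new ReadSketch("example_read", "ACGT", new int[] {0, 10, 20, 40});
        System.out.print(read.toFastq());
    }
}
```

Keeping the bases and the scores in one immutable object means the two
can never get out of step, which was the main weakness of my quick
solution.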

Any feedback is welcome!

Cheers,
  Charles




