[Biojava-dev] reading a subsequence from a .nib file
mark.schreiber at novartis.com
Tue Apr 3 01:03:20 UTC 2007
Hi -
To my knowledge, nothing like this exists in BioJava. Could someone take
it the last mile and make it produce SymbolLists?
- Mark
Mark Schreiber
Research Investigator (Bioinformatics)
Novartis Institute for Tropical Diseases (NITD)
10 Biopolis Road
#05-01 Chromos
Singapore 138670
www.nitd.novartis.com
phone +65 6722 2973
fax +65 6722 2910
Josh Burdick <jburdick at keyfitz.org>
Sent by: biojava-dev-bounces at lists.open-bio.org
01/23/2007 12:29 AM
To: biojava-dev at lists.open-bio.org
cc: (bcc: Mark Schreiber/GP/Novartis)
Subject: [Biojava-dev] reading a subsequence from a .nib file
I wrote some code to read a chunk of DNA sequence from a file in Jim
Kent's blat ".nib" file format. This is a simple format using four
bits/base.
I didn't attach the code, to avoid spamming the whole list; but it,
and a (very crude!) JUnit test, are at
http://www.keyfitz.org/jburdick/read_nib_file_java/NibFile.java
http://www.keyfitz.org/jburdick/read_nib_file_java/NibFileTest.java
You could use 2 bits/base, but then you can't have ambiguous bases. 4
bits/base seems like a reasonable compromise; plus sites that have
"blat" installed will need to have the .nib files on a server somewhere
anyway, and this way repeat-masking can be included, which may be
convenient.
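
For readers who have not seen the layout, here is a minimal sketch of how a range of bases could be pulled out of a 4-bits/base file. It is not the code in NibFile.java. The class and method names (NibSketch, decodeNibble, decode, readRange) are made up for illustration, and the header layout, signature value and nibble codes are assumptions based on UCSC's published description of the format (8-byte header, first base in the high nibble, top bit of a nibble marking a masked base), so check them against that description before relying on them.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class NibSketch {
    // Assumed from UCSC's format notes: 32-bit signature, then 32-bit base count.
    static final int SIGNATURE = 0x6BE93D3A;
    static final int HEADER_BYTES = 8;

    // Assumed codes for the low three bits: 0=T, 1=C, 2=A, 3=G, 4=N.
    // Codes 5-7 are not expected; this sketch reports them as N.
    private static final char[] BASES = {'T', 'C', 'A', 'G', 'N', 'N', 'N', 'N'};

    /** Decodes one 4-bit value; a set top bit (masked base) yields lower case. */
    static char decodeNibble(int nibble) {
        char base = BASES[nibble & 0x7];
        return (nibble & 0x8) != 0 ? Character.toLowerCase(base) : base;
    }

    /**
     * Decodes `length` bases from `packed`. `firstNibble` is 0 when the first
     * wanted base sits in the high nibble of packed[0], and 1 when it sits
     * in the low nibble.
     */
    static String decode(byte[] packed, int firstNibble, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int pos = firstNibble + i;
            int b = packed[pos / 2] & 0xFF;
            int nibble = (pos % 2 == 0) ? (b >>> 4) : (b & 0x0F);
            sb.append(decodeNibble(nibble));
        }
        return sb.toString();
    }

    /** Reads `length` bases starting at zero-based position `start`. */
    static String readRange(String path, int start, int length) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(path, "r")) {
            // readInt() is big-endian; a byte-swapped signature means the
            // file was written on a little-endian machine.
            int sig = in.readInt();
            boolean swapped;
            if (sig == SIGNATURE) {
                swapped = false;
            } else if (Integer.reverseBytes(sig) == SIGNATURE) {
                swapped = true;
            } else {
                throw new IOException("not a .nib file: " + path);
            }
            int count = in.readInt();
            if (swapped) {
                count = Integer.reverseBytes(count);
            }
            if (start < 0 || length < 0 || (long) start + length > count) {
                throw new IndexOutOfBoundsException(
                        "range " + start + "+" + length + " outside 0.." + count);
            }
            if (length == 0) {
                return "";
            }
            // Two bases per byte, so seek straight to the byte holding `start`.
            in.seek(HEADER_BYTES + start / 2);
            byte[] packed = new byte[(start % 2 + length + 1) / 2];
            in.readFully(packed);
            return decode(packed, start % 2, length);
        }
    }
}
```

A call such as NibSketch.readRange("chr1.nib", 1000, 50) (the file name is hypothetical) would return 50 bases as a plain String, with masked bases in lower case. Because every base occupies exactly four bits, the reader can seek directly to the requested offset instead of scanning the file.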
Also, it doesn't support writing a .nib file; again, presumably people
will be using Jim Kent's faToNib program to do that.
It would need some tweaking to be included in BioJava, because it
returns a plain String of ACGT, instead of a PackedSequence object.
(Probably this would just involve rewriting the setupBuffer() and
addToBuffer() methods in the code.) Also, the coordinate information
could come from a Range object.
If similar code is already somewhere in BioJava, please ignore this;
but I couldn't find it with thirty seconds of Googling, so I figured it
hadn't been written...
Josh Burdick
programmer, Vivian Cheung's lab, Children's Hospital of Philadelphia
jburdick at keyfitz.org
_______________________________________________
biojava-dev mailing list
biojava-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-dev