[Biojava-l] BufferedReader and FastaFormat
Thomas Down
td2@sanger.ac.uk
Mon, 4 Mar 2002 23:54:58 +0000
On Tue, Mar 05, 2002 at 10:07:11AM +1300, Schreiber, Mark wrote:
>
> java.io.IOException: Mark invalid
> at java.io.BufferedReader.reset(BufferedReader.java:467)
> at org.biojava.bio.seq.io.FastaFormat.readSequenceData(FastaFormat.java:164)
> at org.biojava.bio.seq.io.FastaFormat.readSequence(FastaFormat.java:121)
> at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)
> rethrown as org.biojava.bio.BioException: Could not read sequence
> at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:103)
I've heard of a few problems with the mark/reset system
on BufferedReader. They can usually be fixed by setting
the read-ahead limit passed to mark() a bit higher than the
number of characters you're actually planning to read.
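For illustration, here is a minimal sketch of how a mark can become invalid. It is not the FastaFormat code: the class MarkResetDemo, the helper resetSucceeds, the 10-character input, and the deliberately tiny 4-character buffer are all assumptions chosen to make the effect easy to trigger. The BufferedReader documentation only says that a reset after reading past the limit "may fail", so the exact point of failure depends on the buffer size and the JDK.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class MarkResetDemo {
    // Hypothetical helper: marks the start of a 10-character stream, reads
    // charsToRead characters one at a time, then tries to reset to the mark.
    // Returns true if reset() worked and the next character is 'A' again.
    static boolean resetSucceeds(int readAheadLimit, int charsToRead) {
        // A 4-character buffer (an assumption for the demo) forces frequent
        // refills, which is when the reader checks the read-ahead limit.
        try (BufferedReader r = new BufferedReader(new StringReader("ABCDEFGHIJ"), 4)) {
            r.mark(readAheadLimit);
            for (int i = 0; i < charsToRead; i++) {
                r.read();
            }
            try {
                r.reset();
            } catch (IOException e) {
                return false; // the mark was invalidated ("Mark invalid")
            }
            return r.read() == 'A';
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Limit smaller than the number of characters read.
        System.out.println("limit 2, read 8:  " + resetSucceeds(2, 8));
        // Limit comfortably larger than the number of characters read.
        System.out.println("limit 16, read 8: " + resetSucceeds(16, 8));
    }
}
```

With the limit well above the number of characters consumed, the reset goes through, which is the same idea as padding the mark with a fudge factor.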
I wonder if your file contains any DOS-type line endings
(\r\n). There was one case where someone was having trouble
with EMBL parsing, and fixed it by stripping these out.
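If stray carriage returns are the suspect, one way to test the theory is to strip them before the data reaches the parser. The following is a minimal sketch under the assumption that the input fits in memory; CrStripper and stripCarriageReturns are hypothetical names, not part of BioJava.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class CrStripper {
    // Hypothetical helper: copies every character from the reader except
    // '\r', so "\r\n" becomes "\n". Note that a lone '\r' is dropped too.
    static String stripCarriageReturns(Reader in) throws IOException {
        StringBuilder out = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c != '\r') {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    // Convenience overload for in-memory text.
    static String stripCarriageReturns(String text) {
        try {
            return stripCarriageReturns(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String dosStyle = ">seq1\r\nACGT\r\n";
        String cleaned = stripCarriageReturns(dosStyle);
        // Report whether any carriage return survived the cleanup.
        System.out.println("contains CR: " + (cleaned.indexOf('\r') >= 0));
    }
}
```

The cleaned text can then be wrapped in a StringReader and a BufferedReader and handed to the parser as usual.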
Alternatively, you could just try changing:
r.mark(cache.length);
To:
r.mark(cache.length + 50); // fudge factor.
It's not a particularly satisfying solution until we understand
exactly where the marks are falling out of scope, but I bet
it will get your file parsing...
Thomas.