[Biojava-l] Reading a fasta file which is not encoded in ansi

Richard Holland richard.holland at ebi.ac.uk
Fri Apr 28 10:37:35 UTC 2006

I've no idea what binary format that file is in - it contains some very
strange characters. It appears to contain _some_ ANSI data but with
extra binary bits added to the start and end. I think you need to check
the program that generated the file as it is obviously not doing what it
is supposed to.

Your best bet is to convert the file to ANSI or some other format
understood out-of-the-box by Java.
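
A minimal sketch of such a conversion, assuming the "ÿþ" at the start of the quoted exception is a UTF-16 little-endian byte order mark (the bytes 0xFF 0xFE, which display as "ÿþ" when read as single-byte text). That is an assumption drawn from the error message, not something verified against the file itself. The class name Utf16FastaConverter and the method convert are hypothetical, and the code uses only the standard java.nio API (Java 7 or later):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of BioJava.
class Utf16FastaConverter {

    /**
     * Re-encodes a text file from UTF-16 to US-ASCII.
     * Assumption: the input really is UTF-16. Java's "UTF-16" charset uses a
     * leading byte order mark, if present, to pick the byte order and does
     * not pass the mark on to the reader.
     */
    static void convert(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_16);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.US_ASCII)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // A character with no ASCII equivalent makes the writer fail
                // with an IOException rather than silently substituting it.
                writer.write(line);
                writer.newLine();
            }
        }
    }
}
```

Calling Utf16FastaConverter.convert(Paths.get("try3.fasta"), Paths.get("try3-ascii.fasta")) would write a plain ASCII copy that could then be read in the usual way. If the file turns out not to be UTF-16, the output will be garbage or the conversion will fail, so it is worth checking the first few bytes in a hex viewer first.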


On Fri, 2006-04-28 at 11:09 +0200, Ilhami Visne wrote:
> I have a file in FASTA format which is not encoded in ANSI, but it seems OK.
> It can be downloaded here: http://stud3.tuwien.ac.at/~e0125935/try3.fasta
> I tried to read it with SeqIOTools.readFastaDNA and this exception was
> thrown:
> org.biojava.bio.BioException: Could not read sequence
>     at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:104)
> ..............
> ..............
> Caused by: java.io.IOException: Stream does not appear to contain FASTA formatted data: ÿþ>
>     at org.biojava.bio.seq.io.FastaFormat.readSequence(FastaFormat.java:112)
>     at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:101)
> There is no line containing "ÿþ>" visible in the file, so it seems to be hidden.
> How should I handle such files?
> Thanks in advance.
Richard Holland (BioMart Team)
Wellcome Trust Genome Campus
Cambridge CB10 1SD
Tel: +44-(0)1223-494416
