[Biojava-l] Reading a fasta file which is not encoded in ansi

Richard Holland richard.holland at ebi.ac.uk
Fri Apr 28 10:37:35 UTC 2006


I've no idea what binary format that file is in - it contains some very
strange characters. It appears to contain _some_ ANSI data but with
extra binary bits added to the start and end. I think you need to check
the program that generated the file as it is obviously not doing what it
is supposed to.

Your best bet is to convert the file to ANSI or some other format
understood out-of-the-box by Java.
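
A minimal sketch of such a conversion, assuming the file is really
UTF-16. The "ÿþ" in the exception message is how the bytes 0xFF 0xFE
(the UTF-16 little-endian byte order mark) look when read as Latin-1,
but I have not confirmed that this is what the file contains. The class
and method names (FastaTranscode, utf16ToAscii) are made up for
illustration; only standard JDK classes are used.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of BioJava.
class FastaTranscode {

    // Copies a text file line by line, decoding it as UTF-16 and
    // re-encoding it as plain ASCII.
    // ASSUMPTION: the input is UTF-16. The decoder consumes a leading
    // byte order mark and uses it to pick the byte order.
    static void utf16ToAscii(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_16);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.US_ASCII)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // A character that cannot be encoded as ASCII makes the
                // writer throw instead of silently writing '?'.
                writer.write(line);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FastaTranscode <utf16-input> <ascii-output>");
            return;
        }
        utf16ToAscii(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The converted file can then be passed to SeqIOTools.readFastaDNA as
before. If the conversion throws an exception, the UTF-16 assumption is
probably wrong for this file.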

cheers,
Richard

On Fri, 2006-04-28 at 11:09 +0200, Ilhami Visne wrote:
> I have a file in FASTA format that is not encoded in ANSI, but it otherwise seems OK.
> It can be downloaded here: http://stud3.tuwien.ac.at/~e0125935/try3.fasta
> I tried to read it with SeqIOTools.readFastaDNA and this exception was
> thrown:
> 
> org.biojava.bio.BioException: Could not read sequence
>     at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:104)
> ..............
> ..............
> Caused by: java.io.IOException: Stream does not appear to contain FASTA formatted data: ÿþ>
>     at org.biojava.bio.seq.io.FastaFormat.readSequence(FastaFormat.java:112)
>     at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:101)
> 
> The file has no visible line containing "ÿþ>", so it seems to be hidden.
> 
> How should I handle such files?
> 
> Thanks in advance.
-- 
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416