[Biojava-l] Genpept format - missing features

Christian Storm cs at is-analyse.de
Wed Dec 8 06:41:10 EST 2004


Hi,

Using the following code

Sequence seq = sequences.nextSequence();
SeqIOTools.writeGenpept(System.out,seq);

I tried to write out a sequence in Genpept format. The sequence was
previously parsed from a Genpept (NCBI protein) format file
(http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&val=13568390).
BioJava 1.4pre1 produces some output, but all features are missing. I then
tried BioJava 1.3, with even worse results:

Exception in thread "main" org.biojava.bio.symbol.IllegalSymbolException: Symbol ARG not found in alphabet DNA
    at org.biojava.bio.symbol.AbstractAlphabet.validate(AbstractAlphabet.java:278)
    ...


I had a brief look at the BioJava source code. It seems to me that Genpept
format files are not really expected to stand on their own, i.e. protein
sequences are only handled correctly if they are part of a Genbank
(nucleotide) file? That would explain the IllegalSymbolException: a DNA
sequence is expected where the Genpept format has the amino acid sequence.
Strangely enough, the file is parsed in correctly.
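
As a toy illustration of that suspicion, here is a plain-Java sketch of what
happens when an amino acid symbol is validated against a DNA alphabet. It
does not use BioJava: the class AlphabetMismatchSketch, the validate helper
and the two symbol sets are hypothetical stand-ins, and the real check in
AbstractAlphabet.validate may well work differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model only: these sets are made-up stand-ins, not BioJava alphabets.
class AlphabetMismatchSketch {
    static final Set<String> DNA = new HashSet<>(
            Arrays.asList("adenine", "cytosine", "guanine", "thymine"));
    static final Set<String> PROTEIN = new HashSet<>(
            Arrays.asList("ALA", "ARG", "ASN", "ASP"));

    // Throws if the symbol name is not a member of the given alphabet.
    static void validate(String alphabetName, Set<String> alphabet, String symbol) {
        if (!alphabet.contains(symbol)) {
            throw new IllegalArgumentException(
                    "Symbol " + symbol + " not found in alphabet " + alphabetName);
        }
    }

    public static void main(String[] args) {
        // Validating an amino acid against the protein alphabet passes.
        validate("PROTEIN", PROTEIN, "ARG");

        // A writer that assumes DNA would reject the same symbol.
        try {
            validate("DNA", DNA, "ARG");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the Genpept writer really does check the sequence against the DNA
alphabet, the first residue that is not a nucleotide would fail in this way,
which would match the message in the stack trace above.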

Or is there something I am missing?

Thanks in advance!
Christian 


