[Biojava-l] EMBL parsing problems
saerts
saerts at mailserv.esat.kuleuven.ac.be
Tue Jan 28 17:27:25 EST 2003
Hi,
Parsing an exported sequence of an Ensembl mouse gene (using the Export Data
function at www.ensembl.org) with the current EMBL parser shows three problems.
(I have attached the exported sequence with gene features for Igf1.)
1. Some of the exon locations start with .0:
I think this is a bug in the EMBL formatting at Ensembl?
2. The first annotation of a CDS feature is written on the line after the CDS
key, and the EMBL parser does not find it there.
I think this is also a bug at Ensembl?
3. Some of the lines cannot be parsed; for example, the parser writes to
System.out: "This line could not be parsed: exon 2001..2159"
I don't understand this one: I cannot see any problem with these features.
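As a possible workaround for problem 1, the stray ".0:" prefix could be stripped from the location strings before the file reaches the parser. This is only a minimal sketch in plain Java: it assumes the prefix always appears as ".0:" directly before the start coordinate and that the location refers to the entry itself. The class and method names are hypothetical and not part of BioJava.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of BioJava.
public class EmblLocationCleaner {
    // Assumption: the stray prefix is exactly ".0:" and is directly followed by a digit.
    private static final Pattern STRAY_PREFIX = Pattern.compile("\\.0:(?=\\d)");

    // Returns the location with every ".0:" prefix removed; other text is untouched.
    public static String stripStrayPrefix(String location) {
        return STRAY_PREFIX.matcher(location).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripStrayPrefix(".0:44020..44591"));
        System.out.println(stripStrayPrefix("complement(.0:13156..13348)"));
    }
}
```

Locations without the prefix, such as 2001..2159, pass through unchanged, so the helper could be applied to every feature line.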
Thank you in advance!
Stein.
#########
#output when parsing the attached .embl file
################
From must be less than To: exon .0:44020..44591
From must be less than To: exon .0:44020..44364
From must be less than To: exon .0:44020..44364
This line could not be parsed: exon complement(.0:13156..13348)
From must be less than To: exon .0:46248..46337
This line could not be parsed: exon complement(245..653)
This line could not be parsed: exon 2001..2159
This line could not be parsed: exon 2003..2159
This line could not be parsed: exon 2003..2159
This line could not be parsed: exon 2003..2159
This line could not be parsed: exon 50907..51088
This line could not be parsed: exon 50907..51088
This line could not be parsed: exon 50907..51088
This line could not be parsed: exon 50907..51088
This line could not be parsed: exon 52586..52637
This line could not be parsed: exon 52586..52637
This line could not be parsed: exon 68128..69089
This line could not be parsed: exon 68128..69089
This line could not be parsed: exon 68128..69254
This line could not be parsed: exon 68132..69089
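To check independently whether locations such as 2001..2159 are well-formed, a small stand-alone test can be run on the reported strings. The sketch below is plain Java with hypothetical names, not BioJava's parser. It covers only plain ranges and complement(...) ranges, a small subset of the EMBL feature table location syntax.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone check, not BioJava's location parser.
public class SimpleLocationCheck {
    // Digit runs are capped at 18 characters so Long.parseLong cannot overflow.
    private static final Pattern RANGE =
            Pattern.compile("(\\d{1,18})\\.\\.(\\d{1,18})");
    private static final Pattern COMPLEMENT =
            Pattern.compile("complement\\((\\d{1,18})\\.\\.(\\d{1,18})\\)");

    // True if the string is "start..end" or "complement(start..end)" with start <= end.
    public static boolean isSimpleRange(String location) {
        Matcher m = RANGE.matcher(location);
        if (!m.matches()) {
            m = COMPLEMENT.matcher(location);
            if (!m.matches()) {
                return false;
            }
        }
        long start = Long.parseLong(m.group(1));
        long end = Long.parseLong(m.group(2));
        return start <= end;
    }
}
```

Under these assumptions, 2001..2159 and complement(245..653) are accepted, while the locations carrying the ".0:" prefix are rejected.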