[EMBOSS] Sequence annotation parsing and format conversion
ricepeterm at yahoo.co.uk
Wed Aug 15 07:53:12 UTC 2012
On 14/08/2012 18:59, Daniel Rozenbaum wrote:
> seqret /abss/tmp/W02578.genbank -osformat2 genbank -feature Y -auto -osname W02578.emboss_genbank2genbank -osdirectory /tmp
> In the resultant file parts of the sequence annotation, such as fields AUTHORS, TITLE, COMMENT, and BASE COUNT are omitted, and values of some of the other fields are modified.
You are correct: there are some gaps in the coverage of records in
GenBank format. We will update those for the next release. When
rewriting in GenBank format we will aim to reproduce the full entry,
and when writing in EMBL format we will retain information where
possible.
There are some surprising inconsistencies in the current GenBank to
GenBank conversion (for example the ORGANISM record).
When tested on the EMBL version of this entry, the current EMBOSS 6.5
release reproduces the entry exactly (comparing seqret to entret) apart
from the exact wrapping of feature annotation. We should be able to do
the same for GenBank format.
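
To check which record types a conversion drops, the keyword columns of
the input and output files can be compared. The following is a minimal
Python sketch and is not part of EMBOSS. The function names are
hypothetical, and it assumes the standard GenBank layout: record
keywords sit within the first 12 columns, while feature table and
continuation lines are indented by five or more spaces.

```python
def genbank_keywords(text):
    """Collect the record keywords (LOCUS, AUTHORS, BASE COUNT, ...) of one entry.

    Assumes standard GenBank flat-file layout; only the first entry is read.
    """
    keywords = set()
    for line in text.splitlines():
        if line.startswith("//"):
            break  # end of entry
        indent = len(line) - len(line.lstrip(" "))
        # Skip blank lines, feature table lines and continuation lines.
        if not line.strip() or indent >= 5:
            continue
        keyword = line[:12].strip()  # keyword field occupies columns 1-12
        if keyword and keyword[0].isalpha():
            keywords.add(keyword)
        if keyword == "ORIGIN":
            break  # sequence lines follow; nothing more to collect
    return keywords


def missing_keywords(original, converted):
    """Return the keywords present in the original entry but absent after conversion."""
    return sorted(genbank_keywords(original) - genbank_keywords(converted))
```

Reading the original file and the seqret output into two strings and
passing them to missing_keywords() lists the omitted record types. It
does not detect records whose values were modified.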
Many thanks for the report