[EMBOSS] Sequence annotation parsing and format conversion

Peter Rice ricepeterm at yahoo.co.uk
Wed Aug 15 07:53:12 UTC 2012


Dear Daniel,

On 14/08/2012 18:59, Daniel Rozenbaum wrote:
> seqret /abss/tmp/W02578.genbank -osformat2 genbank -feature Y -auto -osname W02578.emboss_genbank2genbank -osdirectory /tmp
>
> In the resultant file parts of the sequence annotation, such as fields AUTHORS, TITLE, COMMENT, and BASE COUNT are omitted, and values of some of the other fields are modified.

You are correct: there are some gaps in the coverage of records in
GenBank format. We will fill those for the next release, with the aim
of reproducing the full entry when rewriting in GenBank format and,
where possible, retaining the information when writing in EMBL format.

There are also some surprising inconsistencies in the current
GenBank-to-GenBank conversion (for example, in the ORGANISM record).

When tested on the EMBL version of this entry, the current EMBOSS 6.5
release reproduces the entry exactly (comparing seqret output with
entret output), apart from the line wrapping of the feature annotation.
We should be able to do the same for GenBank format.
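
As a rough illustration of how such gaps can be spotted (a sketch only,
not part of EMBOSS), the Python below lists the record keywords that
appear in an original GenBank entry but not in the rewritten one. The
function names genbank_keywords and missing_keywords are made up for
this example. It assumes the usual flat-file layout, in which keywords
and sub-keywords sit within the first 12 columns and are indented by at
most three spaces.

```python
import re

# Assumption: keywords (LOCUS, COMMENT, BASE COUNT, ...) and sub-keywords
# (ORGANISM, AUTHORS, TITLE, ...) occupy the first 12 columns of a line,
# are upper case, and are indented by at most three spaces.
KEYWORD_RE = re.compile(r"^ {0,3}([A-Z][A-Z ]*?) *$")


def genbank_keywords(text):
    """Return the set of record keywords found in a GenBank entry."""
    found = set()
    for line in text.splitlines():
        # Only the keyword columns are inspected; continuation lines,
        # feature table lines and sequence lines do not match.
        match = KEYWORD_RE.match(line[:12])
        if match:
            found.add(match.group(1))
    return found


def missing_keywords(original_text, rewritten_text):
    """Return keywords present in the original but absent after rewriting."""
    return sorted(genbank_keywords(original_text) - genbank_keywords(rewritten_text))
```

Given the text of the original file and of the seqret output, calling
missing_keywords(original, rewritten) should report records such as
AUTHORS, TITLE, COMMENT and BASE COUNT if they were dropped. It only
detects omitted records, not values that were modified.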

Many thanks for the report.

Peter Rice
EMBOSS Team


