[EMBOSS] Unknown output format 'refseqp' and 'genpept'

Peter biopython at maubp.freeserve.co.uk
Tue Dec 8 14:29:25 UTC 2009


On Tue, Dec 8, 2009 at 2:11 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>
>> With hindsight this may have been a mistake, but we use "genbank"
>> format to mean either nucleotides or proteins. On parsing we just
>> look at the units of length in the LOCUS line (bp or aa). We also
>> try to cope with both the current NCBI files and some older variants
>> we have in our unit tests (different offsets in the LOCUS line).
>
> We try that too on input, but for output we have to be explicit so the user
> can pick just one of the choices.

I imagine that as with Biopython, sometimes the user has made it
explicit that they are dealing with nucleotides or proteins (lots of
the EMBOSS tools have switches for this), so you know if you
should be using "aa" or "bp" in the LOCUS line.
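
To make the idea concrete, here is a minimal sketch in plain Python. It
is not Biopython's or EMBOSS's actual code. The function names
(locus_units, units_for) and the "nucleotide"/"protein" labels are made
up for illustration, and the protein LOCUS line in the usage note is an
invented example. On input it looks for a number followed by "bp" or
"aa" instead of relying on fixed columns, which is one way to tolerate
older files with different offsets. On output it picks the units from
what the user said explicitly.

```python
def locus_units(locus_line):
    """Return 'bp' or 'aa' from a GenBank/GenPept LOCUS line, else None."""
    tokens = locus_line.split()
    if not tokens or tokens[0] != "LOCUS":
        raise ValueError("not a LOCUS line: %r" % locus_line)
    # Scan for "<digits> bp" or "<digits> aa" anywhere after the keyword,
    # so the exact column offsets do not matter.
    for previous, token in zip(tokens[1:], tokens[2:]):
        if token in ("bp", "aa") and previous.isdigit():
            return token
    return None


def units_for(molecule_type):
    """Pick LOCUS length units for output from an explicit user choice.

    The labels 'nucleotide' and 'protein' are hypothetical; a real tool
    would map its own switches onto them.
    """
    units = {"nucleotide": "bp", "protein": "aa"}
    try:
        return units[molecule_type]
    except KeyError:
        raise ValueError("molecule type must be explicit, got %r" % molecule_type)
```

For example, locus_units("LOCUS       SCU49845     5028 bp    DNA             PLN       21-JUN-1999")
gives "bp", and units_for("protein") gives "aa". If the user has not
said which they have, units_for raises an error instead of guessing.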

Peter


