[EMBOSS] Support for multi-line annotation in ig format

Rozenbaum, Daniel (Biocceleration Inc) daniel.rozenbaum at USPTO.GOV
Fri Sep 14 12:56:14 UTC 2012


Hello Peter and everyone,

I was wondering if I could revive the discussion about support for IG format. I'm helping deploy EMBOSS at the US Patent and Trademark Office, where this format, in its multi-line sequence annotation form, is used extensively.

Here's an example of an additional issue I've run into when trying to work with IG format in EMBOSS:

% makeprotseq -amount 10 -length 10 -nouseinsert -osformat ig -auto -osname ig1

% cat ig1.ig
;, 10 bases
EMBOSS_001
hcsptpstas1
;, 10 bases
EMBOSS_002
rdgwcvmtrm1
;, 10 bases
EMBOSS_003
fgtifgdgid1
<snip>

% entret  -sequence ig1.ig:EMBOSS_001 -nofirstonly -auto -stdout
;, 10 bases
EMBOSS_001
hcsptpstas1
;, 10 bases

In the entret result above, the first annotation line of the subsequent record (";, 10 bases", belonging to EMBOSS_002) is returned as part of the requested record.
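
To make the expected record boundary concrete, here is a minimal Python sketch (not EMBOSS code) of how an IG file could be split into records. It assumes, based on the examples in this thread, that annotation lines start with ";", that the next line is the entry name, and that the sequence ends with a terminator character ("1" in the files shown; "2" is accepted as well on the assumption that it marks a circular sequence). The function name parse_ig and the dictionary keys are hypothetical.

```python
def parse_ig(text):
    """Split IG-format text into records.

    Assumed layout (taken from the examples in this thread):
      one or more ';' annotation lines, one name line,
      then sequence lines, the last of which ends in '1' or '2'.
    """
    records = []
    comments, name, seq = [], None, []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue  # ignore blank lines between records
        if name is None:
            if line.startswith(";"):
                # Keep each annotation line separately, without the ';'
                comments.append(line[1:])
            else:
                name = line
        else:
            seq.append(line)
            if line[-1] in "12":
                # The terminator closes the record; annotation lines
                # that follow belong to the NEXT record.
                joined = "".join(seq)
                records.append({
                    "comments": comments,
                    "name": name,
                    "sequence": joined[:-1],
                    "terminator": joined[-1],
                })
                comments, name, seq = [], None, []
    return records


def get_entry(text, wanted):
    """Return the record whose name matches, or None."""
    for record in parse_ig(text):
        if record["name"] == wanted:
            return record
    return None
```

With this boundary rule, looking up EMBOSS_001 in ig1.ig would yield a single annotation line for that entry, and the ";, 10 bases" line that follows the terminator would be attached to EMBOSS_002.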

Many thanks,
Daniel
--
Daniel Rozenbaum
Biocceleration, Inc.
OCIO/ Office of Application Engineering & Development/ Patent System Division 
600 Dulany St.
Alexandria VA 22314

-------------------------
On 15/08/2012 17:57, Daniel Rozenbaum wrote:
> Dear list,
>
> (Peter, many thanks for your prompt reply to my previous inquiry!)
>
> We need to deal with extensive databases in Intelligenetics format that have multiple annotation lines in each record. It appears, however, that EMBOSS concatenates all annotation lines into a single line when building its internal representation of the sequence description:
>
> % cat /tmp/IGSEQ.ig
> ; Annotation line 1
> ; Annotation line 2
> ; Annotation line 3
> IGSEQ
> ACGCATCGCATCAGACTACGC1
>
>
> % seqret /tmp/IGSEQ.ig -osformat2 ig -auto -osname IGSEQ.emboss_ig2ig -osdirectory /tmp
>
>
> % cat /tmp/IGSEQ.emboss_ig2ig.ig
> ;Annotation line 1 Annotation line 2 Annotation line 3, 21 bases
> IGSEQ
> ACGCATCGCATCAGACTACGC1
>
> Are there any plans to support multi-line annotation in this format?
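
For reference, the output being asked for can be sketched in a few lines of Python (illustrative only, not EMBOSS code). The function name format_ig and its parameters are hypothetical; the sketch assumes annotation lines are held as a list and written back one per line, and that the sequence fits on a single line followed by its terminator character.

```python
def format_ig(name, sequence, comments, terminator="1"):
    """Write one IG record, emitting each annotation on its own ';' line.

    `comments` holds the text after the ';' of each annotation line,
    so writing it back unchanged preserves the multi-line layout.
    """
    lines = [";" + comment for comment in comments]
    lines.append(name)
    lines.append(sequence + terminator)
    return "\n".join(lines) + "\n"


# Usage with the IGSEQ example from this message
record_text = format_ig(
    "IGSEQ",
    "ACGCATCGCATCAGACTACGC",
    [" Annotation line 1", " Annotation line 2", " Annotation line 3"],
)
```

Here the three annotation lines are written back as three separate lines rather than being joined into one description.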

Interesting thought. We will take a look. It will need some care to 
maintain compatibility with other formats that have single (FASTA) or 
multiple (swissprot) descriptions.

Which package is using this IG format?

regards,

Peter Rice
EMBOSS Team





