[EMBOSS] Support for multi-line annotation in ig format
Rozenbaum, Daniel (Biocceleration Inc)
daniel.rozenbaum at USPTO.GOV
Wed Sep 19 13:49:14 UTC 2012
Dear Peter,
This is most wonderful news that's going to make a bunch of users really happy!
I am attaching a short anonymized sample file (would a larger data set be helpful?) that illustrates the type of IG format in use at USPTO. I believe that the only reasonably indexable field is the sequence name ("US-123456789-1", "US-123456789-2", etc.). While the annotation fields appear structured, that part of the information is not reliable.
As for the name, how about something like "iguspto"?
Lastly, do you think the patch with this change would be made available for EMBOSS 6.4?
With gratitude,
Daniel
--
Daniel Rozenbaum
Biocceleration, Inc.
OCIO/ Office of Application Engineering & Development/ Patent System Division
600 Dulany St.
Alexandria, VA 22314
-----Original Message-----
From: Peter Rice [mailto:ricepeterm at yahoo.co.uk]
Sent: Wednesday, September 19, 2012 6:48 AM
To: Rozenbaum, Daniel (Biocceleration Inc)
Cc: emboss at lists.open-bio.org
Subject: Re: [EMBOSS] Support for multi-line annotation in ig format
Dear Daniel,
On 18/09/2012 03:00, Rozenbaum, Daniel (Biocceleration Inc) wrote:
> Greetings again,
>
> If I may, another question on the issue of IG format: how difficult would it be to support database indexing for this format?
Very easy, a 1-day job including testing and documentation.
Could you please make some example data available, indicate which fields could be indexed (including any information in formatted descriptions or in naming conventions), and suggest a format name (e.g. USPTO or Biocceleration)?
regards,
Peter Rice
EMBOSS Team
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ig_uspto_sample.txt
URL: <http://lists.open-bio.org/pipermail/emboss/attachments/20120919/c9b007b9/attachment-0002.txt>