[Bioperl-l] genbank/embl format ebnf or other formal description
Fields, Christopher J
cjfields at illinois.edu
Tue Sep 11 14:39:19 UTC 2012
Christopher,
I think Dan's question is orthogonal to actually parsing a file; it relates more to proper formatting for a particular format based on a specification, as well as potential downstream validation. Bio::SeqIO::genbank is geared for flexibility and can handle a lot of mis-formatted data; it can massage some data into the proper format if needed. One must recognize that the primary driver for the parsers is to get data into objects, not to act as a format converter (that just happens to be a nice, useful side effect).
The problem is that, as with many formats, no formal specification for GenBank format exists outside of the NCBI example file (old and incomplete) and the FT definition, as far as I know, so calling something 'official' GenBank format isn't possible outside of NCBI.
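
To make the validation point concrete, here is a minimal sketch (in Python, not BioPerl) of the kind of structural check one ends up writing by hand when no formal grammar is available. Everything in it is an assumption drawn from commonly seen records rather than from any official specification: the function name check_record, the list of required keywords, and the sample record are all hypothetical.

```python
import re

# Top-level keywords appear to start in column 1 and be upper case in
# typical records; this pattern is an assumption, not an official rule.
KEYWORD = re.compile(r"^([A-Z]+)\b")


def check_record(text):
    """Return a list of structural problems found in one GenBank-style record."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ["empty record"]

    problems = []
    if not lines[0].startswith("LOCUS"):
        problems.append("first line is not a LOCUS line")
    if lines[-1].strip() != "//":
        problems.append("record does not end with //")

    # Collect the keywords that start in column 1; indented continuation,
    # feature, and sequence lines do not match and are skipped.
    keywords = [m.group(1) for m in map(KEYWORD.match, lines) if m]

    # Hypothetical "required" set; real records may legitimately differ.
    for required in ("DEFINITION", "ACCESSION", "FEATURES", "ORIGIN"):
        if required not in keywords:
            problems.append("missing " + required)
    return problems


# A made-up record used only to exercise the checks above.
sample = """LOCUS       TEST0001                  12 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Hypothetical test record.
ACCESSION   TEST0001
FEATURES             Location/Qualifiers
     source          1..12
ORIGIN
        1 acgtacgtac gt
//
"""

print(check_record(sample))
print(check_record(sample.replace("ORIGIN\n", "")))
```

A check like this only tests conformance to the author's own reading of the examples, which is exactly the limitation described above: without a published grammar it cannot certify a file as 'official'.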
chris (f)
On Sep 11, 2012, at 9:10 AM, Christopher Bottoms <molecules at cpan.org> wrote:
> Dan,
>
> Why not use BioPerl's Bio::SeqIO, which can parse GenBank files?
>
> --Christopher Bottoms
>
> On Fri, Sep 7, 2012 at 10:43 PM, Dan Kortschak
> <dan.kortschak at adelaide.edu.au> wrote:
>> Thanks Chris. That's remarkable, so many words and not an actual formal
>> specification. I guess I have some work ahead of me. I found the
>> example, but examples rarely contain all edges and corners.
>>
>> Dan
>>
>> On Sat, 2012-09-08 at 03:39 +0000, Fields, Christopher J wrote:
>>> Re: Genbank, the only specification I know of is for the feature
>>> table portion of the format as you have below. They do have a
>>> (possibly out of date) example file, note it isn't easily found unless
>>> you search for it:
>>>
>>> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord
>>>
>>> EMBL is better in this regard:
>>>
>>> http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html
>>>
>>> Note that UniProt Knowledgebase also has a user manual outlining the
>>> similarities and differences with EMBL:
>>>
>>> http://web.expasy.org/docs/userman.html
>>>
>>> chris
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l