Bioperl: Re: feature parsing for GenBank/EMBL
Hilmar Lapp
hlapp@gmx.net
Mon, 08 May 2000 12:32:23 +0200
The feature table format is documented on the NCBI website
(http://www.ncbi.nlm.nih.gov/collab/FT/index.html). Locations in particular are
documented at http://www.ncbi.nlm.nih.gov/collab/FT/index.html#location
A couple of these location forms are presently not covered by the feature
table parsing methods (i.e., they fail to parse, but are ignored after a
warning), and some cannot really be expressed through the SeqFeatureI
interface at all, for example (quoted from the URL):
(23.45)..600           Specifies that the starting point is one of the
                       bases between bases 23 and 45, inclusive, and the
                       end point is base 600
(122.133)..(204.221)   The feature starts at a base between 122 and 133,
                       inclusive, and ends at a base between 204 and 221,
                       inclusive
145^177                Points to a site between two adjacent bases anywhere
                       between bases 145 and 177
order(location,location, ... location)
                       The elements can be found in the specified order (5'
                       to 3' direction), but nothing is implied about the
                       reasonableness about joining them
J00194:(100..202)      Points to bases 100 to 202, inclusive, in the entry
                       (in this database) with primary accession number
                       'J00194'
Do you see a point in having 'wobble' information for start and end in the
SeqFeatureI interface, or in an implementation module?
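
To make the question a bit more concrete, here is a rough sketch of what carrying
'wobble' information could look like. It is written in Python purely for
illustration and is not Bioperl code; the names FuzzyPosition, FuzzyLocation
and parse_location are hypothetical and do not exist in any library. It only
handles the simple forms quoted above and deliberately rejects order(...).

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FuzzyPosition:
    """A position known only to lie within [min_pos, max_pos] (hypothetical)."""
    min_pos: int
    max_pos: int

    @property
    def is_exact(self) -> bool:
        return self.min_pos == self.max_pos


@dataclass(frozen=True)
class FuzzyLocation:
    """Start/end with wobble, plus flags for the special forms (hypothetical)."""
    start: FuzzyPosition
    end: FuzzyPosition
    between: bool = False            # True for the 'a^b' site-between-bases form
    accession: Optional[str] = None  # set for remote 'ACC:(a..b)' references


# One endpoint: either '(lo.hi)' or a plain number.
_POS = r"(?:\((\d+)\.(\d+)\)|(\d+))"
_RANGE = re.compile(rf"^{_POS}\.\.{_POS}$")
_SITE = re.compile(r"^(\d+)\^(\d+)$")
_REMOTE = re.compile(r"^([A-Za-z0-9_.]+):\((.+)\)$")


def _position(lo, hi, exact):
    """Build a FuzzyPosition from the three capture groups of _POS."""
    if exact is not None:
        n = int(exact)
        return FuzzyPosition(n, n)
    return FuzzyPosition(int(lo), int(hi))


def parse_location(text: str) -> FuzzyLocation:
    """Parse one simple location string; raise ValueError for anything else."""
    text = text.strip()
    accession = None
    remote = _REMOTE.match(text)
    if remote:
        accession, text = remote.group(1), remote.group(2)
    site = _SITE.match(text)
    if site:
        a, b = int(site.group(1)), int(site.group(2))
        return FuzzyLocation(FuzzyPosition(a, a), FuzzyPosition(b, b),
                             between=True, accession=accession)
    rng = _RANGE.match(text)
    if rng is None:
        raise ValueError(f"unsupported location: {text!r}")
    groups = rng.groups()
    return FuzzyLocation(_position(*groups[0:3]), _position(*groups[3:6]),
                         accession=accession)
```

With something like this, an exact position is just the special case where
minimum and maximum coincide, so code that only cares about plain start/end
could keep using the outer bounds and ignore the wobble.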
I think that simply saying we won't let ourselves be governed by GenBank
parsing issues (actually, it's a joint definition for GenBank/EMBL/DDBJ) may
not be the best answer, because the feature annotation rules obviously reflect
the biological knowledge we have at present, and I think that's what we are
trying to model, at least to some extent.
Just a few thoughts off the top of my head.
Cheers,
Hilmar
--
-----------------------------------------------------------------------
Hilmar Lapp email: hlapp@gmx.net
NFI Vienna, IFD/Bioinformatics phone: +43 1 86634 631
A-1235 Vienna fax: +43 1 86634 727
ROI: Bioinformatics (arrays, expression, seqs), Programming, Databases,
Mountain Biking (hard tail, hard fork: feel the trail)
-----------------------------------------------------------------------