[Biopython-dev] [Bug 1762] Bio.GenBank.FeatureParser dislikes valid accessions and locus lines

bugzilla-daemon at portal.open-bio.org
Mon Mar 14 09:27:56 EST 2005


http://bugzilla.open-bio.org/show_bug.cgi?id=1762





------- Additional Comments From jtk at cmp.uea.ac.uk  2005-03-14 09:27 -------
Created an attachment (id=201)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=201&action=view)
patch to fix the two issues mentioned in the bug report

The patch addresses the locus line problem by making the parts that
are missing in seqret-generated files optional in the regexp, and the
accession issue by allowing the "-" character in accession numbers.
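
To illustrate the idea (this is not the patch itself, which is in the
attachment), here is a minimal sketch using Python's plain `re` module.
The names LOCUS_RE and ACCESSION_RE, the field layout, and the allowed
character sets are assumptions made for the example; the real
expressions in Bio/expressions/genbank.py may be built differently.

```python
import re

# Hypothetical LOCUS pattern: only the keyword and the name are required.
# Each later field sits in its own optional non-capturing group, so a line
# lacking length, molecule type, topology, division, or date still matches.
LOCUS_RE = re.compile(
    r"^LOCUS\s+(?P<name>\S+)"
    r"(?:\s+(?P<length>\d+)\s+(?P<unit>bp|aa))?"
    r"(?:\s+(?P<residue_type>(?:[sdm]s-)?(?:DNA|RNA|mRNA|tRNA|rRNA)))?"
    r"(?:\s+(?P<topology>linear|circular))?"
    r"(?:\s+(?P<division>[A-Z]{3}))?"
    r"(?:\s+(?P<date>\d{2}-[A-Z]{3}-\d{4}))?"
    r"\s*$"
)

# Hypothetical accession pattern: letters, digits, underscore, and "-",
# so that range-style values such as "AE000111-AE000510" are accepted.
ACCESSION_RE = re.compile(r"^[A-Za-z0-9_-]+$")

full = LOCUS_RE.match(
    "LOCUS       AE000111   10000 bp    DNA     linear   BCT 01-JAN-2000"
)
short = LOCUS_RE.match("LOCUS       AE000111")

# Fields that are absent from the line come back as None.
print(full.group("length"), full.group("division"))
print(short.group("name"), short.group("length"))
print(bool(ACCESSION_RE.match("AE000111-AE000510")))
```

Missing fields show up as None in the match, which is where the question
of sensible defaults in the parsed objects comes in.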

The patch only modifies the Bio/expressions/genbank.py file. Changes
are marked by comments containing "JTK".

The patch allows me to work with the files in question and does not
introduce new problems or regression test failures.

I haven't checked whether missing division, length, molecule type
(DNA/RNA/protein), or topology (circular/linear) information results in
appropriate defaults in the objects created by parsing. As long as the
corresponding members are not used, there should not be any problem.

Regarding the accession, it seems to me that a proper representation of
things like "AE000111-AE000510" might even necessitate a more differentiated
approach to representing accessions (introducing something like ranges),
but again, as long as the accessions are not used for further computation,
there should not be a problem.
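
One possible shape for such a representation, purely as a sketch: the
AccessionRange class and parse_accession function below are hypothetical
names invented for this example and are not part of Biopython. It
assumes a range is written as two accessions joined by a single "-".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessionRange:
    """A single accession, or a first/last pair such as AE000111-AE000510."""

    first: str
    last: Optional[str] = None

    @property
    def is_range(self) -> bool:
        return self.last is not None


def parse_accession(token: str) -> AccessionRange:
    """Split a token on the first "-" into a (hypothetical) AccessionRange."""
    first, sep, last = token.partition("-")
    if not first or (sep and not last):
        raise ValueError(f"malformed accession: {token!r}")
    return AccessionRange(first, last if sep else None)


single = parse_accession("AE000111")
ranged = parse_accession("AE000111-AE000510")
print(single.is_range, ranged.first, ranged.last)
```

Code that only displays accessions could keep using the raw string; only
code that computes with them would need the structured form.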




