[Biopython-dev] [Bug 2446] Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail.

bugzilla-daemon at portal.open-bio.org
Fri Aug 1 13:41:44 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2446

------- Comment #5 from mdehoon at ims.u-tokyo.ac.jp  2008-08-01 09:41 EST -------
Some information about these comment blocks from the polyphred developers:

---------------
They are intentional, though I'm not sure they are limited to
Polyphred's tags.

The format that I have typically seen is more like this:

CT{
Contig1 repeat phrap 52 53 555456:555432
COMMENT{
First line.
Second line.
C}
}

Specifically, the CT block always seems to end with the regex '^}$' and
the COMMENT block always ends with '^C}$'. I assume the literal 'C' was
added on the assumption that non-COMMENT-aware parsers would always be
looking for the brace at the beginning of the line. It's not exactly a
C-like, flexible-whitespace format.

In Consed (13.95 Beta; don't ask) adding a tag with a comment produces
this format in the ACE file. I don't know whether this has been changed
in later versions.

Admittedly, the latest Consed documentation does not mention this style.

Since (at least some versions of) Consed produce comments in this style
in addition to Polyphred, I recommend that the BioPython parser be
adjusted to accept either one.

----
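
The terminator rules described in the quoted message can be sketched roughly as
follows. This is only an illustration, not the Bio.Sequencing.Ace code: the
function name split_ct_block and its return shape are hypothetical, and it
assumes the block arrives as a list of lines starting with 'CT{'.

```python
import re

# Terminators as described in the quoted message.
CT_END = re.compile(r"^}$")        # closes the CT block
COMMENT_END = re.compile(r"^C}$")  # closes a nested COMMENT block


def split_ct_block(lines):
    """Split one CT{...} block into (body_lines, comment_lines).

    Hypothetical helper: 'lines' must start with the 'CT{' line.
    """
    it = iter(lines)
    if next(it, "").strip() != "CT{":
        raise ValueError("expected a line containing 'CT{'")
    body, comment = [], []
    in_comment = False
    for raw in it:
        line = raw.rstrip("\r\n")
        if in_comment:
            # Inside COMMENT{ only 'C}' ends the comment; a bare '}'
            # is kept as comment text rather than closing the CT block.
            if COMMENT_END.match(line):
                in_comment = False
            else:
                comment.append(line)
        elif line == "COMMENT{":
            in_comment = True
        elif CT_END.match(line):
            return body, comment
        else:
            body.append(line)
    raise ValueError("unterminated CT block")


example = [
    "CT{",
    "Contig1 repeat phrap 52 53 555456:555432",
    "COMMENT{",
    "First line.",
    "Second line.",
    "C}",
    "}",
]
body, comment = split_ct_block(example)
```

A CT block without a COMMENT section simply yields an empty comment list, so
both styles would be accepted by the same loop.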

I guess we should store any COMMENT block in a new 'comment' member of the ct
class.
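
As a rough sketch of what that could look like: the CTTag class and the make_ct
helper below are hypothetical stand-ins, not the actual ct class, and the field
names are my reading of the example header line above (contig name, tag type,
program, padded start, padded end, date).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CTTag:
    """Hypothetical stand-in for the ct class, with a 'comment' member."""
    name: str
    tag_type: str
    program: str
    padded_start: int
    padded_end: int
    date: str
    info: List[str] = field(default_factory=list)
    # Lines found between 'COMMENT{' and 'C}'; empty if there is no comment.
    comment: List[str] = field(default_factory=list)


def make_ct(header, info=(), comment=()):
    """Build a CTTag from a header line plus optional info/comment lines."""
    # Assumes at least six whitespace-separated fields in the header.
    name, tag_type, program, start, end, date = header.split()[:6]
    return CTTag(name, tag_type, program, int(start), int(end), date,
                 list(info), list(comment))


tag = make_ct("Contig1 repeat phrap 52 53 555456:555432",
              comment=["First line.", "Second line."])
```

Defaulting comment to an empty list means code that reads existing ACE files
without COMMENT blocks would not need to change.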




