[BioPython] [Bioperl-l] How to draw a plasmid map from a genbank-formatted file?
Chris Fields
cjfields at uiuc.edu
Mon Jun 25 16:48:30 UTC 2007
Martin,
Keep bioperl-related discussion on the bioperl mailing list. The vast
majority of this isn't biopython-related, but maybe some devs there
can add to this?
On Jun 25, 2007, at 11:05 AM, Martin MOKREJŠ wrote:
...
> Would you please tell me exactly what is wrong with the spacing?
Here's a section of the seq record attached to your previous email:
DEFINITION .
ACCESSION .
VERSION .
SOURCE .
ORGANISM .
Normally there is a fixed column width for any data present in a
field, so it would look more like this:
DEFINITION  PYR4 (DIHYDROOROTASE, PYRIMIDIN 4, dihydroorotase); dihydroorotase
            [Arabidopsis thaliana].
ACCESSION   NP_194024
VERSION     NP_194024.1 GI:15235865
DBSOURCE    REFSEQ: accession NM_118422.3
KEYWORDS    .
SOURCE      Arabidopsis thaliana (thale cress)
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons;
            rosids; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
Here's the relevant bit in the latest release notes:
"The second part of each sequence entry record contains the information
appropriate to its keyword, in positions 13 to 80 for keywords and
positions 11 to 80 for the sequence."
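To make the column rule concrete, here is a rough Python sketch that flags header lines whose data does not start in position 13. It is only an illustration of the rule quoted above, not how any BioPerl or Biopython parser works; the function names are made up for this example, and it only looks at lines before FEATURES/ORIGIN, since the feature table uses a different layout.

```python
DATA_COLUMN = 13  # 1-based position where keyword data starts (release notes)


def is_aligned(line, data_column=DATA_COLUMN):
    """Return True if the data on a header line starts in `data_column`."""
    keyword_area = line[:data_column - 1]   # positions 1..12
    data = line[data_column - 1:]           # positions 13 onward
    if len(keyword_area.split()) > 1:
        return False  # data has crept into the keyword columns
    if data and data[0].isspace():
        return False  # data starts later than position 13
    return True


def misaligned_header_lines(record_text):
    """Return (line_number, line) pairs for misaligned header lines.

    Hypothetical helper: stops at FEATURES, ORIGIN or the record
    terminator, and skips blank lines.
    """
    problems = []
    for number, line in enumerate(record_text.splitlines(), start=1):
        if line.startswith(("FEATURES", "ORIGIN", "//")):
            break
        if not line.strip():
            continue
        if not is_aligned(line):
            problems.append((number, line))
    return problems


if __name__ == "__main__":
    sample = "DEFINITION .\nACCESSION   NP_194024\n"
    for number, line in misaligned_header_lines(sample):
        print(f"line {number}: data not in column {DATA_COLUMN}: {line!r}")
```

Running something like this over a file before handing it to a parser at least tells you which lines deviate from the fixed-width layout.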
The bioperl devs try to make our parsers as flexible as possible, but
others may not, so it's something in ApE that should probably be
fixed. And as mentioned to you several times in the past on the mailing
list and on bugzilla, don't expect sequence records which stray from
the standard (in this case, the release notes) to parse correctly in
all cases. We can try supporting some that stray from that standard,
but only up to a point. If doing so causes additional bugs or headaches,
or degrades performance, it won't be supported.
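If fixing the exporting program isn't an option, one workaround is to re-pad the header lines yourself before parsing. The following Python sketch does that under some loud assumptions: the keyword list is my own (probably incomplete) selection, the leading indent of sub-keywords such as ORGANISM is kept as found, and any other indented header line is treated as a continuation line. It is not taken from bioperl or biopython, and it leaves the feature table and sequence alone.

```python
# Assumed list of header keywords, for illustration only; likely incomplete.
HEADER_KEYWORDS = {
    "LOCUS", "DEFINITION", "ACCESSION", "VERSION", "DBSOURCE", "DBLINK",
    "KEYWORDS", "SOURCE", "ORGANISM", "REFERENCE", "AUTHORS", "CONSRTM",
    "TITLE", "JOURNAL", "PUBMED", "REMARK", "COMMENT",
}


def repad_line(line, data_column=13):
    """Re-align one header line so its data starts in `data_column`."""
    stripped = line.strip()
    indent = len(line) - len(line.lstrip())
    parts = stripped.split(None, 1)
    keyword = parts[0]
    data = parts[1] if len(parts) > 1 else ""
    if keyword in HEADER_KEYWORDS:
        head = " " * indent + keyword
        if len(head) >= data_column - 1:
            return line  # no room for padding; leave untouched
        return (head.ljust(data_column - 1) + data).rstrip()
    if indent > 0:
        # Assumed to be a continuation line: all data, no keyword.
        return " " * (data_column - 1) + stripped
    return line  # unknown keyword in column 1; leave untouched


def repad_header(record_text, data_column=13):
    """Re-pad every header line that precedes FEATURES/ORIGIN."""
    out = []
    in_header = True
    for line in record_text.splitlines():
        if line.startswith(("FEATURES", "ORIGIN", "//")):
            in_header = False
        if in_header and line.strip():
            line = repad_line(line, data_column)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    broken = "DEFINITION .\nACCESSION .\n  ORGANISM .\n"
    print(repad_header(broken))
```

A continuation line that happens to begin with one of the listed keywords would be mis-handled, so treat this as a stopgap rather than a substitute for writing standard-conforming records in the first place.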
> ...
> Well, I just copy&pasted the script from the bioperl webpages, I think
> from a tutorial or FAQ, don't remember anymore.
Well, we can't help you if you can't point out where the code came
from. We would like to know so that it can be corrected.
> ...
> Well, my search for such tools available on Unix to be used in a script,
> non-interactively, completely failed. My last hope except getting improved
> ApE is to use the GenomeDiagram under biopython, but so far my .gb files
> cannot be parsed yet. :(
> Martin
As mentioned previously, you will likely have to code for it yourself
(perl or python) or help debug the relevant biopython code to get it
working. We can't/won't do this for you unless/until it's something
we feel warrants implementation. Judging by the bug list, we also
have neither the time nor the inclination to code for it. Sorry, but
we have other priorities besides doing your work for you.
chris