[Biopython] Creating GenBank files
    Peter Saffrey 
    pzs at dcs.gla.ac.uk
       
    Wed Sep 16 16:14:20 UTC 2009
    
    
  
Peter wrote:
> Yes, you must create a SeqRecord object with suitable SeqFeature objects,
> and then write it out with SeqIO in GenBank format. If all your features have
> trivial locations, this is pretty easy.
> 
Thanks for this. I've managed to get it working, but ran into a few
minor issues.

I already have GenBank files created by CLC Genomics Workbench 3, but I
want to generate them from a script instead. The CLC-generated GenBank
files look like this:

LOCUS       Setd2-tagged           11750 bp    DNA     linear   UNA
FEATURES             Location/Qualifiers
      misc_feature    1..50
                      /label="Subcloning HA Upstream"
...(snip other features)
ORIGIN
         1 TTGGTGTGAG CTCTTTGTGT CTTGCCTAAG TATGTGCATC TGTCTTGTCT
...(snip sequence)

To do this in Biopython, I need to create my feature like this:

from Bio import SeqFeature
sf = SeqFeature.SeqFeature(SeqFeature.FeatureLocation(0, 50),
                           type="misc_feature",
                           qualifiers={"label": ["Subcloning HA Upstream"]})

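For reference, here is a minimal sketch of the whole round trip: building a SeqRecord, attaching the feature, and writing it out with SeqIO. It assumes a recent Biopython (I believe 1.78 or later, where the "molecule_type" annotation is required for GenBank output instead of an alphabet). The record id and name are taken from the LOCUS line above, and only the first 50 bases of the sequence are used, so treat it as an illustration rather than the exact script.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# First 50 bases from the ORIGIN block above; the real record is 11750 bp.
sequence = Seq("TTGGTGTGAGCTCTTTGTGTCTTGCCTAAGTATGTGCATCTGTCTTGTCT")

record = SeqRecord(
    sequence,
    id="Setd2-tagged",
    name="Setd2-tagged",
    description="",
    # Assumption: recent Biopython needs this annotation for GenBank output.
    annotations={"molecule_type": "DNA"},
)

# Python-style 0-based, end-exclusive location; written out as 1..50.
feature = SeqFeature(
    FeatureLocation(0, 50),
    type="misc_feature",
    qualifiers={"label": ["Subcloning HA Upstream"]},
)
record.features.append(feature)

# Write to an in-memory handle; a filename works the same way.
handle = StringIO()
SeqIO.write(record, handle, "genbank")
text = handle.getvalue()
print(text)
```

Note that FeatureLocation(0, 50) corresponds to 1..50 in the file, because GenBank locations are 1-based and inclusive.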
The issues I had were:
- In the docstring for SeqFeature, it says the attribute is "qualifier" 
but it should be "qualifiers".
- My first stab at the qualifiers argument was
qualifiers = { "label" : "mylabel" }
but with that, the output iterates over the string "mylabel" and gives
me one "label" qualifier per character! Maybe the qualifier printer
should check that it has been given a list and not a string?
- I'd like to remove some of the extraneous header lines from the GenBank file:
DEFINITION  .
ACCESSION   <unknown id>
VERSION     <unknown id>
KEYWORDS    .
SOURCE      .
   ORGANISM  .
             .
Is this possible?
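
On the second issue above: the per-character output follows from Python strings being iterable, so code that loops over a qualifier value cannot tell a bare string from a list unless it checks. As a stopgap on the caller's side, something like the sketch below would wrap bare strings before the dictionary is passed to SeqFeature. The helper name normalize_qualifiers is hypothetical, not part of Biopython.

```python
def normalize_qualifiers(qualifiers):
    """Return a copy in which every qualifier value is a list."""
    normalized = {}
    for key, value in qualifiers.items():
        if isinstance(value, str):
            # A bare string would otherwise be iterated character by character.
            normalized[key] = [value]
        else:
            normalized[key] = list(value)
    return normalized


# Looping over a bare string yields its characters, one at a time.
chars = [c for c in "mylabel"]

fixed = normalize_qualifiers({"label": "mylabel", "note": ["a", "b"]})
print(chars)
print(fixed)
```

The same isinstance check is what I had in mind for the qualifier printer itself.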
Sorry for the long message,
Peter
    
    