[Biopython] losing information
Liam Thompson
dejmail at gmail.com
Thu Oct 29 04:53:32 UTC 2009
Hi everyone,
I'm running a simple script to remove GenBank records that I have
identified as undesirable from a GB file. The only
problem is that when the script is run, all the annotation info (CDS
etc.) for the entries is lost; only the sequence and ID are kept.
I was wondering if there is an option I am missing, or if I am using
an incorrect variable type somewhere. I just
can't seem to get all the info written.
from Bio import SeqIO
outhandle = open("HBV_seqs.gb", "w")
inhandle = open("all_hbv_seqs_reannotated.gb", "rU")
newrecords = []
badlist = list(open("deletionrecords.txt", "rU"))
badrecord=[]
for items in badlist:
    badrecord.append(items[:-1])
for record in SeqIO.parse(inhandle, "genbank"):
    if record.name not in badrecord:
        newrecords.append(record)
print "writing records..."
SeqIO.write(newrecords, outhandle, "genbank")
print "writing done"
outhandle.close()
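For reference, the filtering step in the script above could also be sketched as below, in Python 3 syntax. This is only a sketch under assumptions: the helper names (load_bad_names, keep_records, filter_genbank) are hypothetical and not part of Biopython, and it assumes a Biopython release in which SeqIO.parse accepts a filename. It strips whitespace instead of slicing with items[:-1], which would cut the last character off a name if the list file has no trailing newline, and it streams records rather than collecting them in a list. It does not change what the GenBank writer emits, so whether annotation is written may still depend on the installed Biopython version.

```python
from types import SimpleNamespace


def load_bad_names(lines):
    """Return the set of non-empty, whitespace-stripped names from an iterable of lines."""
    return {line.strip() for line in lines if line.strip()}


def keep_records(records, bad_names):
    """Lazily yield the records whose .name is not in bad_names."""
    return (rec for rec in records if rec.name not in bad_names)


def filter_genbank(in_path, out_path, bad_path):
    """Copy GenBank records from in_path to out_path, skipping the listed names.

    Hypothetical wrapper; returns the number of records written.
    """
    # Imported here so the two helpers above can be used without Biopython.
    from Bio import SeqIO

    with open(bad_path) as bad_handle:
        bad_names = load_bad_names(bad_handle)
    with open(out_path, "w") as out_handle:
        records = keep_records(SeqIO.parse(in_path, "genbank"), bad_names)
        # SeqIO.write returns how many records it wrote.
        return SeqIO.write(records, out_handle, "genbank")


# Small demonstration of the filtering logic with stand-in objects
# (made-up names, not real SeqRecord instances).
demo = [SimpleNamespace(name="REC_A"), SimpleNamespace(name="REC_B")]
kept = [rec.name for rec in keep_records(demo, load_bad_names(["REC_B\n", "\n"]))]
```

With the file names from the script, the call would be filter_genbank("all_hbv_seqs_reannotated.gb", "HBV_seqs.gb", "deletionrecords.txt").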
I would appreciate any pointers.
Thanks
Liam