[Biopython] change rec.id problems
Frederico Moraes Ferreira
ferreirafm at usp.br
Mon Jun 24 21:53:17 UTC 2013
Hi list,
I'm trying to change rec.id so that the file name replaces the
beginning of the id string.
The code is as follows:
    for inf in inflist:
        rec = SeqIO.read(open(inf, "rU"), "fasta")
        if inf[:-5] != rec.id.split('|')[0][:-3]:
            print rec.id
            rec.id = '%spep|%s' % (inf[:-5],
                                   '|'.join(rec.id.split('|')[1:]))
            print rec.id
            outf = '.'.join(inf.split('.')[:-1]) + '_new.fasta'
            SeqIO.write(rec, outf, 'fasta')
Judging by the prints below, the program seems to be working fine.
####output########
emm52.pep|166|Type:P
emm52.0.pep|166|Type:P
emm5-21.pep|178|Type:P
emm5.21.pep|178|Type:P
emm52-1.pep|240|Type:P
emm52.1.pep|240|Type:P
emm5-22.pep|219|Type:P
emm5.22.pep|219|Type:P
emm5-23.pep|231|Type:P
emm5.23.pep|231|Type:P
emm5-24.pep|157|Type:P
emm5.24.pep|157|Type:P
emm5-25.pep|110|Type:P
However, in the output file the new and old ids are concatenated:
>emm52.0.pep|166|Type:P emm52.pep|166|Type:P <unknown description>
GTASVAVGLTVVGAGLASQTEVKADQPVDHHRYTEANDAVLQGRTVSARALLHEINKNGQ
LRSENEELKADLQKKEQELKNLNDDVKKLNDEVALERLKNERHVHDEEVELERLKNERHD
HDKKEAERKALEDKLADKQEHLDGALRYINEKEAERKEKEAEQKKL
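
The header above can be reproduced without the input files. This is a minimal sketch, assuming the FASTA writer builds the header from both rec.id and rec.description and drops the id only when the description already starts with it. The helper fasta_title is hypothetical (it is not a Biopython function) and only models that rule. The original header line is inferred from the output above, not confirmed.

```python
def fasta_title(rec_id, description):
    """Hypothetical model of how a FASTA header line could be assembled."""
    # If the description already begins with the id, use it unchanged.
    if description and description.split(None, 1)[0] == rec_id:
        return description
    # Otherwise the id is prepended to whatever the description holds.
    if description:
        return "%s %s" % (rec_id, description)
    # An empty description leaves the id on its own.
    return rec_id


# Assumed state after parsing: the description still carries the old id.
old_id = "emm52.pep|166|Type:P"
description = old_id + " <unknown description>"

# Only the id is changed, as in the loop above.
new_id = "emm52.0.pep|166|Type:P"

concatenated = fasta_title(new_id, description)  # stale description kept
cleared = fasta_title(new_id, "")                # description reset first

print(">" + concatenated)
print(">" + cleared)
```

If that assumption holds, the old id in the file comes from rec.description rather than from rec.id, and resetting rec.description (for example to an empty string) before calling SeqIO.write should leave only the new id in the header.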
Am I doing something wrong?
All the best,
Fred