[Biopython] Concatenate to aligned sequences
Vincent Davis
vincent at vincentdavis.net
Thu Feb 14 17:20:58 UTC 2013
I have two FASTA files from a MUSCLE alignment. Both have the same number of
sequences from the same organism. If I want to concatenate the pairs of
sequences, what is the best way to do this?
Right now I am doing this:
from Bio import SeqIO

def concatenate(fa1, fa2):
    # read each alignment into a dict keyed by sequence id
    fa1open = open(fa1, "r")
    fa2open = open(fa2, "r")
    fa1dict = SeqIO.to_dict(SeqIO.parse(fa1open, "fasta"))
    fa2dict = SeqIO.to_dict(SeqIO.parse(fa2open, "fasta"))
    fa1open.close()
    fa2open.close()
    # check that both files have the same sequence ids
    if set(fa1dict.keys()) != set(fa2dict.keys()):
        print(fa1dict.keys(), fa2dict.keys())
        print('The fasta files do not have the same sequences')
    bothdict = {}
    bothlist = []
    for key in fa2dict.keys():
        # reuse the record from the second file and extend its sequence
        bothdict[key] = fa2dict[key]
        bothdict[key].seq = fa2dict[key].seq + fa1dict[key].seq
        bothlist.append(bothdict[key])
    return bothdict, bothlist
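
The id check in the function above only prints a message and then carries on, so a mismatch would either raise a KeyError inside the loop or silently drop sequences. A minimal sketch of a stricter version of that step follows. It assumes the sequences have already been loaded into plain dicts mapping id to sequence; the name concatenate_by_id and the sample data are hypothetical, and plain strings stand in for Biopython Seq objects, which also support + for concatenation.

```python
def concatenate_by_id(first, second):
    """Join sequences that share an id; `first` comes before `second`.

    Both arguments are hypothetical plain dicts mapping id -> sequence.
    """
    unmatched = set(first) ^ set(second)  # ids present in only one input
    if unmatched:
        raise ValueError("ids not in both inputs: %s" % sorted(unmatched))
    # Keep the order of `first` so output rows line up with that file
    return {key: first[key] + second[key] for key in first}


# Made-up aligned fragments, listed in a different order in each input
gene_a = {"human": "ATG-CA", "mouse": "ATGGCA"}
gene_b = {"mouse": "TT-A", "human": "TTGA"}
combined = concatenate_by_id(gene_a, gene_b)
print(combined["human"])  # ATG-CATTGA
```

Matching on the id rather than on position means the two files do not need to list their sequences in the same order, and failing early avoids writing a partial concatenation.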
Vincent Davis
720-301-3003