[BioPython] alignment processing
    cgw501 at york.ac.uk 
    Tue May 17 15:07:03 EDT 2005
    
    
  
Hi,
I have a file processing task I'm trying to do with Biopython. I have to 
take a bunch of Clustal alignment files that cover one arm of a whole 
chromosome, strip off the lowercase letters at the end of each sequence, 
and produce a single file containing all the stripped sequences in FASTA 
format. This is what I have so far:

import Bio.Clustalw
from Bio.Alphabet import IUPAC
import string
from Bio.Seq import Seq
from Bio.SeqIO import FASTA
from Bio.SeqRecord import SeqRecord
from sys import *
import sys
inputs = sys.argv[1:-2]
output = open(sys.argv[-1], 'w')
for f in inputs:
    align = Bio.Clustalw.parse_file(f, alphabet=IUPAC.ambiguous_dna)
    lines = align.get_all_seqs()
    strippedAlignRecord = []
    for line in lines:
        lineSeq = line.seq
        lineString = lineSeq.tostring()
        strippedSeq = lineString.rstrip('atcg-')
        strippedSeqObj = Seq(strippedSeq, IUPAC.ambiguous_dna)
        strippedRecObj = SeqRecord(strippedSeqObj, id = line.description)
        out = FASTA.FastaWriter(output)
        out.write(strippedRecObj)

When I run this from the command line I don't get any errors, but the 
outfile is not created. I'm a bit flummoxed. Any ideas?
Thanks,
Chris
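
Note: the stripping and FASTA-writing steps described in the post can be sketched in plain Python, without Biopython, to check the logic in isolation. The helper names (strip_trailing_lowercase, to_fasta, split_args) are hypothetical and not part of any library. The sketch assumes the last command-line argument is the output file and every argument before it is an input file. One possible cause worth checking in the posted script: the slice sys.argv[1:-2] leaves out the last input file, so with a single input file the loop body never runs.

```python
def strip_trailing_lowercase(seq, chars="atcg-"):
    """Remove trailing lowercase bases and gap characters from a sequence string.

    Stripping stops at the first character (from the right) that is not in
    `chars`, so a trailing lowercase 'n' would halt it unless added to `chars`.
    """
    return seq.rstrip(chars)


def to_fasta(name, seq, width=60):
    """Format one record as FASTA text, wrapping the sequence at `width` columns."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">%s\n%s\n" % (name, "\n".join(lines))


def split_args(argv):
    """Split argv into (input files, output file).

    Assumption: the last argument is the output file and everything between
    the script name and it is an input file, hence the slice [1:-1].
    """
    return argv[1:-1], argv[-1]
```

For example, split_args(["strip.py", "a.aln", "out.fasta"]) gives (["a.aln"], "out.fasta"), whereas slicing the same list with [1:-2] gives an empty list of inputs.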
    
    