[Biopython] Sequence alignment with multiple proteins
    Cymon Cox 
    cy at cymon.org
       
    Thu May 14 17:29:47 UTC 2009
    
    
  
Hi Michael,
2009/5/14 Fahy, Michael <fahy at chapman.edu>
> This is not strictly a BioPython question but I'm using BioPython for
> the work.
>
> I have a set of 45 proteins and 10 species.  I have a  representative
> orthologous protein from each set for each of the 10 species.  I'm
> trying to build a phylogenetic tree by aligning the data from the 10
> species.  I've tried concatenating the 45 protein sequences for each of
> the 10 species and aligning the concatenated sequences but this has
> produced results that do not make sense.  What do you recommend for such
> a problem?
The way I (and I suspect most others) approach this is to align each protein
individually (i.e. you'll have 45 separate protein alignments) and then
concatenate them into one super-matrix.
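To make the super-matrix idea concrete, here is a minimal sketch in plain
Python (no Biopython needed). It assumes each per-protein alignment is held
as a dict mapping species name to aligned sequence; the function name
concatenate_alignments and the use of "?" for a species missing from one
alignment are my own choices for illustration, not anything Biopython
provides.

```python
def concatenate_alignments(alignments, missing="?"):
    """Join per-protein alignments column-wise into one super-matrix.

    alignments: list of dicts, each mapping species name -> aligned sequence.
    Every sequence inside one dict must have the same length.
    Species absent from an alignment are padded with the `missing` symbol.
    """
    # Collect every species seen in any alignment, in a stable order.
    species = sorted({name for aln in alignments for name in aln})
    parts = {name: [] for name in species}

    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for name in species:
            # Pad with the missing symbol when this species lacks the protein.
            parts[name].append(aln.get(name, missing * width))

    return {name: "".join(chunks) for name, chunks in parts.items()}


# Two toy protein alignments (hypothetical data).
protein_a = {"human": "MKT-L", "mouse": "MKTAL"}
protein_b = {"human": "GG-S", "mouse": "GGAS", "chicken": "GGTS"}

supermatrix = concatenate_alignments([protein_a, protein_b])
for name, seq in supermatrix.items():
    print(name, seq)
```

Because every alignment is done separately, the columns of one protein never
get aligned against the columns of another; the concatenation only glues
finished alignments side by side.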
Currently, Bio.AlignIO does not support column-to-column concatenation of
data. But by happy coincidence, David Winter posted today that he has
included a cookbook example of how to combine alignments using the Bio.Nexus
interface. You can find the example here:
http://biopython.org/wiki/Concatenate_nexus
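As I understand it, the approach there boils down to roughly the following
sketch. The wrapper function combine_nexus_files and the file names in the
usage comment are hypothetical; the sketch assumes each input file is a
Nexus file holding one protein alignment with consistent taxon names.

```python
import os

from Bio.Nexus import Nexus


def combine_nexus_files(paths, out_path):
    """Concatenate several Nexus alignments into one and write it to out_path."""
    # Nexus.combine expects a list of (name, Nexus instance) pairs; the name
    # is used to label each partition in the combined matrix.
    nexi = [
        (os.path.splitext(os.path.basename(path))[0], Nexus.Nexus(path))
        for path in paths
    ]
    combined = Nexus.combine(nexi)
    combined.write_nexus_data(filename=out_path)
    return combined


# Hypothetical usage:
# combine_nexus_files(["protein01.nex", "protein02.nex"], "combined.nex")
```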
If your alignment viewer does not support export in Nexus format, you can use
Bio.AlignIO to convert the alignment to Nexus.
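A conversion along these lines should do it. This is a sketch that assumes
the alignment was saved as aligned FASTA and that you are on Biopython 1.78
or later, where AlignIO.convert takes a molecule_type argument (older
releases used an alphabet argument instead). The function name and the file
names in the usage comment are hypothetical.

```python
from Bio import AlignIO


def fasta_alignment_to_nexus(fasta_path, nexus_path):
    """Convert an aligned FASTA file of proteins to Nexus format."""
    # The Nexus writer needs to know the molecule type, so pass it explicitly.
    return AlignIO.convert(
        fasta_path, "fasta", nexus_path, "nexus", molecule_type="protein"
    )


# Hypothetical usage:
# fasta_alignment_to_nexus("protein01_aligned.fasta", "protein01.nex")
```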
Cheers, Cymon