[Biopython-dev] clustalw question

Brad Chapman chapmanb at arches.uga.edu
Sun Oct 15 03:35:06 EDT 2000


Cayte wrote:
>  clustal_format.py only allows asterisks and spaces in the last line
>  of an alignment.  I just ran an alignment from:
>  
>  http://www2.ebi.ac.uk/clustalw/
>  
>  The equivalent line contained colons and periods, too.

Thanks for trying it out, and thanks for the catch! I'll happily fix it
to accept this output.

>  The regexp is 
>  
>  match_stars = Martel.Group("match_stars",
>			   Martel.Re("[ \*]+") +
>			   Martel.Opt(Martel.Str("\n")))

So, for a quick fix, you can change the second line to:

Martel.Re("[ :\*\.]+") +
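
If you want to sanity-check the widened character class outside of Martel, here is a rough sketch using Python's standard re module. It is only an approximation of what the Martel expression does, and the consensus line in it is made up for illustration, not taken from real clustalw output.

```python
import re

# Character class currently in clustal_format.py: spaces and asterisks only.
OLD_MATCH_STARS = re.compile(r"[ \*]+")
# Widened class from the quick fix: also accepts colons and periods.
NEW_MATCH_STARS = re.compile(r"[ :\*\.]+")

# Hypothetical consensus line (invented for this example).
consensus_line = "  **:*. *:  ***"


def accepts(pattern, line):
    """Return True if the whole line consists only of allowed characters."""
    return pattern.fullmatch(line) is not None


if __name__ == "__main__":
    print("old pattern accepts line:", accepts(OLD_MATCH_STARS, consensus_line))
    print("new pattern accepts line:", accepts(NEW_MATCH_STARS, consensus_line))
```

The old pattern should reject any line containing a colon or period, while the new one accepts it. Lines with only stars and spaces pass under both.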

>  I'll send the output if you like.

Please do, and I'll add it to the test suite and fix the parser. I just
poked around a bit to see what that line actually means: stars are
identical residues, colons are conserved substitutions, and periods are
semi-conserved substitutions. Neat! I never saw these since I have been
using Clustalw to align nucleic acids and not proteins.
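
To make those meanings concrete, here is a small sketch that tallies the symbols on a consensus line. The function name and the treatment of a space as "no marker" are my own assumptions for illustration; none of this is part of the Biopython parser.

```python
from collections import Counter

# Symbol meanings as described above. The entry for a space is an
# assumption: it is treated here as a column with no marker.
SYMBOL_MEANING = {
    "*": "identical",
    ":": "conserved substitution",
    ".": "semi-conserved substitution",
    " ": "no marker",
}


def summarize_consensus(line):
    """Count how many columns fall into each category (hypothetical helper)."""
    counts = Counter(line)
    unknown = set(counts) - set(SYMBOL_MEANING)
    if unknown:
        raise ValueError("unexpected symbols: %r" % sorted(unknown))
    return {meaning: counts.get(symbol, 0)
            for symbol, meaning in SYMBOL_MEANING.items()}


if __name__ == "__main__":
    # Made-up consensus line, not real clustalw output.
    for meaning, count in summarize_consensus("**:. *").items():
        print(meaning, count)
```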

Thanks again for catching this!

Brad







