[BioPython] EMBOSS programs and their alignment formats
Peter (BioPython)
biopython at maubp.freeserve.co.uk
Tue Mar 21 12:30:16 UTC 2006
I've been having a look at BioPython's EMBOSS support, and it looks like
a (partial) set of command line interfaces to the tools, with additional
code for some of the primer tools and their formats.
As far as I can tell, there is no support for any of the EMBOSS
alignment output formats:
http://emboss.sourceforge.net/docs/themes/AlignFormats.html
Some (all?) of the alignment programs will happily produce gapped FASTA
output, and the alignments themselves could be analysed to extract the
alignment length, identity, similarity and gap counts.
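A minimal sketch of that kind of analysis, assuming a pairwise alignment held as two equal-length gapped strings. The function name `alignment_stats` and the set of gap characters are hypothetical choices for illustration, not part of BioPython. Similarity is left out because it depends on the scoring matrix, and counting a column as a gap when either sequence has a gap in it is an assumption about how such counts are defined.

```python
def alignment_stats(seq_a, seq_b, gap_chars="-."):
    """Return length, identity and gap counts for two gapped, aligned strings.

    Hypothetical helper: a column counts as a gap if either sequence has a
    gap character there, and as an identity if both residues match
    (case-insensitively). Similarity is not computed, since that needs
    the substitution matrix.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    identity = 0
    gaps = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            gaps += 1
        elif a == b:
            identity += 1

    return {"length": len(seq_a), "identity": identity, "gaps": gaps}


stats = alignment_stats("ACG-T", "AC-TT")
print("Identity: %(identity)d/%(length)d, Gaps: %(gaps)d/%(length)d" % stats)
```

This recovers the counts, but not the score or the parameters the program was run with.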
However, the FASTA format does not include the algorithm-specific score,
nor other program parameters which might be of interest (like the matrix
and gap penalties).
For example, here is the kind of header EMBOSS writes:
########################################
# Program: demoalign
# Rundate: Thu Jan 17 09:30:08 2002
# Report_file: stdout
########################################
#=======================================
#
# Aligned_sequences: 4
# 1: IXI_234
# 2: IXI_235
# 3: IXI_236
# 4: IXI_237
# Matrix: EBLOSUM62
# Gap_penalty: 9
# Extend_penalty: -1
#
# Length: 131
# Identity: 95/131 (72.5%)
# Similarity: 127/131 (96.9%)
# Gaps: 25/131 (19.1%)
#
#
#=======================================
(followed by the aligned sequences)
Has anyone tackled supporting these files in BioPython?
Thanks
Peter