[emboss-dev] EMBOSS and its FASTA like alignment output

Peter biopython at maubp.freeserve.co.uk
Tue Jul 21 10:52:19 UTC 2009


One of the many things I talked to Peter Rice about in Sweden
was the Pearson FASTA-like output from needle and water (i.e.
what EMBOSS calls the markx10 output format), and why it
includes the EMBOSS header and footer lines (which start with
a # character), which are not present in real FASTA output.

Biopython can parse the pairwise -m 10 output from Bill
Pearson's FASTA tools, so in theory we (Biopython) should
be able to parse the markx10 output from EMBOSS needle
and water. We could probably cope with the extra header
and footer, but I think it would be best if EMBOSS could
produce something more closely matching the real FASTA
output. Unfortunately, it appears to be more than just the
headers which upset our parser - even ignoring them,
EMBOSS markx10 output still looks rather different to
(current) FASTA -m 10 output. Was the markx10 output
mimicking a particular (old) version of the FASTA tools?
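
As a rough illustration of "coping with the extra header and
footer" on the parsing side, here is a minimal C sketch that drops
every line starting with a # character before the text reaches a
parser. The function name strip_hash_lines is hypothetical; it is
not part of EMBOSS or Biopython. Note that this only removes the
header and footer; it does nothing about the other differences
from FASTA -m 10 output mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of EMBOSS or Biopython: copy `in` to
 * `out`, dropping every line whose first character is '#'.
 * `out` must have room for at least strlen(in) + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t strip_hash_lines(const char *in, char *out)
{
    size_t written = 0;

    assert(in != NULL && out != NULL);

    while (*in != '\0') {
        /* Length of the current line, including its newline if present. */
        const char *eol = strchr(in, '\n');
        size_t len = eol ? (size_t)(eol - in) + 1 : strlen(in);

        if (*in != '#') {
            memcpy(out + written, in, len);
            written += len;
        }
        in += len;
    }
    out[written] = '\0';
    return written;
}
```

Only lines with # in the first column are treated as header or
footer lines; a # appearing later in a line is left alone.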


Peter R. did say it would be simple to turn off this header and
footer output, so I thought I would try this myself. It looks like
this is handled in file ajax/ajalign.c by function alignWriteMark,
but I don't see a switch to disable the headers and footers.

From looking at other writers, to disable the header, I think I
just need to replace this line in alignWriteMark:



	/* turn off printing of the header, keep the calculation */
	thys->File = NULL;
	thys->File = outf;
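
The general idea behind that comment can be shown with a toy,
self-contained C sketch. Everything here (ToyAlign,
toy_write_header, toy_write_record) is hypothetical and is not
the EMBOSS API; it also assumes the header routine skips printing
when its file pointer is NULL, which has not been checked against
ajalign.c.

```c
#include <stdio.h>

/* Hypothetical stand-in for an alignment writer; not the EMBOSS type. */
typedef struct {
    FILE *file;         /* output stream, or NULL to suppress printing */
    int   header_done;  /* state the header step computes either way */
} ToyAlign;

/* Hypothetical header writer: always updates state, prints only if
   a file is set. */
static void toy_write_header(ToyAlign *a)
{
    a->header_done = 1;
    if (a->file != NULL)
        fputs("# header\n", a->file);
}

/* Write a record without the header text, while still running the
   header step for its side effects. */
void toy_write_record(ToyAlign *a, const char *body)
{
    FILE *outf = a->file;

    a->file = NULL;        /* turn off printing of the header */
    toy_write_header(a);   /* keep the calculation */
    a->file = outf;        /* restore the real output stream */

    if (a->file != NULL)
        fputs(body, a->file);
}
```

The stream is restored immediately after the header step, so only
the header text is suppressed and the record body is written as
usual.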

I have worked out that the footer is printed by ajAlignWriteTail,
but I am unclear on where alignWriteMark calls it. The only
place that seems to call it is ajAlignClose, and this calls
ajAlignWriteTail unconditionally.


Peter C.
