[EMBOSS] matcher output
Peter Rice
pmr at ebi.ac.uk
Fri Jan 30 14:59:27 UTC 2004
Karin Lagesen wrote:
> I am trying to use matcher to align some sequences. I would however
> like to get the whole sequences outputted. I think that this is what
> the option -aglobal3 is for. I have tried
>
> matcher teste testb -outfile test.matcher -aglobal3 1
>
> Does -aglobal3 do what I think it does, or isn't there an option to
> matcher that does what I want it to do? In that case, is there a
> different program out there which does what I need (i.e. a local
> alignment of two sequences where the rest of the sequences are matched
> together too).
Ouch ... local and global alignments are quite different (in biological
terms, and in what the program has stored after the calculation).
With a global alignment, the ends of the sequences are included in the
output (ends beyond the overlap with the other sequence). EMBOSS should
allow you to turn off these ends with -noaglobal. However ... checking
the code shows that the most recent rewrite of the alignment display
code has lost this capability. It will be restored for 2.9.0, because
in tasks like finding an exon in a genomic sequence one of the sequences
can have a lot of extra bases that would otherwise be displayed.
For local alignments, there is a good reason why you cannot display the
rest ... it has not been aligned.
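
To make that concrete, here is a minimal Smith-Waterman sketch in Python.
It is not the matcher code: the function name local_align, the linear gap
penalty and the scoring values are all assumptions chosen for illustration.
The point is only that the traceback stops where the score drops to zero,
so the program holds coordinates for the aligned region and nothing at all
for the flanking sequence.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Toy Smith-Waterman (linear gap penalty, illustrative scores).

    Returns (score, (a_start, a_end), (b_start, b_end)) as 0-based,
    end-exclusive coordinates of the best local alignment.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; scores are clamped at zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    # Everything outside this path was never aligned.
    end_i, end_j = best_pos
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, (i, end_i), (j, end_j)


a = "TTTTACGTACGTCCCC"
b = "GGACGTACGTAA"
score, (a0, a1), (b0, b1) = local_align(a, b)
aligned_a, aligned_b = a[a0:a1], b[b0:b1]
unaligned_a = (a[:a0], a[a1:])   # flanks: no alignment exists for these
```

With these toy sequences the aligned region is the shared ACGTACGT core;
the TTTT and CCCC flanks of the first sequence have no counterpart in the
result, which is why a local alignment program has nothing to print for them.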
So, what are the alternatives?
matcher will not align the whole sequences, but it does implement the
Waterman-Eggert algorithm for next-best local alignment, so you can run
"matcher -alternatives 10" to find the best 10 local alignments. Very
helpful with multi-domain proteins, or matching mRNA/cDNA/EST to genomic
sequence.
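
The idea behind next-best local alignments can be sketched roughly as
follows. This is a naive illustration in the spirit of Waterman-Eggert, not
what matcher actually does internally: it recomputes the whole matrix each
round (the real algorithm only recomputes the affected part) and bars every
cell on an earlier path. The names best_local and next_best_alignments and
the scoring values are assumptions.

```python
def best_local(a, b, used, match=2, mismatch=-1, gap=-2):
    """Best local alignment that avoids every (i, j) cell in `used`.

    Returns (score, path), where path lists the 1-based matrix cells
    of the alignment from start to end.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, pos = 0, None
    for i in range(1, rows):
        for j in range(1, cols):
            if (i, j) in used:
                continue  # score stays 0: no alignment may pass through here
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, pos = H[i][j], (i, j)
    if pos is None:
        return 0, []

    # Trace back until the score reaches zero.
    path = []
    i, j = pos
    while i > 0 and j > 0 and H[i][j] > 0:
        path.append((i, j))
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return best, path


def next_best_alignments(a, b, k):
    """Up to k non-intersecting local alignments, best first."""
    used, results = set(), []
    for _ in range(k):
        score, path = best_local(a, b, used)
        if score == 0:
            break
        results.append((score, path))
        used.update(path)  # later alignments must avoid these cells
    return results


# A "two-domain" toy sequence against a single copy of the domain.
a = "GATTACATTTTGATTACA"
b = "GATTACA"
hits = next_best_alignments(a, b, k=2)
regions = [(p[0][0] - 1, p[-1][0]) for _, p in hits]  # 0-based spans in a
```

In this toy case the first round finds one copy of the repeated domain and
the second round, barred from reusing those cells, finds the other copy.
That is the behaviour that makes the alternatives useful for multi-domain
proteins.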
supermatcher will do a word-based alignment and try to align the
remaining regions - but a word-based alignment requires perfect identity.
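
As a rough illustration of why word matching needs perfect identity, here
is a small sketch of exact word lookup. It is not the supermatcher
implementation; the function name word_matches and the wordlen parameter
are assumptions for the example.

```python
from collections import defaultdict


def word_matches(a, b, wordlen=4):
    """Return (pos_in_a, pos_in_b) for every word of length `wordlen`
    that occurs identically in both sequences (0-based positions)."""
    # Index every word of sequence a by its start position.
    index = defaultdict(list)
    for i in range(len(a) - wordlen + 1):
        index[a[i:i + wordlen]].append(i)

    # Look up each word of sequence b; only exact matches are found.
    hits = []
    for j in range(len(b) - wordlen + 1):
        for i in index.get(b[j:j + wordlen], ()):
            hits.append((i, j))
    return hits


identical = word_matches("ACGTACGT", "ACGTACGT", wordlen=8)
one_mismatch = word_matches("ACGTACGT", "ACGAACGT", wordlen=8)
short_words = word_matches("ACGTACGA", "ACGTTCGA", wordlen=4)
```

A single mismatch inside a word is enough to lose the hit, so sequences
that are similar but never identical over a full word length give the
word-based step nothing to start from.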
Someone (us, for example :-) could try to write an iterative alignment
that realigns the remaining regions of sequence with further local
alignments but I doubt whether it would be really useful.
Hope this helps,
Peter Rice