[EMBOSS] Match mass sequences for mass sequences
Peter Rice
pmr at ebi.ac.uk
Tue Nov 20 08:58:21 UTC 2007
JEEYOUNG LEE wrote:
> Dear Sir
>
> I'm sorry for a perhaps naive question.
> I want to align 1000 pairs of sequences. For example, file "A"
> contains 1000 sequences and file "B" contains 1000 sequences, and the
> two files will be compared. I'd like to find, for a given sequence
> (gene X) in file A, the sequence (gene X') in file B that has high
> sequence similarity to it. Likewise, gene Y in file A would be matched
> with the gene Y' in file B that has high identity to it. In the end, I
> want to get the 1000 matched pairs and their identity scores. Can I
> match this many sequences at one time using Jemboss? How can I handle
> this problem?
In EMBOSS 5.0.0 the wordfinder program is designed to do this. It uses a
word-based algorithm (n consecutive identical bases) and then aligns
using a limited window size. One warning: the alignment includes the
original word match, so (in low-identity cases) it may not be the
highest-scoring alignment.
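As a rough illustration of the word-based idea (this is not wordfinder's actual code, only a minimal Python sketch), the following finds shared words of n identical bases between each sequence in A and each sequence in B, then scores an ungapped region around each word match. The function names, the `wordlen` and `flank` parameters, and the ungapped scoring are all assumptions made for the example; wordfinder itself aligns within a limited window rather than simply counting identical positions.

```python
from collections import defaultdict


def index_words(seq, wordlen):
    """Map every word of length wordlen in seq to its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - wordlen + 1):
        index[seq[i:i + wordlen]].append(i)
    return index


def diagonal_identity(a, b, pos_a, pos_b, wordlen, flank):
    """Ungapped percent identity around a word match.

    The scored region always contains the word match itself, extended
    by up to `flank` bases on each side (hypothetical parameter).
    """
    left = min(flank, pos_a, pos_b)
    right = min(flank, len(a) - pos_a - wordlen, len(b) - pos_b - wordlen)
    start_a, start_b = pos_a - left, pos_b - left
    length = left + wordlen + right
    matches = sum(a[start_a + k] == b[start_b + k] for k in range(length))
    return 100.0 * matches / length


def best_matches(a_seqs, b_seqs, wordlen=6, flank=20):
    """For each sequence in A, return (name_in_B, identity) or None."""
    b_indexes = {name: index_words(seq, wordlen)
                 for name, seq in b_seqs.items()}
    result = {}
    for a_name, a in a_seqs.items():
        best = None
        for b_name, b in b_seqs.items():
            index = b_indexes[b_name]
            for i in range(len(a) - wordlen + 1):
                # Every position in b where this word of a also occurs.
                for j in index.get(a[i:i + wordlen], ()):
                    ident = diagonal_identity(a, b, i, j, wordlen, flank)
                    if best is None or ident > best[1]:
                        best = (b_name, ident)
        result[a_name] = best
    return result
```

Calling `best_matches(a_seqs, b_seqs)` with two dictionaries of name to sequence gives one best partner per sequence in A. Because each scored region is built around a word match, the sketch shares the caveat above: the reported region need not be the best-scoring one for the pair.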
Wordfinder has additional options to select the matches you want.
Older EMBOSS releases had only supermatcher, which is less sophisticated
in selecting matches.
Hope that helps
Peter Rice