[EMBOSS] display of long ensembl and vega identifiers in alignments
Hans Rudolf Hotz
hrh at sanger.ac.uk
Fri Aug 11 12:54:09 UTC 2006
Hi
Ensembl and Vega identifiers are very long, and are therefore truncated
in the output of alignment programs like matcher, e.g.:
cbi1b[hrh]59: matcher pep1 pep2 stdout
Finds the best local alignments between two sequences
########################################
# Program: matcher
# Rundate: Fri Aug 11 2006 13:45:51
# Align_format: markx0
# Report_file: stdout
########################################
#=======================================
#
# Aligned_sequences: 2
# 1: OTTHUMT00000072262
# 2: ENST00000216277
# Matrix: EBLOSUM62
# Gap_penalty: 14
# Extend_penalty: 4
#
# Length: 745
# Identity: 745/745 (100.0%)
# Similarity: 745/745 (100.0%)
# Gaps: 0/745 ( 0.0%)
# Score: 3818
#
#
#=======================================
               10        20        30        40        50
OTTHUM MPFPVTTQGSQQTQPPQKHYGITSPISLAAPKETDCVLTQKLIETLKPFG
       ::::::::::::::::::::::::::::::::::::::::::::::::::
ENST00 MPFPVTTQGSQQTQPPQKHYGITSPISLAAPKETDCVLTQKLIETLKPFG
               10        20        30        40        50
A few months back, I played around with the source code and changed one
of the library files (ajalign.c). The change adds a new output format,
"pairln", for sequence alignment programs like matcher or needle, which
displays up to 20 characters of the identifier. By comparison, the
default format displays only the first 6 characters, and "pair" displays
the first 13 characters. For example:
cbi1b[hrh]65: matcher pep1 pep2 stdout -aformat pairln
Finds the best local alignments between two sequences
########################################
# Program: matcher
# Rundate: Fri Aug 11 2006 13:49:41
# Align_format: pairln
# Report_file: stdout
########################################
#=======================================
#
# Aligned_sequences: 2
# 1: OTTHUMT00000072262
# 2: ENST00000216277
# Matrix: EBLOSUM62
# Gap_penalty: 14
# Extend_penalty: 4
#
# Length: 745
# Identity: 745/745 (100.0%)
# Similarity: 745/745 (100.0%)
# Gaps: 0/745 ( 0.0%)
# Score: 3818
#
#
#=======================================
OTTHUMT00000072262 1
MPFPVTTQGSQQTQPPQKHYGITSPISLAAPKETDCVLTQKLIETLKPFG 50
||||||||||||||||||||||||||||||||||||||||||||||||||
ENST00000216277 1
MPFPVTTQGSQQTQPPQKHYGITSPISLAAPKETDCVLTQKLIETLKPFG 50
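To make the difference between the three label widths concrete, here is a minimal Python sketch. It is only an illustration, not the code in ajalign.c: the function name `label` and the `WIDTHS` table are hypothetical, and padding short names with spaces is an assumption. Only the widths themselves (6, 13 and 20) are the ones described above.

```python
# Illustration only: mimics how a fixed label width truncates identifiers.
# This is NOT the EMBOSS implementation; names here are hypothetical.

# Label widths as described in this post: default (markx0 for matcher), pair, pairln.
WIDTHS = {"markx0": 6, "pair": 13, "pairln": 20}


def label(identifier: str, width: int) -> str:
    """Cut the identifier to at most `width` characters and pad it to `width`."""
    # The precision (.width) truncates; the field width (<width) left-justifies.
    return f"{identifier:<{width}.{width}}"


if __name__ == "__main__":
    for fmt, width in WIDTHS.items():
        for name in ("OTTHUMT00000072262", "ENST00000216277"):
            print(f"{fmt:<7} |{label(name, width)}|")
```

With a width of 6 both identifiers lose most of their digits, with 13 the Vega identifier is still cut, and only with 20 are both shown in full.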
Any chance something like this could make it into the distributed code?
Thanks, Hans