[emboss-dev] Enquiry about emboss needle program customizing
Peter Rice
pmr at ebi.ac.uk
Tue Jul 19 05:50:00 UTC 2011
On 19/07/2011 03:00, Tae-Kyung Kim wrote:
> I am now working for Korea Bioinformation Center, and particularly
> integrating Korean Bio-resources as National infrastructure.
>
> Now, I am urgently customizing global alignment program needle
I think we can help. I believe EMBOSS can already do what you need.
> First, I would like to get standard input sequence parameter instead of
> file name like this.
>
> ./needle -asequence ATGCATATAAA -bsequence ATAGAATAAA -gapopen 10
> -gapextend 0.5 -stdout -auto
All EMBOSS programs can do this. EMBOSS has a sequence format "asis"
which treats the text given in place of the file name as the sequence
itself, so your command line becomes:
./needle -asequence asis::ATGCATATAAA -bsequence asis::ATAGAATAAA
-gapopen 10 -gapextend 0.5 -stdout -auto
For long sequences your system needs to allow long command lines.
EMBOSS has no restriction on the length of the asis:: sequence but
sometimes the system gives an error message or truncates the command
line. For long sequences you can, of course, save the sequence to a file
and use the filename as input.
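
If needle is being called from a wrapper script, the choice between
asis:: and a file can be made automatically. The following is only a
minimal sketch in Python, not part of EMBOSS: the function names, the
MAX_INLINE threshold and the ">query" FASTA header are assumptions made
for illustration, and the real command line limit depends on your
system. It only builds the argument list, using the qualifiers from the
command above.

```python
import tempfile

# Hypothetical threshold: the real limit depends on the operating system.
MAX_INLINE = 10000


def sequence_input(seq, max_inline=MAX_INLINE):
    """Return an EMBOSS input for seq: asis:: inline, or a FASTA file name."""
    if len(seq) <= max_inline:
        return "asis::" + seq
    # Too long for a safe command line: save to a file, use the file name.
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as handle:
        handle.write(">query\n" + seq + "\n")
        return handle.name


def needle_args(seq_a, seq_b, gapopen=10, gapextend=0.5):
    """Build the needle argument list with the qualifiers shown above."""
    return [
        "needle",
        "-asequence", sequence_input(seq_a),
        "-bsequence", sequence_input(seq_b),
        "-gapopen", str(gapopen),
        "-gapextend", str(gapextend),
        "-stdout", "-auto",
    ]


print(" ".join(needle_args("ATGCATATAAA", "ATAGAATAAA")))
```

The list returned by needle_args can be passed to subprocess.run, which
avoids shell quoting problems with the sequence text.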
> Second, I would like to get real identity, which means the number
> of '|' in needle result.
>
> ATAAAAAA
> | | | | | | |
> ATAATAAA
> ---------------------------------------
> Real Identity = 7
Needle already reports this information. If you look in the header of
the output you will find an Identity: line, which gives the number of
positions with a '|' in the output.
#=======================================
#
# Aligned_sequences: 2
# 1: asis
# 2: asis
# Matrix: EDNAFULL
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 14
# Identity: 7/14 (50.0%)
# Similarity: 7/14 (50.0%)
# Gaps: 7/14 (50.0%)
# Score: 24.0
#
#
#=======================================
asis 1 ATGCAT---ATAAA 11
           ||   |||||
asis 1 ----ATAGAATAAA 10
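
If you need the identity count as a number in your own program, it can
be read from that header line. This is a minimal sketch in Python, not
part of EMBOSS; the function name parse_identity is made up for
illustration, and it assumes the header has the "# Identity: n/m" form
shown above.

```python
import re


def parse_identity(needle_output):
    """Return (identical_positions, alignment_length) from a needle header."""
    match = re.search(r"^#\s*Identity:\s*(\d+)/(\d+)", needle_output, re.MULTILINE)
    if match is None:
        raise ValueError("no Identity line found in needle output")
    return int(match.group(1)), int(match.group(2))


# Header lines copied from the example output above.
header = """\
# Length: 14
# Identity: 7/14 (50.0%)
# Similarity: 7/14 (50.0%)
"""

identical, length = parse_identity(header)
print(identical, length)
```

The first value is the count of '|' positions that was asked for, and
the second is the alignment length.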
You can also select an alternative output format with the -aformat
qualifier (alignment format). Most of the alignment formats also include
this header.
A list of alignment formats can be found on our website at
http://emboss.open-bio.org/html/use/ch05s04.html
(This is chapter 5.4 of the new EMBOSS User's Guide book)
Hope this helps.
Peter Rice
EMBOSS Team