[EMBOSS] meaning of "(Reversed)" in diffseq?
Francoeur, Joseph A.
jfrancoe at mitre.org
Wed Mar 25 21:15:13 UTC 2009
Thanks, Andres, but I don't think that explains it. The label "(Reversed)" only appears for selected output lines in the diffseq report. The documentation you cited just states that the reverse ordering of the lines always occurs in the report.
From: Andres Pinzon [mailto:andrespinzon at gmail.com]
Sent: Wednesday, March 25, 2009 5:04 PM
To: Francoeur, Joseph A.
Cc: emboss at emboss.open-bio.org
Subject: Re: [EMBOSS] meaning of "(Reversed)" in diffseq?
I extracted this from the EMBOSS documentation; perhaps it will help you (especially the last paragraph):
Each report consists of 4 or more lines.
* The first line has the name of the first sequence followed by the start and end positions of the mismatched region in that sequence, followed by the length of the mismatched region. If the mismatched region is of zero length in this sequence, then only the position of the last matching base before the mismatch is given.
* If a feature of the first sequence overlaps with this mismatch region, then one or more lines starting with 'Feature:' comes next with the type, position and tag field of the feature.
* Next is a line starting "Sequence:" giving the sequence of the mismatch in the first sequence.
This is followed by the equivalent information for the second sequence, but in the reverse order, namely 'Sequence:' line, 'Feature:' lines and line giving the position of the mismatch in the second sequence.[...]
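The line order described above can be sketched in Python. This is only an illustration under stated assumptions: the sample report block and the helper names (classify_line, split_report, lines_with_reversed_label) are invented here and are not taken from diffseq or its documentation. The sketch relies only on the 'Feature:' and 'Sequence:' prefixes quoted above, and it says nothing about why "(Reversed)" appears; it merely finds lines carrying that label.

```python
def classify_line(line):
    """Label a report line using the prefixes named in the documentation."""
    text = line.strip()
    if text.startswith("Feature:"):
        return "feature"
    if text.startswith("Sequence:"):
        return "sequence"
    return "position"  # anything else is taken to be a name/position line


def split_report(lines):
    """Split one mismatch report into (first_sequence, second_sequence) parts.

    Assumes the documented order: position, Feature*, Sequence for the first
    sequence, then Sequence, Feature*, position for the second.
    """
    kinds = [classify_line(line) for line in lines]
    cut = kinds.index("sequence") + 1  # first 'Sequence:' line ends part one
    return lines[:cut], lines[cut:]


def lines_with_reversed_label(lines):
    """Return the lines that carry the literal "(Reversed)" label."""
    return [line for line in lines if "(Reversed)" in line]


# Hypothetical report block, invented for illustration only.
EXAMPLE_REPORT = [
    "seqA 10-12 Length: 3",
    "Feature: CDS 1-50 example_tag",
    "Sequence: acg",
    "Sequence: ttt",
    "seqB 10-12 Length: 3",
]

if __name__ == "__main__":
    first, second = split_report(EXAMPLE_REPORT)
    print("first sequence part:", first)
    print("second sequence part:", second)
    print("labelled lines:", lines_with_reversed_label(EXAMPLE_REPORT))
```

Running lines_with_reversed_label over the lines of a real .diffseq file would show which of the line types (position, 'Feature:' or 'Sequence:') actually carries the label, which may help narrow down where it comes from.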
On Wed, Mar 25, 2009 at 4:47 PM, Francoeur, Joseph A. <jfrancoe at mitre.org> wrote:
I ran diffseq on 2 FASTA-formatted DNA sequence files on my local installation of EMBOSS, and I have entries in the .diffseq output file labeled as "(Reversed)". Looking at both sequences in an editor, neither of these sequence segments is reversed (the "Sequence" string in the diffseq file matches the original FASTA file). Does anyone know what is meant by this? I'm analyzing the EMBOSS source code, and it looks as though it's related to a strand feature. That's puzzling me, because my local installation has no databases installed, and it seems that this kind of information wouldn't be embedded in the FASTA file. How does diffseq determine this strand information, and how should I interpret it?
Joseph A. Francoeur
Senior Software Systems Engineer
The MITRE Corporation
202 Burlington Road MS K228
Bedford, MA 01730-1420
e-mail: jfrancoe at mitre.org
EMBOSS mailing list
EMBOSS at lists.open-bio.org
Bioinformatics Center, Colombia EMBnet node
Tel +57 3165000 ext 16961 Fax +571 3165415
Micology and Phytopathology Laboratory - Los Andes University.
Tel +571 3394949 ext. 2768