[EMBOSS] fuzznuc and sequence ID
David Bauer
david.bauer at bayer.com
Mon Jun 10 11:11:21 UTC 2013
Dear Philippe,
The problem with this FASTA header line is the ":".
EMBOSS interprets anything before the ":" as the name of the database the sequence comes from, and does not treat it as part of the sequence name.
http://emboss.sourceforge.net/docs/themes/UniformSequenceAddress.html
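As a rough illustration of the effect described above, here is a simplified model in Python. It is not EMBOSS's actual parsing code; the function name `split_id` is hypothetical and only mimics the behaviour "text before the first ':' is taken as a database name".

```python
def split_id(fasta_header):
    """Simplified model of the behaviour described above (not EMBOSS code).

    Returns (database, name); database is None when the ID has no ':'.
    """
    words = fasta_header.lstrip(">").split()
    seq_id = words[0] if words else ""  # the ID is the first word after '>'
    if ":" in seq_id:
        database, name = seq_id.split(":", 1)  # split at the first ':' only
        return database, name
    return None, seq_id


print(split_id(">chr1:562520-566670"))  # → ('chr1', '562520-566670')
print(split_id(">chr1_562520-566670"))  # → (None, 'chr1_562520-566670')
```

In this model the "chr1" part ends up as the database name, which matches the fuzznuc report showing only "562520-566670" as the sequence name.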
If you replace the ":" in your input file with e.g. "_", you will get the full sequence name in the output of fuzznuc (or any other EMBOSS program).
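One possible way to do the replacement is a small Python helper like the sketch below. The function names `sanitize_header` and `sanitize_fasta` are made up for this example, and it assumes the ID is the first space-delimited word of each ">" line; only that word is changed, so any description after it and all sequence lines are left alone.

```python
def sanitize_header(line, replacement="_"):
    """Replace ':' in the ID of a FASTA header line; return other lines unchanged."""
    if not line.startswith(">"):
        return line  # sequence line, nothing to do
    # Split off the ID (first word) so a ':' in the description is kept as is.
    head, sep, rest = line.partition(" ")
    return head.replace(":", replacement) + sep + rest


def sanitize_fasta(lines, replacement="_"):
    """Apply sanitize_header to every line of a FASTA file (any iterable of lines)."""
    return [sanitize_header(line, replacement) for line in lines]


example = [
    ">chr1:562520-566670\n",
    "GGAGTGGTAGCTCTCAGTATAGTCAGCCTCTAAGAAGAGAGCAAATGTTT\n",
]
print("".join(sanitize_fasta(example)), end="")
# first line printed: >chr1_562520-566670
```

To process a real file, the same function can be applied line by line while reading the input and writing a new FASTA file, which is then passed to fuzznuc in place of the original.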
HTH,
David.
-----Original Message-----
From: emboss-bounces at lists.open-bio.org [mailto:emboss-bounces at lists.open-bio.org] On Behalf Of Philippe DESSEN
Sent: 10 June 2013 12:29
To: emboss at emboss.open-bio.org
Subject: [EMBOSS] fuzznuc and sequence ID
Dear all,
I use fuzznuc to find some patterns in an extract of the human genome, stored as a FASTA file with several parts:
>chr1:562520-566670
GGAGTGGTAGCTCTCAGTATAGTCAGCCTCTAAGAAGAGAGCAAATGTTT
ATTTTCAAGAAGAATTATGCAGAAAGGGCCACTTTCAGTCTACCATCCCC
CCAGATTCCTTGAAGGCAGGATGATGTGAGCAGCAAGGGAAGAAAGGGGA
GTGGGCACGAAATACTACAGAACCTGCAGGGAACGAAGTCCCTCTGTCTG
;..
>
Curiously, the default fuzznuc report does not show the same identifier as the FASTA file:
########################################
# Program: fuzznuc
# Rundate: Mon Jun 10 2013 11:56:52
# Commandline: fuzznuc
# [-sequence] xxxx.fdr01peaks.hg19.fasta
# -pattern xxxxxxx
# -complement
# -outfile fuzz.txt
# Report_format: seqtable
# Report_file: _fuzz.txt
########################################
#=======================================
#
# Sequence: Sequence: 562520-566670 from: 1 to: 4151
# HitCount: 0
#
#
Without the chromosome name, it is not possible to locate the segment on the full genome!
--
Best
Philippe Dessen
IGR, Villejuif, France
_______________________________________________
EMBOSS mailing list
EMBOSS at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss