Re: [EMBOSS] seqret options

David.Bauer at SCHERING.DE David.Bauer at SCHERING.DE
Wed Jun 15 11:19:55 UTC 2005


Hi Derek,

You can easily change this in the source code.
The sequence output formats are defined in ajax/ajseqwrite.c.
In the function seqWritePhylip3 you will find the line:
ajFmtPrintF(outseq->File, "1 %d YF\n", ilen);
Here you can just delete the YF and recompile EMBOSS.

David.



                                                                                                                                 
Derek Gatherer <d.gatherer at vir.gla.ac.uk>
Sent by: owner-emboss at hgmp.mrc.ac.uk
15.06.2005 12:31

To:      emboss at embnet.org
Cc:
Subject: [EMBOSS] seqret options




Dear EMBOSSers

I'm trying to write a pipeline to take a load of paired, aligned homologues
from 2 species and submit them sequentially to the yn00 application from
the well-known PAML package.  PAML's applications all take PHYLIP
format.  I can easily make this by looping over:

seqret -auto -osformat phylip infile -out outfile
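
A minimal sketch of such a loop in Python, for illustration only. The command line is the one quoted above; the function names, the "*.fasta" input pattern and the ".phy" output suffix are hypothetical choices, not anything seqret requires.

```python
import subprocess
from pathlib import Path


def seqret_command(infile, outfile, osformat="phylip"):
    """Build the seqret command line quoted above as an argument list."""
    return ["seqret", "-auto", "-osformat", osformat,
            str(infile), "-out", str(outfile)]


def convert_all(indir, outdir, pattern="*.fasta"):
    """Run seqret once per alignment file (directory layout is assumed)."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    for infile in sorted(Path(indir).glob(pattern)):
        outfile = outdir / (infile.stem + ".phy")
        # check=True stops the loop at the first file seqret cannot convert
        subprocess.run(seqret_command(infile, outfile), check=True)
```

Calling convert_all("alignments", "phylip") would then write one PHYLIP file per input alignment, provided seqret is on the PATH.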

However, PAML requires that the flag "I" be placed on the top line of the
phylip format to indicate interleaved, e.g.:

  2 663 I
c-barf1  ATGGCCAGGC TTTTCGCTCA GCTGCTCCTG CTCGCGGGCT CCGTCGCCTC
barf1     ATGGCCAGGT TCATCGCTCA GCTCCTCCTG TTGGCCTCCT GTGTGGCCGC

           CTGCCTGGCC GTCACCGCCT TTGTGGGTGA GCGGGCCGTC CTGAGTTCCT
           CGGCCAGGCT GTCACCGCTT TCTTGGGTGA GCGAGTCACC CTGACCTCCT

rather than the standard phylip format, given by seqret:

  2 663
c-barf1   ATGGCCAGGC TTTTCGCTCA GCTGCTCCTG CTCGCGGGCT CCGTCGCCTC
barf1     ATGGCCAGGT TCATCGCTCA GCTCCTCCTG TTGGCCTCCT GTGTGGCCGC

           CTGCCTGGCC GTCACCGCCT TTGTGGGTGA GCGGGCCGTC CTGAGTTCCT
           CGGCCAGGCT GTCACCGCTT TCTTGGGTGA GCGAGTCACC CTGACCTCCT

I could write a script to open each seqret output file and add this
character to the top line of each, but before I dive into this, I'd like to
know if there is any flag I can add to seqret to get the "I" added
automatically.
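
For reference, the scripted fallback could be as small as the sketch below. It assumes the header is the first non-blank line and consists of exactly two integers, as in the seqret output shown above; the function name is hypothetical.

```python
def add_interleaved_flag(phylip_text, flag="I"):
    """Append a flag to the header line of PHYLIP-formatted text."""
    lines = phylip_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip():
            fields = line.split()
            # Only touch a bare "<nseq> <nsites>" header, so a file that
            # already carries a flag is left unchanged.
            if len(fields) == 2 and all(f.isdigit() for f in fields):
                lines[i] = line.rstrip() + " " + flag
            break
    return "\n".join(lines) + "\n"
```

Each seqret output file would be read, passed through this function and written back before being handed to yn00.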

Failing that, PAML takes the other, non-interleaved phylip format
("sequential") by default, and that would not require any flag
insertion.  seqret can also produce this (using -osformat phylip3):

1 663 YF
c-barf1 ATGGCCAGGC TTTTCGCTCA GCTGCTCCTG CTCGCGGGCT CCGTCGCCTC
           CTGCCTGGCC GTCACCGCCT TTGTGGGTGA GCGGGCCGTC CTGAGTTCCT
           ACTGGAAGAG GGTGAGCCTA GGGCCCGAGA TCATGGTGGA ATGGTTCAAA

but then PAML won't read it because it doesn't like the YF flags inserted
by seqret!!
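
Stripping the flags afterwards is a similarly small job. The sketch below assumes the header is the first non-blank line and starts with the two counts; the function name is hypothetical, and whether yn00 then accepts the file has not been checked here.

```python
def strip_header_flags(phylip_text):
    """Keep only the two leading counts on the PHYLIP header line."""
    lines = phylip_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip():
            fields = line.split()
            if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
                # Rebuild the header from the counts alone; trailing flags
                # such as "YF" and any leading whitespace are dropped.
                lines[i] = " ".join(fields[:2])
            break
    return "\n".join(lines) + "\n"
```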

So I either have to script to remove flags from sequential or insert them
in interleaved, unless seqret has a solution.

All assistance gratefully appreciated
Derek







