[EMBOSS] seqret options

Derek Gatherer d.gatherer at vir.gla.ac.uk
Wed Jun 15 10:31:33 UTC 2005


Dear EMBOSSers

I'm trying to write a pipeline to take a load of paired, aligned homologues 
from 2 species and submit them sequentially to the yn00 application from 
the well-known PAML package.  PAML's applications all take PHYLIP 
format.  I can easily make this by looping over:

seqret -auto -osformat phylip infile -out outfile

However, PAML requires the flag "I" on the top line of a PHYLIP file to 
indicate that it is interleaved, e.g.:

  2 663 I
c-barf1   ATGGCCAGGC TTTTCGCTCA GCTGCTCCTG CTCGCGGGCT CCGTCGCCTC
barf1     ATGGCCAGGT TCATCGCTCA GCTCCTCCTG TTGGCCTCCT GTGTGGCCGC

           CTGCCTGGCC GTCACCGCCT TTGTGGGTGA GCGGGCCGTC CTGAGTTCCT
           CGGCCAGGCT GTCACCGCTT TCTTGGGTGA GCGAGTCACC CTGACCTCCT

rather than the standard phylip format, given by seqret:

  2 663
c-barf1   ATGGCCAGGC TTTTCGCTCA GCTGCTCCTG CTCGCGGGCT CCGTCGCCTC
barf1     ATGGCCAGGT TCATCGCTCA GCTCCTCCTG TTGGCCTCCT GTGTGGCCGC

           CTGCCTGGCC GTCACCGCCT TTGTGGGTGA GCGGGCCGTC CTGAGTTCCT
           CGGCCAGGCT GTCACCGCTT TCTTGGGTGA GCGAGTCACC CTGACCTCCT

I could write a script to open each seqret output file and add this 
character to the top line of each, but before I dive into this, I'd like to 
know if there is any flag I can add to seqret to get the "I" added 
automatically.
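
For reference, the fallback script I have in mind would be roughly the 
following Python sketch.  The function name add_interleaved_flag is just my 
own placeholder, and it assumes the first line of the seqret output holds 
the sequence count and the site count and nothing else that matters:

```python
def add_interleaved_flag(phylip_text: str, flag: str = "I") -> str:
    """Append a flag to the header line of PHYLIP text unless it is already there."""
    lines = phylip_text.splitlines()
    if not lines:
        return phylip_text
    fields = lines[0].split()
    # Assumed header layout: <number of sequences> <number of sites> [flags]
    if len(fields) < 2 or not (fields[0].isdigit() and fields[1].isdigit()):
        raise ValueError(f"not a PHYLIP header: {lines[0]!r}")
    if flag not in fields[2:]:
        lines[0] = lines[0].rstrip() + " " + flag
    # Sequence lines and blank separator lines are passed through untouched.
    return "\n".join(lines) + "\n"
```

Each seqret output file would then be read, passed through this function 
and written back inside the same loop that calls seqret.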

Failing that, PAML takes the other, non-interleaved PHYLIP format 
("sequential") by default, and that would not require any flag 
insertion.  seqret can also produce this (using -osformat phylip3):

1 663 YF
c-barf1 ATGGCCAGGC TTTTCGCTCA GCTGCTCCTG CTCGCGGGCT CCGTCGCCTC
           CTGCCTGGCC GTCACCGCCT TTGTGGGTGA GCGGGCCGTC CTGAGTTCCT
           ACTGGAAGAG GGTGAGCCTA GGGCCCGAGA TCATGGTGGA ATGGTTCAAA

but then PAML won't read it because it doesn't like the YF flags inserted 
by seqret!!
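
The script alternative here would be something like this Python sketch, 
which drops everything after the two counts on the header line.  The name 
strip_header_flags is my own placeholder, and I have not confirmed that 
PAML accepts the result:

```python
def strip_header_flags(phylip_text: str) -> str:
    """Reduce the PHYLIP header line to its two counts, dropping any flags."""
    lines = phylip_text.splitlines()
    if not lines:
        return phylip_text
    header = lines[0]
    fields = header.split()
    # Assumed header layout: <number of sequences> <number of sites> [flags]
    if len(fields) < 2 or not (fields[0].isdigit() and fields[1].isdigit()):
        raise ValueError(f"not a PHYLIP header: {header!r}")
    # Keep whatever leading whitespace the header had, then the two counts only.
    indent = header[: len(header) - len(header.lstrip())]
    lines[0] = f"{indent}{fields[0]} {fields[1]}"
    return "\n".join(lines) + "\n"
```

Only the first line is changed; the sequence lines are left exactly as 
seqret wrote them.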

So I either have to write a script to remove the flags from the sequential 
output or to insert one into the interleaved output, unless seqret has a 
solution.

All assistance gratefully appreciated
Derek



