[EMBOSS] seqret problem?

Tue Jul 20 10:20:26 UTC 2004

Zhiqiang Ye wrote:

> hi all,
>       I find that if there is a semicolon in the description line, seqret works wrong.

seqret works "correctly" ... but it is very confusing in this case.

The ">P1;" format is PIR format for a complete protein (fragments have
">F1;"). The rest of the line is the ID.

But in PIR format, the next line is the description - so EMBOSS will
read the first line of your sequence as a description, which is why it
appears on the first line of the FASTA format output.
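
To make the effect concrete, here is a minimal Python sketch of a
PIR-style reader. It is not the EMBOSS code: the function names
read_pir_like and to_fasta are made up for illustration, and the sketch
only assumes the layout described above (header line with a
two-character code and the ID, then one description line, then the
sequence, optionally ending in "*").

```python
def read_pir_like(text):
    """Parse one PIR-style record: header, description line, sequence.

    Hypothetical helper for illustration only, not EMBOSS's parser.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header = lines[0]
    # Expect ">" + two-character code + ";" + ID, e.g. ">P1;Z1BPC2"
    if not (header.startswith(">") and len(header) > 4 and header[3] == ";"):
        raise ValueError("not a PIR-style header")
    seq_id = header[4:].strip()
    # The second line is always taken as the description,
    # even if it is really the first line of the sequence.
    description = lines[1] if len(lines) > 1 else ""
    # Everything after that is sequence: drop spaces and a trailing "*".
    sequence = "".join("".join(lines[2:]).split()).rstrip("*")
    return seq_id, description, sequence


def to_fasta(seq_id, description, sequence, width=60):
    """Write the record as FASTA with the description on the header line."""
    header = ">" + seq_id + (" " + description if description else "")
    body = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + body) + "\n"


record = """>P1;Z1BPC2
MELTSTRKKANAITSSILNR IAIRGQRKVA DALGINESQI
SRWKGDFIPK MGMLLAVLEW GVEDEELAEL AKKVAHLLTK EKPQDCGNSF EA
"""

print(to_fasta(*read_pir_like(record)), end="")
```

Run on the test file quoted at the end of this message, the sketch
puts the first sequence line on the FASTA header line, just as seqret
does.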

You will find "seqret -sf pir" works the same way, and "seqret -sf
fasta" will complain about the sequence format.

I think there is no way to avoid this problem - because EMBOSS does not
know that the next line is not a sequence.
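
If you want to find such entries before running seqret, a small check
on your side may help. The sketch below is only an assumption about
which headers are at risk (">" followed by a two-character code and a
semicolon, such as ">P1;" or ">F1;"); it does not reproduce how EMBOSS
decides on the format, and pir_like_headers is a made-up name.

```python
import re

# Assumed risk pattern: ">" + two-character code + ";" at the start of a line.
PIR_HEADER = re.compile(r"^>[A-Z][A-Z0-9];")


def pir_like_headers(text):
    """Return (line_number, line) for each header that looks PIR-style."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if PIR_HEADER.match(line)
    ]


sample = ">P1;Z1BPC2\nMELT\n>seq2 desc; more\nACGT\n"
for number, line in pir_like_headers(sample):
    print(f"line {number}: {line}")
```

A semicolon later in the header (as in the second entry of the sample)
is not flagged by this check; only the leading ">XX;" pattern is.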

Hope that makes things clearer,

Peter Rice

  >P1;Z1BPC2
  MELTSTRKKANAITSSILNR IAIRGQRKVA DALGINESQI
  SRWKGDFIPK MGMLLAVLEW GVEDEELAEL AKKVAHLLTK EKPQDCGNSF EA

  [yezq at pro des]$ seqret test
  Reads and writes (returns) sequences
  Output sequence [z1bpc2.fasta]:
  [yezq at pro des]$ more z1bpc2.fasta

  >Z1BPC2 MELTSTRKKANAITSSILNR IAIRGQRKVA DALGINESQI
  SRWKGDFIPKMGMLLAVLEWGVEDEELAELAKKVAHLLTKEKPQDCGNSFEA