[EMBOSS] FW: Reducing a FASTA repository, new user

Peter Rice pmr at ebi.ac.uk
Tue Feb 15 08:59:20 UTC 2011


On 14/02/2011 23:35, Marvin Stodolsky wrote:
>   This is elementary I’m sure, but I’ve been unable to work out the
> syntax  from the documentation.
> More minor issue.
>
> When using infoseq to extract all the FASTA headers from a sequence
> repository, the GeneBegin..GeneEnd range (like 234466..234589) often fails
> to come out as a uniform field or fields in the resulting spreadsheet. Is
> there a fix for this?

I don't see the genebegin and geneend in EMBOSS infoseq output. Are they 
part of the sequence ID in the FASTA file?

By default infoseq writes its output in aligned columns. To separate the 
items with a delimiter character instead, use:

  -nocolumn

on the command line.

For import into a spreadsheet you can set the delimiter to be tab with:

  -nocolumn -delimiter "\t"

on the command line. Each item should then land in its own spreadsheet 
column on import.
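
If the begin..end range turns out to be embedded in the sequence ID or 
description, one option is to post-process the tab-delimited output before 
loading it into the spreadsheet. The following is only a minimal sketch in 
Python: the sample lines, the function name split_gene_range and the 
assumption that the range sits in the first field are all hypothetical and 
are not taken from real infoseq output.

```python
import csv
import io
import re

# Matches a coordinate range such as 234466..234589 anywhere in a field.
RANGE_RE = re.compile(r"(\d+)\.\.(\d+)")


def split_gene_range(rows, field=0):
    """Return copies of rows with two extra columns, begin and end,
    parsed from rows[i][field]. Both are left blank when no range is found,
    so every output row has the same number of columns."""
    out = []
    for row in rows:
        match = RANGE_RE.search(row[field]) if field < len(row) else None
        begin, end = match.groups() if match else ("", "")
        out.append(list(row) + [begin, end])
    return out


# Hypothetical tab-delimited lines standing in for saved infoseq output.
sample = "geneA_234466..234589\t124\ngeneB\t300\n"

rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))
fixed = split_gene_range(rows)
for row in fixed:
    print("\t".join(row))
```

To use it on a real file, read the saved output with csv.reader in the same 
way and adjust the field index to whichever column holds the range.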

Hope that helps

Peter Rice
EMBOSS Team


