[EMBOSS] Counting the number of sequences in a file

Peter biopython at maubp.freeserve.co.uk
Tue Jul 20 20:01:24 UTC 2010


On Tue, Jul 20, 2010 at 6:04 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>
> On 20/07/10 17:27, Peter C. wrote:
>> $ countseq -sformat=genbank gbvrt1.seq
>> 31065
>
> Of course, you could just use:
>
> $ seqret -filter -sformat=genbank gbvrt1.seq | grep -c '^>'
> 31065
>
> :-)
>

Exactly what I had in mind as the workaround ("handle this by
using seqret to convert the file into FASTA and then pipe that
through grep to count the records"), although I'd not thought
about the fact that FASTA is the default output format, which
keeps it nice and short. The (Unix) command line can be great :)
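
For anyone wanting to try the counting step without EMBOSS or a GenBank
file to hand, here is a minimal sketch. The function name
count_fasta_records and the two-record sample data are made up for
illustration; only the grep -c '^>' idea comes from the thread above.

```shell
# Hypothetical helper: count FASTA records read from standard input.
count_fasta_records() {
    # Each FASTA record starts with one header line beginning with '>',
    # so counting those lines counts the records.
    # grep exits non-zero when the count is 0, so mask that status.
    grep -c '^>' || true
}

# Made-up FASTA data with two records, standing in for seqret output.
printf '>seq1\nACGT\n>seq2\nGGCC\nTTAA\n' | count_fasta_records
```

With the function defined, the pipeline quoted above could presumably be
written as:

$ seqret -filter -sformat=genbank gbvrt1.seq | count_fasta_records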

Peter C


