[EMBOSS] Nthseq issue

Peter Rice pmr at ebi.ac.uk
Thu Jan 22 08:35:50 UTC 2009


Scott Hazelhurst wrote:
> 
> I don't know whether this is a bug or a feature, but I discovered that
> nthseq skips empty sequences in its counting. So if you have 10 sequences
> and the fifth is empty, then nthseq -number 6 actually returns the 7th
> sequence. It does print a warning that the sequence is empty, but not
> that it's skipping it (and if you are running this in a pipeline you
> wouldn't see the warning). I couldn't find any documentation on this.
> 
> I found this problem in a data set from some collaborators: we ran dust and
> then used biosed to remove Ns. Obviously this leaves some sequences
> unusable. While it is understandable why nthseq behaves the way it does,
> the problem is that in an automated setup it may be difficult to make the
> adjustment.

We will take a look. Zero-length sequences are routinely ignored in
EMBOSS. We will check whether it is possible to use an alternative method
for counting in nthseq and any other application that counts input sequences.

Of course, if the nth sequence is empty nthseq would have to return a 
failure to read it.
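
For illustration only, here is a minimal sketch of position-based counting
done outside EMBOSS, assuming plain FASTA input. It is not EMBOSS code: the
function names read_fasta and nth_record are hypothetical, and it simply
counts every record, empty or not, and reports a failure if the requested
record has no residues.

```python
def read_fasta(lines):
    """Yield (header, sequence) for every FASTA record, including empty ones."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, even if it had no residues.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def nth_record(lines, n):
    """Return the nth record (1-based), counting empty records as well."""
    for position, (header, seq) in enumerate(read_fasta(lines), start=1):
        if position == n:
            if not seq:
                # Fail loudly instead of silently moving on to the next record.
                raise ValueError(f"record {n} ({header}) is empty")
            return header, seq
    raise IndexError(f"input has fewer than {n} records")
```

With ten records of which the fifth is empty, nth_record(lines, 6) returns
the record that is sixth in the file, and nth_record(lines, 5) raises an
error rather than returning a neighbouring sequence.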

regards,

Peter Rice


