[emboss-dev] USA syntax and `%' character in sequence file names

Nicolas Joly njoly at pasteur.fr
Tue Nov 4 14:16:35 UTC 2008

On Mon, Nov 03, 2008 at 11:19:44AM +0000, Peter Rice wrote:
> Nicolas Joly wrote:
> >On Thu, Oct 23, 2008 at 10:47:54PM +0100, ajb at ebi.ac.uk wrote:
> >>Hi Nicolas,
> >>
> >>What it does, given a USA like:
> >>
> >>    foo%10
> >>
> >>is to seek 10 bytes into file foo and try to start
> >>reading a sequence from there. It does not, however, currently check that
> >>what appears after the '%' is a valid number. I believe invalid numbers
> >>are equivalent to an offset of 0.
> >>
> >>I suspect it might have been intended as a useful debugging tool for
> >>the programmer rather than as something for the biologist.
> >>If we leave it as an option we ought to mention it in the documentation
> >>in some form though.
> >
> >Thanks, Alan. Personally, I would get rid of it. But if you plan to
> >keep it, please check for valid numbers before using it.
> We do need it - for saving USAs when reading files.
> For example, sequence file formats where the ID is not unique or has to be 
> generated. Also potentially useful together with the offsets stored by the 
> database indexing systems and for future use with other data types.
> Yes, we will fix it to check that the number is valid... and add to the 
> documentation.

Ok. Thanks.
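
For reference, the check discussed above could look roughly like the
following. This is only a sketch in Python, not the EMBOSS C code: the
helper names split_usa_offset and read_from_usa are hypothetical, and it
assumes that the offset is whatever follows the last '%' and that it must
consist of decimal digits only. Anything else is rejected instead of being
silently treated as an offset of 0.

```python
def split_usa_offset(usa):
    """Split 'file%offset' into (filename, offset).

    Hypothetical helper, not part of EMBOSS. Assumes the offset is the
    text after the last '%' in the USA.
    """
    name, sep, tail = usa.rpartition("%")
    if not sep:
        # No '%' at all: plain file name, start reading at byte 0.
        return usa, 0
    if not name:
        raise ValueError("missing file name in USA: %r" % usa)
    if not (tail.isascii() and tail.isdigit()):
        # Covers 'foo%', 'foo%bar', 'foo%-3', 'foo%1.5', ...
        raise ValueError("invalid byte offset %r in USA: %r" % (tail, usa))
    return name, int(tail)


def read_from_usa(usa):
    """Seek to the byte offset given in the USA and return the rest of the file."""
    name, offset = split_usa_offset(usa)
    with open(name, "rb") as handle:
        handle.seek(offset)
        return handle.read()
```

With this, "foo%10" gives the file "foo" and offset 10, while "foo%bar"
raises an error. A file whose name really contains a '%' would still need
some other way to be referenced, which the sketch does not address.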

Nicolas Joly

Biological Software and Databanks.
Institut Pasteur, Paris.
