[EMBOSS] Escaping query terms in a USA

Hamish McWilliam hpm at ebi.ac.uk
Fri Aug 23 12:42:24 UTC 2013

Hi David,

> It seems the index is OK; it is just the database query code that cannot
> handle the ":", which has a special meaning in USAs. As a workaround you
> can replace the ":" with a "*".
> entret -stdout -auto 'imgthla-key:A*02*364'
> will return the entry HLA08011.
> But be aware that by this you actually generate a wildcard query, so
> the * matches any single character at that position.

Unfortunately that is not going to work for this case since the HLA 
alleles use a somewhat nested nomenclature, for example:


However, a little experimentation indicates that EMBOSS supports the
single-character wildcard '?', so something like:

$ entret -stdout -auto 'imgthla-key:A?01?02'

appears to do what I want in most cases.
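
A minimal sketch of automating that substitution, assuming a POSIX shell
with tr available. The helper name usa_wildcard_term is hypothetical and
not part of EMBOSS:

```shell
# Hypothetical helper (not an EMBOSS tool): replace the characters that
# are special in a USA query term ('*' and ':') with the single-character
# wildcard '?'.
usa_wildcard_term() {
    printf '%s\n' "$1" | tr '*:' '??'
}

# Build the query for the allele name used in the original post.
term=$(usa_wildcard_term 'A*02:364')
echo "imgthla-key:${term}"    # prints imgthla-key:A?02?364

# The result could then be passed to entret as in the command above:
#   entret -stdout -auto "imgthla-key:${term}"
```

Since '?' matches any character at that position, this remains a wildcard
query rather than an exact match.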

That said, it would be better to have a way to escape the special 
characters (i.e. '*', ':' and '?') in the search term when an exact 
match is required (as in this case).
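
Until such an escape exists, one possible stopgap is to post-filter the
wildcard results for the exact keyword. A rough sketch, assuming entret
emits complete EMBL-format entries terminated by '//' and that keywords
sit on 'KW' lines, separated by ';' and ending with '.'. The name
exact_kw_filter is hypothetical and not part of EMBOSS:

```shell
# Hypothetical filter (not an EMBOSS tool): read EMBL-format entries on
# stdin and print only those with a keyword exactly equal to $1.
exact_kw_filter() {
    awk -v want="$1" '
        { entry = entry $0 "\n" }           # buffer the current entry
        /^KW / {
            line = substr($0, 6)            # drop the "KW   " line code
            sub(/\.[ \t]*$/, "", line)      # drop the closing full stop
            n = split(line, kws, /;[ \t]*/) # one element per keyword
            for (i = 1; i <= n; i++)
                if (kws[i] == want) hit = 1 # literal string comparison
        }
        /^\/\// {                           # end of entry
            if (hit) printf "%s", entry
            entry = ""; hit = 0
        }
    '
}

# Possible usage with the wildcard query from above:
#   entret -stdout -auto 'imgthla-key:A?02?364' | exact_kw_filter 'A*02:364'
```

The comparison is a plain string equality, so '*', ':' and '?' in the
allele name are treated literally at this stage.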



> Kind regards, David.
> -----Original Message-----
> From: emboss-bounces at lists.open-bio.org
> [mailto:emboss-bounces at lists.open-bio.org] On behalf of Hamish McWilliam
> Sent: 23 August 2013 11:25
> To: emboss at lists.open-bio.org
> Subject: [EMBOSS] Escaping query terms in a
> Hi folks,
> In the IMGT/HLA database (http://www.ebi.ac.uk/ipd/imgt/hla/) the
> keywords field in the EMBL-Bank format flat-file contains allele
> names like:
> A*02:364
> While I can build an index containing the keywords, it does not
> appear to be possible to search the index with the allele names. For
> example:
> $ entret -stdout -auto 'imgthla-key:Allele'
> works as expected, but:
> $ entret -stdout -auto 'imgthla-key:A*02:364'
> just gives errors:
> Error: Failed to open filename 'imgthla-key'
> Error: Unable to read sequence 'imgthla-key:A*02:364'
> Died: entret terminated: Bad value for '-sequence' with -auto defined
> I am guessing that the problem is the '*' and ':' characters in the
> term... so is there some way to escape these or are the terms in the
> index mangled in some way?
> All the best,
> Hamish

Mr Hamish McWilliam,
Web Production,
European Bioinformatics Institute (EMBL-EBI),
European Molecular Biology Laboratory,
Wellcome Trust Genome Campus,
Hinxton, Cambridge, CB10 1SD
United Kingdom

URL: http://www.ebi.ac.uk/
