[EMBOSS] Escaping query terms in a USA
hpm at ebi.ac.uk
Fri Aug 23 12:42:24 UTC 2013
> it seems the index is OK; it is just that the database query code cannot
> handle the ":", which has a special meaning in USAs. As a workaround you
> can replace the ":" with a "*".
> entret -stdout -auto 'imgthla-key:A*02*364'
> will return the entry HLA08011.
> But be aware that this actually generates a wildcard query, so
> each * matches any run of characters at that position.
Unfortunately that is not going to work for this case since the HLA
alleles use a somewhat nested nomenclature, for example:
However, a little experimentation indicates that EMBOSS supports the
single-character wildcard '?', so something like:
$ entret -stdout -auto 'imgthla-key:A?01?02'
appears to do what I want in most cases.
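As a rough sketch of that workaround, the substitution can be wrapped in a small shell function so the allele name can be typed as it appears in the database. The function name usa_wildcard_query is made up for illustration and is not part of EMBOSS; it only builds the query string and does not call entret itself.

```shell
# Hypothetical helper (not part of EMBOSS): build a wildcard USA from an
# index field name and a literal term by replacing each '*', ':' and '?'
# in the term with the single-character wildcard '?'.
usa_wildcard_query() {
    field=$1
    term=$2
    printf '%s:%s\n' "$field" "$(printf '%s' "$term" | tr '*:?' '???')"
}

# Example: prints imgthla-key:A?02?364
usa_wildcard_query imgthla-key 'A*02:364'
```

The result could then be passed on, for example: entret -stdout -auto "$(usa_wildcard_query imgthla-key 'A*02:364')". Since each '?' still matches any character, this remains a wildcard query rather than an exact match.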
That said, it would be better to have a way to escape the special
characters (i.e. '*', ':' and '?') in the search term when an exact
match is required (as in this case).
> Kind regards, David.
> -----Original Message-----
> From: emboss-bounces at lists.open-bio.org
> [mailto:emboss-bounces at lists.open-bio.org] On Behalf Of Hamish McWilliam
> Sent: 23 August 2013 11:25
> To: emboss at lists.open-bio.org
> Subject: [EMBOSS] Escaping query terms in a
> Hi folks,
> In the IMGT/HLA database (http://www.ebi.ac.uk/ipd/imgt/hla/) the
> keywords field in the EMBL-Bank format flat-file contains allele
> names like:
> While I can build an index containing the keywords, it does not
> appear to be possible to search the index with the allele names. For
> example:
> $ entret -stdout -auto 'imgthla-key:Allele'
> works as expected, but:
> $ entret -stdout -auto 'imgthla-key:A*02:364'
> just gives errors:
> Error: Failed to open filename 'imgthla-key'
> Error: Unable to read sequence 'imgthla-key:A*02:364'
> Died: entret terminated: Bad value for '-sequence' with -auto defined
> I am guessing that the problem is the '*' and ':' characters in the
> term... so is there some way to escape these or are the terms in the
> index mangled in some way?
> All the best,
Mr Hamish McWilliam,
European Bioinformatics Institute (EMBL-EBI),
European Molecular Biology Laboratory,
Wellcome Trust Genome Campus,
Hinxton, Cambridge, CB10 1SD