USAs2
Peter Rice
pmr at ebi.ac.uk
Fri Jul 2 18:38:02 UTC 2004
Hi Tamas,
Thanks for the suggestion!
It is late on Friday, so I will give it some thought over the weekend.
> I would like to know if it is possible to hack ajax to handle similar USAs
> listed below and !!!HOW!!!:
> - USA:kw=something, ft=sthelse.
> - USA:SELECT * FROM mytable WHERE..
Yes, it is possible. But still a hack ... which means we have not yet
implemented it.
This is really an extended query language. I tried to define such
extensions last year when I moved back to academia, but have not yet had
time to implement anything.
This is an excellent time to start defining extended USAs.
My plan was:
Start by thinking about the "SRS query language". You can search for
various "fields":
id (entry ID)
acc (accession number)
sv (sequence version ... and maybe GI number)
des (description)
key (keyword phrase)
org (taxonomy)
... and a few more ...
In SRS, you can use & (and), | (or), and ! (but not) to combine search terms.
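To make the idea concrete, here is a minimal sketch of how field searches combined with &, | and ! might be evaluated. Everything in it is an assumption for illustration only: the field=value spelling, the grammar (| binds loosest, & and ! are left-associative at the same level, parentheses group), the sample entries, and the function names. It is written in Python for brevity; it is not SRS syntax and not existing ajax code.

```python
import re

# Hypothetical in-memory entries; field names follow the list above.
ENTRIES = [
    {"id": "A1", "acc": "X00001", "key": "kinase", "org": "human"},
    {"id": "B2", "acc": "X00002", "key": "kinase", "org": "mouse"},
    {"id": "C3", "acc": "X00003", "key": "protease", "org": "human"},
]

# A token is an operator, a parenthesis, or a field=value term.
TOKEN = re.compile(r"\s*([&|!()]|[a-z]+=[^\s&|!()]+)")


def tokenize(query):
    query = query.strip()
    tokens, pos = [], 0
    while pos < len(query):
        m = TOKEN.match(query, pos)
        if not m:
            raise ValueError(f"bad query near {query[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(tokens):
    """Build a predicate (entry -> bool) from the token list.

    Assumed grammar:
        expr := term ('|' term)*
        term := atom (('&' | '!') atom)*
        atom := field=value | '(' expr ')'
    """
    pos = 0

    def atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            return inner
        if "=" not in tok:
            raise ValueError(f"expected field=value, got {tok!r}")
        field, value = tok.split("=", 1)
        return lambda e: e.get(field) == value

    def term():
        nonlocal pos
        left = atom()
        while pos < len(tokens) and tokens[pos] in ("&", "!"):
            op = tokens[pos]
            pos += 1
            right = atom()
            if op == "&":
                left = (lambda l, r: lambda e: l(e) and r(e))(left, right)
            else:  # '!' is read as "but not"
                left = (lambda l, r: lambda e: l(e) and not r(e))(left, right)
        return left

    def expr():
        nonlocal pos
        left = term()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            right = term()
            left = (lambda l, r: lambda e: l(e) or r(e))(left, right)
        return left

    pred = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return pred


def search(query, entries=ENTRIES):
    """Return the ids of the entries that match the query."""
    pred = parse(tokenize(query))
    return [e["id"] for e in entries if pred(e)]
```

For example, search("key=kinase ! org=mouse") would select kinase entries that are not from mouse. A real implementation would evaluate each term against an index rather than scanning every entry.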
In SRS you can also use > and < to follow links to and from other
databases. SRS has only one link between any pair of databases - I would
rather like to use named links so we can choose which links to use.
I would like to allow multiple databases in the USA. There are some
problems choosing a good syntax.
I would also like to allow multiple fields - obviously id and acc, or
combining text fields.
Then, as you suggest, some SQL-like syntax would be nice.
It looks complicated, but we can work in small steps.
In all cases, we need to make this work with "EMBLCD" indexing, with
reading flatfile data, and with any other indexing system. We can also
try to make it work with SRS and SRSWWW (easy in some cases, hard in others).
> I see you are working on pattern searches.
> It would be great to have the possibility to define patterns in the
> fuzzpro by USA: fuzzpro -pattern=USA:patt_name USA:seq
> I think the implementation of this would be useful.
> Return 'value' could be a 'fasta' pattern file:
If I understand correctly, you want to define a file of named patterns,
and select one using a "USA" syntax.
This is not so simple ... because programs usually want only one type of
pattern.
However, in ACD we can give the pattern a "knowntype" attribute so
EMBOSS (and any wrapper) knows what type of pattern is allowed.
We can then use Henrikki Almusa's pattern list to define a file of
patterns, and some pattern syntax to say which pattern(s) to use.
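As a rough illustration of selecting from a file of named patterns, here is a minimal sketch. It assumes a FASTA-like layout ('>name' header lines followed by the pattern text) and a selector that is either one name or a shell-style wildcard. The file layout, the selector syntax, the pattern names and the function names are all hypothetical; this is not the existing pattern list code, and the sample patterns are only placeholders.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern file contents in a FASTA-like layout.
SAMPLE = """\
>pkc_site example phosphorylation site
[ST]-x-[RK]
>glyco_site example glycosylation site
N-{P}-[ST]-{P}
>zinc_finger
C-x(2,4)-C-x(3)-
[LIVMFYWC]-x(8)-H-x(3,5)-H
"""


def read_patterns(text):
    """Parse '>name' headers and pattern lines into a {name: pattern} dict."""
    patterns, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            if not parts:
                raise ValueError("header line without a pattern name")
            name = parts[0]          # first word is the name, rest is a comment
            patterns[name] = ""
        elif name is None:
            raise ValueError("pattern text before the first '>' header")
        else:
            patterns[name] += line   # patterns may continue over several lines
    return patterns


def select_patterns(patterns, selector):
    """Return the patterns whose name matches the selector ('*' and '?' allowed)."""
    return {n: p for n, p in patterns.items() if fnmatchcase(n, selector)}
```

With this layout, select_patterns(read_patterns(SAMPLE), "pkc_site") picks one pattern and "*_site" picks two. The question of which pattern type a program accepts would still have to be handled separately, for example through the "knowntype" attribute.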
We do have a problem - we need to make these pattern "USAs" different
from simple patterns. We also need a name for pattern definitions. I am
sure we can think of one.
regards,
Peter Rice