Index EMBL and EMBLnew
Peter Rice
pmr at ebi.ac.uk
Mon Mar 17 10:46:25 UTC 2003
yann bizouerne wrote:
> I want to work with EMBL and EMBLnew. I have indexed the EMBL files and
> the EMBLnew files in separate directories (/EMBL/ and /EMBLnew/).
>
> I did this because I only want to re-index EMBLnew when new sequences
> arrive, and not all the sequences (EMBL + EMBLnew).
>
> Now I want to query both of these databases with one request.
> How can I do this? Is this a good way to work with EMBL or not?
This is my next EMBOSS task!!!
You can of course already do this with SRS.
If you put both databases together, EMBOSS will have problems with
duplicate IDs.
For now, the EMBOSS solution is to use "whichdb", which searches all
your databases for an entry. If it reports an entry in both EMBL and
EMBLNEW, you can use the EMBLNEW entry.
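The selection rule above can be sketched in a few lines of Python. This is only an illustration, not part of EMBOSS: it assumes the whichdb hits have already been collected as "database:entry" strings, and the function name prefer_new and the sample entry name are hypothetical.

```python
def prefer_new(hits, preferred="emblnew"):
    """Choose one hit from a list of 'database:entry' strings.

    Returns the hit from the preferred database if there is one,
    otherwise the first hit, or None when the list is empty.
    The 'database:entry' format is an assumption for this sketch.
    """
    for hit in hits:
        # Split only on the first colon; the database name is compared
        # case-insensitively so "EMBLNEW" and "emblnew" are treated alike.
        db, _, _entry = hit.partition(":")
        if db.lower() == preferred.lower():
            return hit
    return hits[0] if hits else None


# Hypothetical example: the same entry found in both databases.
choice = prefer_new(["embl:X65923", "emblnew:X65923"])
```

With the two hits shown, the EMBLNEW copy is chosen; if the entry is only in EMBL, that hit is returned unchanged.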
I am planning to extend the EMBOSS "USA" syntax to include features of
the SRS query language, including a query of more than one database,
more than one field, and more than one text string (and of course ...
more than one query)
To query EMBL and EMBLNEW together you also need to exclude entries that
appear in both. I can add that by excluding from EMBL any IDs that also
occur in EMBLNEW. I hope to do this by defining an "EMBLALL" database to
make life easier for users.
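The exclusion step can be pictured as follows. This is a minimal Python sketch under stated assumptions, not how EMBOSS implements or will implement "EMBLALL": it assumes the entry IDs of each database are available as plain lists, and the function name combined_ids and the sample IDs are hypothetical.

```python
def combined_ids(embl_ids, emblnew_ids):
    """Merge two ID lists, dropping EMBL IDs that also occur in EMBLNEW.

    Returns (database, id) pairs: every EMBLNEW entry first, then the
    EMBL entries whose ID is not present in EMBLNEW. IDs are compared
    case-insensitively (an assumption made for this sketch).
    """
    new_ids = {entry_id.upper() for entry_id in emblnew_ids}
    combined = [("emblnew", entry_id) for entry_id in emblnew_ids]
    combined += [
        ("embl", entry_id)
        for entry_id in embl_ids
        if entry_id.upper() not in new_ids
    ]
    return combined


# Hypothetical IDs: ID_B has been updated, so only its EMBLNEW copy is kept.
merged = combined_ids(["ID_A", "ID_B", "ID_C"], ["ID_B", "ID_D"])
```

Each ID appears once in the result, so a query over the merged list does not run into duplicate IDs.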
SWISSPROT/SWISSNEW/SPTREMBL is more difficult because the matches have
to be by accession number. There is already a non-redundant "swall"
database available so this is not such a high priority and may have to
wait for a way to link databases in EMBOSS (but the internal EMBOSS code
does allow for this kind of extension).
Hope this helps,
Peter Rice