Index EMBL and EMBLnew

Peter Rice pmr at ebi.ac.uk
Mon Mar 17 10:46:25 UTC 2003


yann bizouerne wrote:
> I want to work with EMBL and EMBLnew. I have indexed the EMBL files and 
> the EMBLnew files in separate directories (/EMBL/ and /EMBLnew/).
> 
> I did this because I only want to re-index EMBLnew when new sequences 
> arrive, rather than all the sequences (EMBL + EMBLnew).
> 
> Now I want to query both of these databases with a single request. 
> How can I do that? Is this a good way to work with EMBL or not?

This is my next EMBOSS task!!!

You can of course already do this with SRS.

If you put both databases together, EMBOSS will have problems with 
duplicate IDs.

For now, the EMBOSS solution is to use "whichdb", which will search all 
your databases for an entry. If it reports an entry in both EMBL and 
EMBLNEW, you can use the EMBLNEW entry.
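
As a rough illustration of that rule, the sketch below picks one reference 
from a list of hits for the same entry, preferring EMBLNEW over EMBL. It 
assumes the hits are already available as "database:entry" strings; the 
function name, the input format and the example entry name are assumptions 
made for this example, not part of EMBOSS.

```python
def pick_preferred(hits, priority=("emblnew", "embl")):
    """Return the hit from the highest-priority database, or None.

    `hits` is a list of hypothetical "database:entry" strings that all
    refer to the same entry. Database names are compared without regard
    to case.
    """
    by_db = {}
    for hit in hits:
        db, sep, _entry = hit.partition(":")
        if not sep:
            continue  # skip anything that is not in database:entry form
        # Keep the first hit seen for each database.
        by_db.setdefault(db.strip().lower(), hit.strip())
    for db in priority:
        if db in by_db:
            return by_db[db]
    return None


# "ENTRY1" is a made-up entry name used only for illustration.
print(pick_preferred(["embl:ENTRY1", "emblnew:ENTRY1"]))
```

With both databases reporting the entry, the EMBLNEW reference is returned; 
if only EMBL reports it, the EMBL reference is used.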

I am planning to extend the EMBOSS "USA" syntax to include features of 
the SRS query language, including queries of more than one database, 
more than one field, and more than one text string (and of course ... 
more than one query).

To query EMBL and EMBLNEW together you also need to exclude entries that 
appear in both. I can add that by excluding from EMBL any ID that also 
appears in EMBLNEW. I hope to do this by defining an "EMBLALL" database 
to make life easier for users.
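
The exclusion step could look roughly like the sketch below: IDs present in 
EMBLNEW are dropped from the EMBL side before the two sets are combined. The 
dictionaries mapping ID to record, the function name and the IDs are 
hypothetical stand-ins for real index lookups; this is not how EMBOSS 
implements it.

```python
def merge_excluding_duplicates(embl, emblnew):
    """Combine two ID -> record mappings; EMBLNEW wins on matching IDs.

    IDs are matched without regard to case. Both arguments are
    hypothetical in-memory stand-ins for the real database indexes.
    """
    new_ids = {entry_id.upper() for entry_id in emblnew}
    # Keep only the EMBL entries whose ID does not also occur in EMBLNEW.
    merged = {
        entry_id: record
        for entry_id, record in embl.items()
        if entry_id.upper() not in new_ids
    }
    merged.update(emblnew)
    return merged


# Made-up IDs and records, for illustration only.
embl = {"ID0001": "release version", "ID0002": "release version"}
emblnew = {"ID0001": "updated version"}
print(merge_excluding_duplicates(embl, emblnew))
```

Filtering the EMBL side first, instead of simply overwriting, keeps the 
rule explicit: an ID found in EMBLNEW is never served from EMBL.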

SWISSPROT/SWISSNEW/SPTREMBL is more difficult because the matches have 
to be by accession number. There is already a non-redundant "swall" 
database available so this is not such a high priority and may have to 
wait for a way to link databases in EMBOSS (but the internal EMBOSS code 
does allow for this kind of extension).

Hope this helps,

Peter Rice



