[EMBOSS] case sensitive identifiers
gbottu at ben.vub.ac.be
Mon Oct 2 08:11:46 UTC 2006
On Fri, Sep 29, 2006 at 11:27:51AM +0100, Peter Rice wrote:
> So, there will be 2 new (and for the first time boolean) attributes for
> databases. To use them, you will need:
> caseidmatch: "Y"
> hasaccession: "N"
The "hasaccession" attribute is certainly useful for search methods like
SRS and MRS, which have the notion of searching in separate indexes. By
default, searching both "id" and "ac" is the thing to do, but there are
databanks where no "ac" is indexed, and there are databanks, like
EMBL or IMGTHLA, where the "id" and the "ac" are always identical, so
that searching only the "id" saves time without losing functionality.
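
To make the index choice concrete, here is a minimal Python sketch. It is
not EMBOSS code: the function name is hypothetical, and the index names
"id" and "ac" are simply the ones used in this message.

```python
from typing import List


def indexes_to_search(hasaccession: bool = True) -> List[str]:
    """Return the index names to query for one identifier lookup.

    Hypothetical helper for illustration only. With hasaccession true
    (the assumed default), both indexes are searched; with it false,
    only the "id" index is searched, which saves one lookup.
    """
    if hasaccession:
        return ["id", "ac"]
    return ["id"]
```

For a databank declared with hasaccession: "N", such a helper would return
only ["id"], so the search method is never asked for the "ac" index.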
As for the case problem, I think we agree that the best approach is to always
hand the sequence name as such (case as typed by the user) to the
search method and, in case the search method itself is not case sensitive
but the databank is, to let EMBOSS, if 'hasaccession: "Y"', parse the
retrieved sequences and accept only those that match. This will work fine
for SRS (and of course for the method "direct", where EMBOSS does all the
work), but it will not work for MRS, since the current version of MRS
does not allow index words that differ only in case.
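
The post-filtering step could look roughly like the following Python
sketch. It assumes the search method has already returned a list of
entry identifiers; the function name and the case_sensitive flag are
hypothetical and do not correspond to any actual EMBOSS routine.

```python
from typing import Iterable, List


def filter_by_case(query: str,
                   retrieved_ids: Iterable[str],
                   case_sensitive: bool = True) -> List[str]:
    """Keep only the retrieved identifiers that match the query.

    Hypothetical helper for illustration only. When case_sensitive is
    true, the identifier must equal the query exactly as the user typed
    it; otherwise the comparison ignores case.
    """
    if case_sensitive:
        return [entry for entry in retrieved_ids if entry == query]
    wanted = query.lower()
    return [entry for entry in retrieved_ids if entry.lower() == wanted]
```

If a case-insensitive search method returns both "ABC_HUMAN" and
"abc_human" for the query "abc_human", the exact comparison keeps only
the second entry. This only helps when the search method can return both
entries in the first place, which is the limitation noted for MRS.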