[EMBOSS] case sensitive identifiers

Guy Bottu gbottu at ben.vub.ac.be
Mon Oct 2 08:11:46 UTC 2006


On Fri, Sep 29, 2006 at 11:27:51AM +0100, Peter Rice wrote:
> So, there will be 2 new (and for the first time boolean) attributes for 
> databases. To use them, you will need:
> 
> caseidmatch: "Y"
> hasaccession: "N"

The "hasaccession" attribute is certainly useful for search methods like 
SRS and MRS, which have the notion of searching separate indexes. By 
default, searching both "id" and "ac" is the thing to do, but there are 
databanks where no "ac" is indexed, and there are databanks, like 
EMBL or IMGTHLA, where the "id" and the "ac" are always identical, so 
that searching only the "id" saves time without losing functionality.
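
To make the index choice concrete, here is a minimal sketch in Python. It is 
not EMBOSS code: the function names and the index labels "id" and "ac" are 
assumptions for illustration, and the "Y"/"N" parsing is only a guess at how 
such a boolean attribute could be read.

```python
from typing import List


def parse_boolean_attribute(value: str) -> bool:
    """Interpret an attribute value such as "Y" or "N" (hypothetical helper)."""
    # Assumption: anything starting with "Y" or "y" counts as true.
    return value.strip().upper().startswith("Y")


def indexes_to_search(has_accession: bool = True) -> List[str]:
    """Return the index names a query should be run against."""
    # Default: search both the identifier and the accession index.
    # With hasaccession "N", only the identifier index is queried,
    # which saves one lookup per query.
    return ["id", "ac"] if has_accession else ["id"]


if __name__ == "__main__":
    print(indexes_to_search(parse_boolean_attribute("N")))
```

For a databank where "id" and "ac" are always identical, passing "N" halves 
the number of index lookups without changing which entries can be found.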

As for the case problem, I think we agree that the best approach is to always 
hand the sequence name as is (case as typed by the user) to the 
search method and, in case the search method itself is not case sensitive 
but the databank is, to let EMBOSS, if 'hasaccession: "Y"', parse the 
retrieved sequences and accept only those that match. This will work fine 
for SRS (and of course for the method "direct", where EMBOSS does all the 
work), but it will not work for MRS, since the current version of MRS 
does not allow index words that differ only in case.
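
The post-filtering step could look roughly like the Python sketch below. 
Again this is not EMBOSS code: the function name, the entry layout (a dict 
with "id" and "ac" keys) and the flag name are assumptions, the flag standing 
in for whichever database attribute switches the check on.

```python
from typing import Dict, Iterable, List


def filter_by_case(
    query: str,
    entries: Iterable[Dict[str, str]],
    case_sensitive: bool,
) -> List[Dict[str, str]]:
    """Keep only the retrieved entries whose id or ac matches the query."""
    kept = []
    for entry in entries:
        names = (entry.get("id", ""), entry.get("ac", ""))
        if case_sensitive:
            # Exact comparison: the case typed by the user must match.
            matched = query in names
        else:
            # Case-insensitive comparison: accept whatever was returned.
            matched = query.lower() in (name.lower() for name in names)
        if matched:
            kept.append(entry)
    return kept


if __name__ == "__main__":
    # Hypothetical result of a case-insensitive search for "ABC1".
    hits = [
        {"id": "abc1", "ac": "X00001"},
        {"id": "ABC1", "ac": "X00002"},
    ]
    print(filter_by_case("ABC1", hits, case_sensitive=True))
```

The filter can only discard entries the search method has returned, so it 
cannot help where the index itself cannot hold two words that differ only 
in case.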

	Guy



