[EMBOSS] case sensitive identifiers

Guy Bottu gbottu at ben.vub.ac.be
Thu Sep 28 13:57:40 UTC 2006


	Dear colleagues,

Thure Etzold, the developer of SRS, once said "You cannot imagine anything 
that crazy or there is at least one database manager who really does it". 
While trying to load into our MRS server databanks containing the protein 
and nucleic acid sequences extracted from the PDB, I ran into the following 
problem: some identifiers differ only by case. E.g. there is a 1fnt_A and 
a 1fnt_a. Now, most bioinformatics software is not case sensitive. As I 
understand it, MRS stores indices so that they can be displayed in their 
original case, but they can only be searched case-insensitively; it 
automatically modifies redundant indices, e.g. 1fnt_a is stored as 
1fnt_a_12835. This is, however, not ideal. Should MRS be adapted so that 
it can handle case-sensitive indices? That would not solve everything, 
since other software such as EMBOSS or GCG is also case insensitive. My 
idea is to let the MRS parser store 1fnt_aLC (LC means lowercase) as the 
identifier. A user can then search for the sequence he needs in MRS and, 
in EMBOSS (if the EMBOSS installation uses MRS as its databank access 
mechanism), ask for the sequence pdbprot:1fnt_alc. This would of course 
also work with 1fnt_a_12835, but it avoids the use of a meaningless and 
irreproducible number. Does anybody have a comment?
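
To make the proposal concrete, here is a minimal sketch in Python of the 
renaming rule. It is not MRS parser code: the function name 
disambiguate_ids and the suffix parameter are hypothetical, and it assumes 
identifiers of the form <pdbcode>_<chain> where only the chain part 
differs by case.

```python
from collections import defaultdict


def disambiguate_ids(ids, suffix="LC"):
    """Map each original identifier to the identifier that would be stored.

    Hypothetical helper, not part of MRS or EMBOSS. Identifiers that are
    unique when compared case-insensitively are kept unchanged. Within a
    group of identifiers that collide case-insensitively, the ones whose
    chain part (the text after the last underscore) is lowercase get the
    suffix appended.
    """
    # Group identifiers by their lowercase form to find collisions.
    groups = defaultdict(list)
    for ident in ids:
        groups[ident.lower()].append(ident)

    stored = {}
    for members in groups.values():
        if len(members) == 1:
            # No collision: store the identifier as it is.
            stored[members[0]] = members[0]
            continue
        for ident in members:
            chain = ident.rsplit("_", 1)[-1]
            stored[ident] = ident + suffix if chain.islower() else ident

    # Assumption check: the stored identifiers must now be unique even
    # when compared case-insensitively, otherwise the rule is not enough.
    seen = {}
    for original, new in stored.items():
        key = new.lower()
        if key in seen:
            raise ValueError(
                f"{original!r} and {seen[key]!r} still collide as {key!r}"
            )
        seen[key] = original
    return stored


mapping = disambiguate_ids(["1fnt_A", "1fnt_a", "1abc_B"])
print(mapping["1fnt_a"])  # prints 1fnt_aLC
```

Unlike a running number such as 12835, the stored name depends only on 
the original identifier, so it stays the same each time the databank is 
rebuilt.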

	Regards,
	Guy Bottu,
	Belgian EMBnet Node



