[BioSQL-l] case sensitivity in biosql accessions under BioJavax/Hibernate

James Procter jimp at compbio.dundee.ac.uk
Thu Dec 4 16:17:05 UTC 2008


Hi Hilmar
Hilmar Lapp wrote:
<SNIP>
> Sorry for chiming in a bit late here. Accessions are case sensitive in
> BioSQL at the level of the relational model.
Ah. OK. This is a definitive answer.
> In fact, this is enforced for MySQL (which unilaterally chose to treat
> the SQL VARCHAR datatype as case-insensitive) by making the type VARCHAR
> BINARY.
I did wonder about that. This makes sense.
> I'm rather disinclined to change that, I have to say. I realize that
> many (all?) of the databases we typically use treat accessions as
> case-insensitive. But I doubt that that's part of the specs in each
> case, and there is no standard that would oblige future databases to do
> the same.
> 
> Rather, I think it's application (or data source) level semantics, and
> should hence be implemented at that level if desired. In full-featured
> RDBMSs that's actually not very difficult. For example, you can build a
> function index on UPPER(accession), which gives you indexed access to
> case-insensitive accessions w/o changing the model itself. As Mark
> mentioned, Hibernate can be taught to use functions like these too.

Fair enough. This places the onus squarely on the application/datasource
to be careful about case. It is also consistent with the transparent
behaviour exhibited by BioJava and BioPerl. The worrying aspect is that,
as far as I can tell, protocols like DAS do not play nicely with this:
case is usually ignored for ID lookup. I guess this means that the
middleware doing the transformation will always have to (optionally) use
case-insensitive BioSQL language bindings, and the datasource deployer
will simply have to configure it accordingly for their BioSQL database.
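
For what it's worth, here is a rough sketch of the function-index idea
Hilmar describes, using SQLite through Python purely for illustration.
The table bioentry_demo, the index name and the helper functions are all
made up for this example (it is not the real BioSQL schema), and it
assumes an SQLite recent enough to support indexes on expressions
(3.9.0 or later). A real deployment would create the equivalent index in
whatever RDBMS hosts BioSQL.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a cut-down, hypothetical table."""
    conn = sqlite3.connect(":memory:")
    # TEXT comparisons in SQLite are case-sensitive by default, so the
    # UNIQUE constraint treats 'P12345' and 'p12345' as distinct values,
    # mirroring the case-sensitive relational model.
    conn.execute(
        "CREATE TABLE bioentry_demo ("
        " bioentry_id INTEGER PRIMARY KEY,"
        " accession TEXT NOT NULL UNIQUE)"
    )
    # Function (expression) index: lets lookups on UPPER(accession) be
    # indexed without changing the stored values or the model.
    conn.execute(
        "CREATE INDEX bioentry_demo_acc_upper"
        " ON bioentry_demo (UPPER(accession))"
    )
    conn.executemany(
        "INSERT INTO bioentry_demo (accession) VALUES (?)",
        [("P12345",), ("p12345",), ("Q9XYZ1",)],
    )
    return conn


def find_exact(conn, accession):
    """Case-sensitive lookup, as the model defines it."""
    rows = conn.execute(
        "SELECT accession FROM bioentry_demo"
        " WHERE accession = ? ORDER BY bioentry_id",
        (accession,),
    )
    return [r[0] for r in rows]


def find_ignoring_case(conn, accession):
    """Application-level case-insensitive lookup via UPPER()."""
    rows = conn.execute(
        "SELECT accession FROM bioentry_demo"
        " WHERE UPPER(accession) = UPPER(?) ORDER BY bioentry_id",
        (accession,),
    )
    return [r[0] for r in rows]
```

A DAS-style front end could then call find_ignoring_case() while other
clients keep using the exact lookup. Note that a case-insensitive query
can return more than one entry if the database holds accessions that
differ only by case, so the middleware would need a policy for that.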

cheers!
Jim


