[DAS] encoding URIs

Andy Jenkinson andy.jenkinson at ebi.ac.uk
Thu Dec 17 18:11:37 UTC 2009


Hi list,

In the upcoming 1.6 spec I'd like to clarify the issue of content escaping in DAS responses (and requests), and since this is quite subtle but can potentially affect all servers and clients I thought I'd give a bit of a heads up. Sorry, it's a bit boring...

For some background: DAS sources have historically not been as careful with content encoding as they might be, but this largely only manifests as display issues in clients. These days ProServer (which represents the bulk of sources in the wild) does a much better job by encoding most things anyway. I had planned to simply clarify that all content must be XML compliant (angle brackets, quotes etc. must be escaped). However, XML and URIs have different escaping requirements, because some characters, such as the space and the pipe, are fine in XML but not in URIs. I am now beginning to see these characters in URIs, and this affects clients in a functional way. For example, Ensembl's 'add DAS source' wizard is sensitive to it, and will not match source URIs with filters unless spaces and pipes are escaped in the sources document (and escaped in the same way, in fact).
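
To make the difference concrete, here is a minimal Python sketch using only the standard library. The helper names (xml_escape, uri_escape) and the example.org URI are made up for illustration and are not part of the spec or of any DAS software; the set of characters left unescaped in uri_escape is my own assumption.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape


def xml_escape(text):
    # Escape &, < and >, plus both quote characters, so the result is
    # usable in element content and in attribute values.
    return escape(text, {'"': "&quot;", "'": "&apos;"})


def uri_escape(uri):
    # Percent-encode characters that are not allowed in a URI (space, pipe,
    # angle brackets, ...). URI delimiters are left alone, and "%" is kept
    # so that escapes already present are not encoded a second time.
    # Assumption: the input is a whole URI, not a single path segment.
    return quote(uri, safe=":/?#[]@!$&'()*+,;=%")


uri = "http://example.org/das/my source|v1/features"  # hypothetical source URI

print(xml_escape(uri))            # unchanged: space and pipe are fine in XML
print(uri_escape(uri))            # http://example.org/das/my%20source%7Cv1/features
print(xml_escape('a < b & "c"'))  # a &lt; b &amp; &quot;c&quot;
```

The point is that the first call leaves the URI untouched, so content that is perfectly valid XML can still carry a URI that clients will trip over.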

I plan to include in the spec that all XML element contents and attributes must be XML-escaped, except for certain fields that can contain URIs, which must be URI-escaped. That is, the URI fields in the sources response, href attributes and probably ID fields too. XML escaping can be done via numeric OR named references (e.g. &#60; OR &lt;). The implications for clients are a little more subtle: when doing operations that compare two fields, it is necessary to decode them first in case they are encoded differently. If the client uses HTML as a display medium there are further complications, as some elements (e.g. links) require URI-escaped content but others do not. So in short, both decoding and encoding are content-specific. It's worth noting that both Ensembl and ProServer need modification before they satisfy these requirements, so I guess other software will too.
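
As a rough illustration of what "content-specific" means for a client, here is a Python sketch. The function names and the example.org URI are hypothetical, and html.unescape is used as a stand-in for XML decoding (it also accepts HTML-only entities, and in practice an XML parser will usually have done this step already). Decoding every percent-escape before comparing is a simplification: it would treat %2F and / as the same thing.

```python
from html import escape, unescape
from urllib.parse import quote, unquote


def same_text(a, b):
    # Plain XML fields: undo character references before comparing,
    # so "&lt;" and "&#60;" count as the same character.
    return unescape(a) == unescape(b)


def same_uri(a, b):
    # URI fields: additionally undo percent-escapes, so that
    # "my%20source" and "my source" compare equal.
    return unquote(unescape(a)) == unquote(unescape(b))


def html_link(uri, label):
    # A link needs a URI-escaped href, while the visible label only needs
    # markup escaping. "%" is kept as-is so existing escapes survive.
    href = quote(uri, safe=":/?#[]@!$&'()*+,;=%")
    return '<a href="{}">{}</a>'.format(escape(href, quote=True), escape(label))


raw = "http://example.org/das/my source|v1"          # hypothetical URI
enc = "http://example.org/das/my%20source%7Cv1"

print(same_text("a &lt; b", "a &#60; b"))  # True
print(raw == enc)                          # False: naive comparison fails
print(same_uri(raw, enc))                  # True
print(html_link(raw, "my source <v1>"))
```

The last call gives a link whose href is percent-encoded while the label keeps its space and only has the angle brackets escaped, which is the kind of per-field treatment I mean.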

Any comments or questions, just shout.
Cheers,
Andy





