[DAS2] xml:base -- fer it or agin' it?

Andrew Dalke dalke at dalkescientific.com
Tue Aug 15 15:16:35 UTC 2006


I see three reasonable options (or rather, logically defensible)
related to xml:base in DAS2 documents.

1) don't use it at all
2) only have it in the root element of the document
3) have it anywhere in the document

(this is the old programming dictum of "the only limits should
be 0, 1 and infinity")

Pros and cons:

#1 is the least confusing.  Given a relative URL, use the document's
URL to make it absolute, etc., as per the URI spec.
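
A minimal Python sketch of option 1, using the standard library's
urljoin.  The document URL here is made up for illustration; it is
not a real DAS2 server.

```python
from urllib.parse import urljoin

# Hypothetical URL the document was fetched from (an assumption).
DOCUMENT_URL = "http://example.org/das2/sources.xml"

def absolutize(relative_url, document_url=DOCUMENT_URL):
    """Resolve a URL found in the document against the document's own URL."""
    return urljoin(document_url, relative_url)
```

With that base, "worm/160" and "/das2/worm/160" both resolve to
http://example.org/das2/worm/160.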

#2 This is similar to the restrictions in the BASE element in the
HTML header.  (Which I've only used once.)  It's used most often
in saved documents so relative URLs work without needing to rewrite
the rest of the document.  Take your DOM, and stick the URL in the
root node if "xml:base" is not present, otherwise do

   root.attrib["xml:base"] = urljoin(document_url, root.attrib["xml:base"])
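
Spelled out with Python's xml.etree.ElementTree, that step might look
like the sketch below.  ElementTree stores xml:base under its
namespace-qualified name, hence the XML_BASE constant.  The function
name pin_root_base is my own invention, not part of any DAS2 library.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree's key for the xml:base attribute.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def pin_root_base(root, document_url):
    """Make the root's xml:base absolute so a saved copy still resolves."""
    existing = root.get(XML_BASE)
    if existing is None:
        # No xml:base present: stick the document URL in the root node.
        root.set(XML_BASE, document_url)
    else:
        # xml:base present, possibly relative: join it with the document URL.
        root.set(XML_BASE, urljoin(document_url, existing))
    return root
```

For example, parse the document with ET.fromstring, call
pin_root_base(root, url_it_came_from), and write the result to disk.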

#3 This is the most complicated.  The main use case mentioned
was support for xinclude, which is not something anyone here has
said they need.  For all I know it may be useful in XSLT and other
languages.  I don't know the XML toolchain well enough.

Here is another use case.  Consider a registration / aggregation
service.  It could work by fully parsing everything from each client
and making absolute URIs for everything.  Or it could do

<SOURCES>
   <SOURCE xml:base="http://cshl.org/" uri="/das2/worm/160">
    ...
   </SOURCE>
   <SOURCE xml:base="http://ebi.ac.uk" uri="/das2/human/35">
    ...
   </SOURCE>
</SOURCES>

That is, it reads the sources document and pulls the SOURCE
elements out of the XML.  It sticks in the right xml:base (perhaps
with a set of joins from the parent elements in the document)
and serves the result.  No need to parse further.
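
A rough sketch of such an aggregator in Python, under some
assumptions: element names are unqualified as in the example above
(a real DAS2 document may use a namespace), SOURCE elements are direct
children of the root, and the function names are made up for
illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def extract_sources(xml_text, document_url):
    """Pull SOURCE elements out of one sources document, stamping each
    with the absolute base it would have had in the original."""
    root = ET.fromstring(xml_text)
    # Join the root's xml:base (if any) onto the document URL.
    root_base = urljoin(document_url, root.get(XML_BASE, ""))
    sources = []
    for source in root.findall("SOURCE"):
        # Join the element's own xml:base (if any) onto the parent's.
        source.set(XML_BASE, urljoin(root_base, source.get(XML_BASE, "")))
        sources.append(source)
    return sources

def aggregate(documents):
    """documents: iterable of (document_url, xml_text) pairs."""
    merged = ET.Element("SOURCES")
    for document_url, xml_text in documents:
        merged.extend(extract_sources(xml_text, document_url))
    return merged
```

Nothing below the SOURCE element is inspected, so relative URLs in
fields the aggregator has never heard of keep working.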

Here's another.  Consider a meta-feature server which sucked
in primary records from multiple other servers (with permission).
It might provide better search capabilities, better ranking,
whatever.  The features are unchanged.  The server wants to
return the results as it got them from the original server.

Without xml:base it needs to convert all relative URLs into
absolute ones

<FEATURES>
   <FEATURE uri="absolute_url_to_server_A/feature001">
     ...
   </FEATURE>
   <FEATURE uri="absolute_url_to_server_B/F_fpklkwef">
     ...
   </FEATURE>
   <FEATURE uri="absolute_url_to_server_A/feature942">
     ...
   </FEATURE>
   ...
</FEATURES>

which requires that the server know about every field that is a
URL.  This precludes support for any extensions which
include URL fields because the meta-server won't know
about them.  OTOH, with xml:base

<FEATURES>
   <FEATURE xml:base="absolute_url_to_server_A" uri="/feature001">
     ...
   </FEATURE>
   <FEATURE xml:base="absolute_url_to_server_B" uri="/F_fpklkwef">
     ...
   </FEATURE>
   <FEATURE xml:base="absolute_url_to_server_A" uri="/feature942">
     ...
   </FEATURE>
   ...
</FEATURES>

and any embedded extensions work w/o problems.
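
On the client side, option 3 means carrying the base down the tree
while walking it.  A sketch with ElementTree, assuming the URL lives
in an attribute named "uri" as in the examples above; the function
name and the hosts in the usage note are invented.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def resolve_uris(element, base, attr="uri"):
    """Yield the absolute form of every `attr` value in the tree,
    honoring xml:base on the element itself and on its ancestors."""
    # An element's own xml:base is itself resolved against the inherited base.
    base = urljoin(base, element.get(XML_BASE, ""))
    if attr in element.attrib:
        yield urljoin(base, element.get(attr))
    for child in element:
        yield from resolve_uris(child, base, attr)
```

Called as resolve_uris(root, document_url), a FEATURE with
xml:base="http://a.example/das2/" and uri="feature001" comes out as
http://a.example/das2/feature001, while a FEATURE with no xml:base
falls back to the document URL.  Note that a uri starting with "/"
replaces the whole path of its base, per the usual URI resolution
rules.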

Hence I'm fer numb'r 3.

					Andrew
					dalke at dalkescientific.com



