[Biojava-l] A proposal: The Extensible Feature Format (XFF)
Thomas Down
td2@sanger.ac.uk
Thu, 8 Mar 2001 12:42:15 +0000
Some time ago, I developed a prototype schema for a new
XML-based feature table format. This grew out of an earlier
effort to write an XML Schema grammar for the GAME elements,
but turned into something significantly different.
The philosophy of XFF is to define a format for a `minimalistic'
feature table -- in some ways, what GFF is to other flatfile
feature formats. But since this is XML, we can then use this
as a skeleton to hang other, application-specific, information
off. However much detail gets added, it should still be possible
for a naive XFF-processing application to extract at least the
basic information (location, strand, type, etc.). This means
it can be used in a very wide range of applications, from
`drawing coloured boxes' type data presentation up to very
semantically rich communication and storage.
It also differs from a lot of other formats in being fully
hierarchical. So, for example, you can group some exon
features together within a gene feature. This model comes
unashamedly from BioJava, where we've had some good experiences
with hierarchical features.
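
To make the `naive application' idea concrete, here is a rough
sketch of a reader that pulls out only the basic information from
a nested feature table and skips anything it does not recognise.
Everything in it is an assumption made for illustration: the
element and attribute names (featureSet, feature, type, location,
start, stop, strand, curation) and the NaiveFeatureReader class
are hypothetical. They are not taken from the XFF proposal and
this is not the BioJava code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class NaiveFeatureReader {
    // Hypothetical document: element and attribute names are invented
    // for illustration and are NOT taken from the XFF proposal.
    static final String SAMPLE =
        "<featureSet>"
      + "<feature id='gene1' strand='+'>"
      + "<type>gene</type>"
      + "<location start='100' stop='900'/>"
      + "<curation><note>application-specific extra</note></curation>"
      + "<feature id='exon1' strand='+'>"
      + "<type>exon</type><location start='100' stop='250'/>"
      + "</feature>"
      + "<feature id='exon2' strand='+'>"
      + "<type>exon</type><location start='400' stop='900'/>"
      + "</feature>"
      + "</feature>"
      + "</featureSet>";

    // Returns one line per feature, indented by nesting depth.
    static List<String> summarize(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<String> out = new ArrayList<>();
        walk(doc.getDocumentElement(), 0, out);
        return out;
    }

    // Visits the direct <feature> children of 'parent', then recurses
    // into each one so that nested features are reported as well.
    static void walk(Element parent, int depth, List<String> out) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)) continue;
            Element feature = (Element) n;
            if (!feature.getTagName().equals("feature")) continue;

            String type = "?", start = "?", stop = "?";
            for (Node c = feature.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (!(c instanceof Element)) continue;
                Element child = (Element) c;
                if (child.getTagName().equals("type")) {
                    type = child.getTextContent().trim();
                } else if (child.getTagName().equals("location")) {
                    start = child.getAttribute("start");
                    stop = child.getAttribute("stop");
                }
                // Any other child (e.g. <curation>) is silently ignored here.
            }

            StringBuilder line = new StringBuilder();
            for (int i = 0; i < depth; i++) line.append("  ");
            line.append(type).append(' ').append(start).append("..").append(stop)
                .append(' ').append(feature.getAttribute("strand"));
            out.add(line.toString());

            walk(feature, depth + 1, out);
        }
    }

    public static void main(String[] args) throws Exception {
        for (String line : summarize(SAMPLE)) System.out.println(line);
    }
}
```

For the sample above this reports the gene first and then its two
exons, each indented one level. The application-specific curation
element is never looked at, which is the point: a reader like this
keeps working however much extra detail gets hung off a feature.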
Anyway, I've been finding the format very useful for some
internal projects, so I've decided to write it up to see
if there's any public interest. You can browse a proposal
at:
http://www.biojava.org/thomasd/XFF/
There are still a few open issues, especially relating
to feature ids and idrefs, but this should give a good
general idea of what the format is about. I've got some
BioJava code for reading and writing XFF -- this is currently
a bit tangled up with other things, but I'll try to get it
out in the open in the next few days.
Is anyone interested in this format? If so, I'd like to get
some discussion going with regard to getting a more formal
standard defined. In that case, it would be good to move
everything over to bioxml.org, so long as people there are
interested.
Let me know what you think,
Thomas.