[DAS] DAS and bacterial genomes
andy.jenkinson at ebi.ac.uk
Tue Aug 17 17:07:48 UTC 2010
There are no coordinate systems yet, as nobody has been brave enough to start using DAS with bacteria in anger. Eugene at Ensembl Genomes will have an interest in doing this, but they have issues matching their species/strain names to the NCBI taxonomy on which DAS's coordinate systems are based. In essence, you will need to name the coordinate systems, after which they will need to be added to the registry.
For example, when Ensembl Genomes manage to do this, the coordinate systems might end up looking like:
EB_1,Chromosome,Shigella flexneri 2a str. 301
EB_1,Plasmid,Shigella flexneri 2a str. 301
This is for a specific Shigella strain with taxonomy ID 198214. The authority and version parts of the DAS coordinate system are somewhat arbitrarily named; ideally they would follow a standard used by the rest of the community, for interoperability purposes.
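To make the parts of those example names explicit, here is a rough Python sketch that splits one into its fields. It assumes the layout "authority_version,type,organism" as read off the two examples above; the class, field and function names are made up for illustration and are not part of any DAS library or of the registry itself.

```python
from typing import NamedTuple


class CoordinateSystem(NamedTuple):
    """Hypothetical container for the parts of a coordinate system name."""
    authority: str
    version: str
    source: str      # e.g. "Chromosome" or "Plasmid"
    organism: str


def parse_coordinate_system(text: str) -> CoordinateSystem:
    # Assumed layout: "<authority>_<version>,<type>,<organism>".
    # Split on the first two commas only, in case the organism contains one.
    name, source, organism = (part.strip() for part in text.split(",", 2))
    # The version is taken to be whatever follows the last underscore.
    authority, _, version = name.rpartition("_")
    if not authority:
        raise ValueError(f"no authority/version separator in {name!r}")
    return CoordinateSystem(authority, version, source, organism)


cs = parse_coordinate_system("EB_1,Chromosome,Shigella flexneri 2a str. 301")
print(cs.authority, cs.version, cs.source, cs.organism, sep=" | ")
```

Two sources can only be overlaid if every one of these fields matches, which is why the choice of authority and version matters even though it is arbitrary.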
What exactly is it you'd like to be able to do? How many species are we talking about?
The reason I ask is that getting these coordinate systems into the DAS registry does require some work. Some of this is on the registry's side, but depending on where your data come from there may be issues with identifying the correct coordinate system details such that others can reuse them meaningfully. To use the example above, Ensembl Genomes give the "301" strain a different name from NCBI and use the taxonomy ID not for the strain but for the parent species (Shigella flexneri). In fact the 2457T strain also uses the same taxonomy ID, which isn't helpful. Given the number of species, this adds up to a major headache.
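If you want to check your own strain list for this kind of clash before proposing coordinate systems, something along these lines would do it. This is only a sketch: the function name is invented, and the taxonomy IDs in the example are placeholders rather than real NCBI values.

```python
from collections import defaultdict


def find_shared_taxonomy_ids(strains):
    """Return {taxonomy_id: [strain names]} for IDs used by more than one strain.

    `strains` is an iterable of (strain_name, taxonomy_id) pairs.
    """
    by_taxid = defaultdict(list)
    for name, taxid in strains:
        by_taxid[taxid].append(name)
    return {taxid: names for taxid, names in by_taxid.items() if len(names) > 1}


# Placeholder IDs, not real NCBI taxonomy IDs.
strains = [
    ("301", "SPECIES_LEVEL_ID"),
    ("2457T", "SPECIES_LEVEL_ID"),
    ("some other strain", "OTHER_ID"),
]
for taxid, names in find_shared_taxonomy_ids(strains).items():
    print(f"{taxid} is shared by: {', '.join(names)}")
```

Any ID reported here cannot distinguish the strains that share it, so those strains would need resolving to strain-level IDs first.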
On 17 Aug 2010, at 16:49, Adam Witney wrote:
> What would be the best approach to use DAS with bacterial genomes? I can't seem to find any coordinate systems for these organisms in the Registry.
> Thanks for any advice