[EMBOSS] How to index a bacterial genome?

pmr at ebi.ac.uk pmr at ebi.ac.uk
Thu Nov 23 23:58:22 UTC 2006

Hi Jerome,

> I'm trying to use the flat file of a bacterial genome.
> As far as I can see, this file (U00096_GR.dat) has all the information
> about CDSs, proteins and a lot more. For example, there is an annotation
> for "aspartate kinase activity", and I would like to index this
> information as the "Description" in my EMBOSS database.
> But when I do the indexing, the result is just one sequence: the whole
> genome nucleic acid sequence.
> I don't know whether dbiflat allows what I want to do, or whether I should
> use another type of flat file.

We had the same request from Rodrigo Lopez at EBI a few weeks ago.

We are thinking about the best way to do it. We have to make names for the
"entries", and allow retrieval of single features. For CDS features we may
also want to index them as a protein database.
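
To make the naming problem concrete, here is a minimal Python sketch. It is
not EMBOSS code and not a design decision. It reads EMBL-style feature table
(FT) lines, collects the CDS features, and builds a name for each one. The
function names (parse_cds, cds_entry_name), the naming scheme (accession plus
gene name, falling back to a counter) and the sample records are all
assumptions made up for illustration; they are not taken from U00096_GR.dat.

```python
def parse_cds(lines):
    """Collect CDS features from EMBL-style feature table (FT) lines.

    Returns a list of dicts holding 'key', 'location' and any qualifiers.
    Simplified: assumes no continuation line of a quoted value starts with '/'.
    """
    features = []
    feature = None      # feature currently being read (any key)
    qualifier = None    # qualifier whose value may continue on the next line
    for line in lines:
        if not line.startswith("FT"):
            continue
        if len(line) > 5 and not line[5].isspace():
            # A feature key starts in column 6: begin a new feature.
            fields = line[5:].split(None, 1)
            location = fields[1].strip() if len(fields) > 1 else ""
            feature = {"key": fields[0], "location": location}
            qualifier = None
            if feature["key"] == "CDS":
                features.append(feature)
            continue
        text = line[2:].strip()
        if feature is None or not text:
            continue
        if text.startswith("/"):
            # New qualifier, e.g. /gene="geneA"
            name, _, value = text[1:].partition("=")
            qualifier = name
            feature[name] = value.strip('"')
        elif qualifier is None:
            feature["location"] += text              # wrapped location
        else:
            feature[qualifier] += " " + text.strip('"')  # wrapped value
    return features


def cds_entry_name(accession, feature, index):
    """Hypothetical naming scheme: accession + gene name, else a counter."""
    label = feature.get("gene") or f"CDS{index:04d}"
    return f"{accession}_{label}"


# Made-up sample records, for illustration only.
SAMPLE = [
    'FT   source          1..5000',
    'FT                   /mol_type="genomic DNA"',
    'FT   CDS             190..255',
    'FT                   /gene="geneA"',
    'FT                   /note="aspartate kinase',
    'FT                   activity"',
    'FT   CDS             complement(300..450)',
    'FT                   /product="hypothetical protein"',
]

if __name__ == "__main__":
    for i, cds in enumerate(parse_cds(SAMPLE), start=1):
        name = cds_entry_name("U00096", cds, i)
        description = cds.get("note") or cds.get("product", "")
        print(name, cds["location"], description)
```

A counter fallback keeps names unique when a CDS has no gene qualifier, but
the numbers change if the annotation is updated, which is one reason the
naming question is open.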

Bacterial genomes are the simple case - we also have to consider eukaryotic
genomic contigs, where CDSs are spread across more than one entry.

We hope to have a prototype available in the next release.

How many other users are interested in indexing CDSs?

Any suggestions for how to name the CDS "entries"?


