[EMBOSS] Databases

Peter Rice pmr at ebi.ac.uk
Mon Mar 10 16:22:02 UTC 2008


Dear Nick,

Staffa, Nick (NIH/NIEHS) wrote:
> We are making a gradual transition from GCG to EMBOSS.
> Please point me to some explicit instructions for making GCG databases
> accessible to EMBOSS/Jemboss

EMBOSS has two database indexing applications for GCG databases: dbxgcg
and the older dbigcg. There is a fix for release 5.0.0 for changes to
the syntax of GCG database ID lines.

This assumes you are still planning to maintain GCG databases - though I
understand UniProt broke GCG recently by having records longer than 255 
characters.

EMBOSS can also index the native flatfiles with dbxflat, and FASTA files 
with dbxfasta.

For databases indexed with dbxgcg, the database definition looks like:

DB gcgnuc [
    type: N
    method: embossgcg
    format: embl
    directory: "/homes/pmr/data/gcg/gcg_data"
    indexdir: "/homes/pmr/gcg/index"
]

The "format" value refers to the text part of each entry (the REF file).

For databases indexed with the older dbigcg, the database definition is
similar: change the method from "embossgcg" to "gcg".
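
As a rough sketch, assuming the same data and index directories as in the
dbxgcg example (the paths are only placeholders, and the index directory
would need to hold the dbigcg index files), a dbigcg definition might look
like:

DB gcgnuc [
    type: N
    method: gcg
    format: embl
    directory: "/homes/pmr/data/gcg/gcg_data"
    indexdir: "/homes/pmr/gcg/index"
]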

Let us know if you have any more questions.

Peter


