[EMBOSS] dbifasta possible problem

Aengus Stewart aengus.stewart at cancer.org.uk
Thu Feb 5 18:44:06 UTC 2004


I have been using Warren Gish's nrdb2 to create a non-redundant protein dataset.

It generates FASTA title lines like this:

>swiss:ACDD_METMA Q8PRQ5 Acetyl-CoA decarbonylase/synthase complex delta subunit (ACDS complex delta subunit) (Corrinoid/iron-sulfur component small subunit).trembl:Q8PRQ5 Q8PRQ5 CO dehydrogenase/acetyl-COA synthase delta subunit (EC 1.2.99.2).refseqp:NP_632712 NP_632712 Methanosarcina mazei Goe1 CO dehydrogenase/acetyl-COA synthase delta subunit [Methanosarcina mazei Goe1]. 0/0refseqp:NP_634109 NP_634109 Methanosarcina mazei Goe1 CO dehydrogenase/acetyl-COA synthase delta subunit [Methanosarcina mazei Goe1]. 0/0

It concatenates the titles from the different databases that have identical entries.

I have discovered that dbifasta will only process this file with -idformat simple.

I assumed I could use -idformat gcgidacc,

but this, and any -idformat other than simple, causes dbifasta to "finish".

I say "finish" because there are no errors or complaints; it just ends.

I didn't expect this. Is this the correct behaviour?
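
For reference, the start of each title line looks like the gcgidacc layout, which is why that format seemed the natural choice. The sketch below is only an illustration in Python, assuming gcgidacc means ">db:id acc description". It is not dbifasta's own parser, and split_title is a hypothetical helper name.

```python
def split_title(line):
    """Split a FASTA title of the assumed form '>db:id acc description'.

    Hypothetical helper for illustration only; not part of EMBOSS.
    """
    if not line.startswith(">"):
        raise ValueError("not a FASTA title line")
    # The first whitespace-delimited token is expected to be 'db:id'.
    first, _, rest = line[1:].partition(" ")
    db, sep, entry_id = first.partition(":")
    if not sep:
        raise ValueError("no 'db:' prefix in the first token")
    # The second token is taken as the accession; the remainder is description.
    acc, _, description = rest.partition(" ")
    return db, entry_id, acc, description


# Start of the title line quoted above (truncated here for brevity).
title = ">swiss:ACDD_METMA Q8PRQ5 Acetyl-CoA decarbonylase/synthase complex delta subunit"
db, entry_id, acc, description = split_title(title)
print(db, entry_id, acc)
```

Under that assumption the leading fields split cleanly into database, ID and accession, so the concatenated titles that follow the first description are the only unusual part of the line.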


Cheers
Aengus


-- 
----------------------------------------------------------------------------
Aengus Stewart
Group Leader
Computational Genome Analysis Laboratory  Tel: +44 (0)20 7269 3679
Cancer Research UK, Lincoln's Inn Fields, Holborn, London, WC2A 3PX, UK
----------------------------------------------------------------------------


