[EMBOSS] indexing of nr file with dbxfasta is missing some data

Allen Smith easmith at beatrice.rutgers.edu
Thu Oct 13 16:32:56 UTC 2005


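I've used dbxfasta to index the NCBI nr database file (timestamp ~Sep  7
10:33) with the idformat set to ncbi. (As reported earlier, the problem with
a filename of "nr" was fixed by renaming it to "nr.fasta", with a thank-you
to Alan.) The index doesn't seem to pick up anything beyond the first ID of
each entry: none of the additional same-sequence IDs, which are added to the
header with control-A's separating them, can be retrieved. In the examples
below, ZP_00652394 appears only as a secondary ID in the header of
ZP_00682210; the lookup for it fails, while the lookup for the primary ID
succeeds.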

	      -Allen

seqret 'nr:ZP_00652394*'
Reads and writes (returns) sequences
Error: Unable to read sequence 'nr:ZP_00652394*'
Died: seqret terminated: Bad value for '-sequence' and no prompt

seqret 'nr:ZP_00682210*'
Reads and writes (returns) sequences
Output sequence [zp_00682210.fasta]: stdout
>ZP_00682210.1 ZP_00682210.1 Zinc-containing alcohol dehydrogenase superfamily [Xylella fastidiosa Ann-1]gi|71730149|gb|EAO32237.1| Zinc-containing alcohol dehydrogenase superfamily [Xylella fastidiosa Ann-1]gi|71276114|ref|ZP_00652394.1| Zinc-containing alcohol dehydrogenase superfamily [Xylella fastidiosa Dixon]gi|71163032|gb|EAO12754.1| Zinc-containing alcohol dehydrogenase superfamily [Xylella fastidiosa Dixon]
MFINAYGAHAGDKPLESMQIARRAPGVHDVQIDIHYCGVCHSDIHLVRSEWAGTLFPCVP
GHEIVGRVSAIGTHVQGFKAGDLVAVGCMVDSCKDCQECDAGLENYCDGMIGTYNFPTQD
APGHTLGGYSQKIVVHERFVLRIRHPEAQLAAVAPLLCAGITTYSPLRHWNAGPGKKVGI
VGIGGLGHMGIKLAHAMGAYVVAFTTSESKRQDAKALGADEVVVSRDEERMAAHVKSFDL
ILNTVAASHSLDPFLTLLKRDGTLTLVGAPATPHPSPEVFNLIFKRRSIAGSLIGGIAET
QQMLDFCAKHGIVADIELIRADGINEAYERMMKGDVKYRFVIDNATLAA
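
To illustrate which IDs I would expect to be indexed, here is a minimal
Python sketch (not part of EMBOSS, and not how dbxfasta works internally)
that splits an nr-style header on control-A and lists the ID of every entry.
The function name, the regular expression, and the shortened example header
are my own assumptions; the header is built from three of the IDs shown in
the output above, with the descriptions abbreviated.

```python
import re

CTRL_A = "\x01"  # nr joins the entries for identical sequences with control-A

def defline_ids(defline):
    """Return (gi, db, accession) for each control-A separated entry.

    Simplified: only handles the 'gi|number|db|accession|' form and
    silently skips anything else.
    """
    ids = []
    for entry in defline.lstrip(">").split(CTRL_A):
        match = re.match(r"gi\|(\d+)\|(\w+)\|([^|\s]+)\|", entry)
        if match:
            ids.append(match.groups())
    return ids

# Illustrative header only; descriptions are shortened.
example = CTRL_A.join([
    ">gi|71730149|gb|EAO32237.1| Zinc-containing alcohol dehydrogenase superfamily",
    "gi|71276114|ref|ZP_00652394.1| Zinc-containing alcohol dehydrogenase superfamily",
    "gi|71163032|gb|EAO12754.1| Zinc-containing alcohol dehydrogenase superfamily",
])

for gi, db, acc in defline_ids(example):
    print(gi, db, acc)
```

Every accession listed this way is one I would expect to be able to query,
not just the first one in the header.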



-- 
Allen Smith                       http://cesario.rutgers.edu/easmith/
February 1, 2003                               Space Shuttle Columbia
Ad Astra Per Aspera                     To The Stars Through Asperity
safety." - Benjamin Franklin


