[EMBOSS] Error with dbiflat and Refseq

pmr at ebi.ac.uk
Wed Jun 14 22:05:17 UTC 2006


Hi Aengus,

> dbiflat is dying on me with the refseq nucleotides
>
> I am using
>
> dbiflat -dbname RefSeqN -idformat refseq -directory . -filenames "*.gbff"
> -release 17.0 -date 14/06/06 -fields "acnum,seqvn,des,keyword,taxon"
>
> and I get
>
> Warning: Duplicate ID skipped: 'XM_757618' All hits will point to first ID found
> Warning: Duplicate ID skipped: 'XM_757619' All hits will point to first ID found
> Warning: Duplicate ID skipped: 'XM_757620' All hits will point to first ID found
>
>    EMBOSS An error in embdbi.c at line 1238:
> Error in embDbiSortWriteFields, expected entry NM_001004399 not found


Hmmm ... which RefSeq files are you using? (I'm trying now with the
ftp://ftp.ncbi.nih.gov/refseq/release/complete/*.gbff.gz files.)

Could you have a duplicate entry from an old file?
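
One way to check for that is to scan the LOCUS lines of every file and
list any ID that turns up more than once. A minimal Python sketch, assuming
the entry ID is the second field of each LOCUS line and that the
uncompressed *.gbff files sit in the current directory (find_duplicate_ids
and read_files are hypothetical helpers, not part of EMBOSS):

```python
import glob
from collections import defaultdict


def find_duplicate_ids(sources):
    """Return {entry_id: [labels]} for every ID seen more than once.

    sources is an iterable of (label, lines) pairs, e.g. a file name
    together with an open file handle.
    """
    seen = defaultdict(list)
    for label, lines in sources:
        for line in lines:
            # Assumption: each entry starts with "LOCUS <id> ...".
            if line.startswith("LOCUS"):
                fields = line.split()
                if len(fields) > 1:
                    seen[fields[1]].append(label)
    return {name: labels for name, labels in seen.items() if len(labels) > 1}


def read_files(paths):
    """Yield (path, open handle) pairs, closing each file after use."""
    for path in paths:
        with open(path, encoding="ascii", errors="replace") as handle:
            yield path, handle


if __name__ == "__main__":
    duplicates = find_duplicate_ids(read_files(sorted(glob.glob("*.gbff"))))
    for name, files in sorted(duplicates.items()):
        print(name, ", ".join(files))
```

If the same ID is reported from two different files, one of them may be
left over from an older release.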

Could your sort space have filled up? Running with -noclean will leave the
temporary files around and make it easier to check for truncation, though
simply rerunning would probably give a different result if that is the
problem. (Note to self: must find a way to test the error messages without
really filling up a disk.)
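
A rough way to see whether the disk is close to full before rerunning is to
compare the size of the input files with the free space. This is only a
sketch: it assumes the temporary sort files are written to the directory
dbiflat is run in, which may not hold for your setup, and it does not
estimate how much sort space dbiflat actually needs. space_report is a
hypothetical helper name.

```python
import glob
import os
import shutil


def space_report(directory, paths):
    """Return (total bytes of the input files, free bytes on directory's filesystem)."""
    input_bytes = sum(os.path.getsize(p) for p in paths)
    free_bytes = shutil.disk_usage(directory).free
    return input_bytes, free_bytes


if __name__ == "__main__":
    # Assumption: the flat files and the sort space are both in the current directory.
    inputs = glob.glob("*.gbff")
    used, free = space_report(".", inputs)
    print(f"input files: {used / 1e9:.1f} GB, free space: {free / 1e9:.1f} GB")
```

If the free space is small compared with the input, a full sort area
becomes a more likely explanation.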

regards,

Peter



