[EMBOSS] Error with dbiflat and Refseq
pmr at ebi.ac.uk
Wed Jun 14 22:05:17 UTC 2006
Hi Aengus,
> dbiflat is dying on me with the refseq nucleotides
>
> I am using
>
> dbiflat -dbname RefSeqN -idformat refseq -directory . -filenames "*.gbff"
> -release 17.0 -date 14/06/06 -fields "acnum,seqvn,des,keyword,taxon"
>
> and I get
>
> Warning: Duplicate ID skipped: 'XM_757618' All hits will point to first ID
> found
> Warning: Duplicate ID skipped: 'XM_757619' All hits will point to first ID
> found
> Warning: Duplicate ID skipped: 'XM_757620' All hits will point to first ID
> found
>
> EMBOSS An error in embdbi.c at line 1238:
> Error in embDbiSortWriteFields, expected entry NM_001004399 not found
Hmmm ... which refseq files are you using? (I'm trying now with the
ftp://ftp.ncbi.nih.gov/refseq/release/complete/*.gbff.gz files.)
Could you have a duplicate entry from an old file?
Could your sort space have filled up? Running with -noclean will leave the
temporary files around and make it easier to check for truncation, though
simply rerunning would probably give a different result if that is the
problem. (Note to self: must find a way to test the error messages without
really filling up a disk.)
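To check both possibilities by hand, something like the following might help. It is only a rough sketch, not part of EMBOSS: it assumes the entry ID is the second field of each LOCUS line in the .gbff files, and that the sort temporary files are written to the directory you run dbiflat in. The function names are made up for illustration.

```python
import glob
import shutil
from collections import defaultdict


def find_duplicate_ids(pattern="*.gbff"):
    """Return {entry_id: [files]} for every LOCUS name seen more than once.

    Assumption: the ID that gets indexed is the second whitespace-separated
    field of the LOCUS line.
    """
    seen = defaultdict(list)
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            for line in handle:
                if line.startswith("LOCUS"):
                    fields = line.split()
                    if len(fields) > 1:
                        seen[fields[1]].append(path)
    return {entry_id: paths for entry_id, paths in seen.items() if len(paths) > 1}


def free_megabytes(directory="."):
    """Free space, in whole megabytes, on the filesystem holding `directory`."""
    return shutil.disk_usage(directory).free // (1024 * 1024)


if __name__ == "__main__":
    for entry_id, paths in sorted(find_duplicate_ids().items()):
        print(entry_id, "appears in:", ", ".join(paths))
    print("Free space in working directory (MB):", free_megabytes("."))
```

If the same ID shows up in two different files, an old file left in the directory is the likely culprit. If free space is low, the full-disk theory is worth testing with -noclean.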
regards,
Peter