[EMBOSS] Cannot open division file

Kann Vearasilp kann.vearasilp at mu.edu
Thu Oct 25 19:07:37 UTC 2007


Hello everyone,

I have just finished indexing a GenBank database for my lab with the
dbiflat command. I set up an emboss.default file based on the provided
emboss.default.template. When I test the setup with "seqret", EMBOSS
apparently cannot find the division file.

I can see from the archive that the same kind of problem was reported
for the test database shipped with EMBOSS
(http://emboss.open-bio.org/pipermail/emboss/2005-November/002323.html).
However, I am pretty sure that I pointed the paths at my database
correctly. Here is my configuration.

The system is Mac OS 10.4

1. EMBOSS was installed with Fink at /sw/share/EMBOSS

2. All database files are installed as /lab/data/databases/genbank/*.seq

3. The index files are in /lab/data/indices/genbank/???. Here is one of
the index directories from my lab as an example:

xxx at yyy/lab/data/indices/genbank/mam:
acnum.hit     des.trg       keyword.hit   seqvn.hit     taxon.trg
acnum.trg     division.lkp  keyword.trg   seqvn.trg
des.hit       entrynam.idx  mam.dbiflat   taxon.hit

4. Here is an excerpt from my emboss.default file:

# Set location of acd files that describe each program
SET emboss_acdroot /sw/share/EMBOSS/acd


# Set location of Genbank flatfiles in protein
SET  emboss_database_dir /lab/data/databases

# Set location of Genbank flatfiles indices in protein
set emboss_index_dir /lab/data/indices

# Set a log file that user can append their records and EMBOSS
# automatically write log information
SET emboss_logfile /sw/share/EMBOSS/log/log

# Set Paper size of disc page and is required by the 'dbx' indexing
# program and 'method: "emblcd" emboss'
# Recommended value is 2048
SET PAGESIZE 2048

# Set Caches size required for 'dbx' indexing and 'method emboss'.
# It is a page size number to cache. Recommended value is 200
SET CACHESIZE 200

# Set parameter for flat file indices that we have created in
# /lab/data/indices/genbank
.
.
.
.
.
DB gbmam [
# required parameters
    method: "emblcd"
    format: "GB"
    type: "N"
    dir: "\$emboss_database_dir/genbank"
    file: "gbmam*.seq"
# optional parameters
    fields: "sv des key org"
    release: "161.0"
    comment: "Genbank database for mam sequences"
    indexdir: "\$emboss_index_dir/genbank/mam"
]

5. I ran this seqret command to test the system, but it fails with the
errors shown here:

xxx at yyy~:seqret gbmam:BC102801
Reads and writes (returns) sequences
Warning: Cannot open division file '<null>' for database 'gbmam'
Warning: seqCdQry failed
Error: Unable to read sequence 'gbmam:BC102801'
Died: seqret terminated: Bad value for '-sequence' and no prompt

6. I also ran seqret in debug mode, and this is the resulting log:

Debug file seqret.dbg buffered:No
ajAcdInitP pgm 'seqret' package ''
ajFileNewIn '/sw/share/EMBOSS/acd/seqret.acd'
EOF ajFileGetsL file /sw/share/EMBOSS/acd/seqret.acd
closing file '/sw/share/EMBOSS/acd/seqret.acd'
ajFileNewIn '/sw/share/EMBOSS/acd/codes.english'
EOF ajFileGetsL file /sw/share/EMBOSS/acd/codes.english
closing file '/sw/share/EMBOSS/acd/codes.english'
ajTableNewFunctionLen hint 25 size 251
ajTableNewFunctionLen hint 25 size 251
ajTableNewFunctionLen hint 25 size 251
ajFileNewIn '/sw/share/EMBOSS/acd/knowntypes.standard'
EOF ajFileGetsL file /sw/share/EMBOSS/acd/knowntypes.standard
closing file '/sw/share/EMBOSS/acd/knowntypes.standard'
Set acdprotein value '$(sequence.protein)'
ajSeqinClear called
++seqUsaProcess 'gbmam:BC102801' 0..0(N) '' 0
USA to test: 'gbmam:BC102801'

format regexp: No list:No
no format specified in USA

...input format not set
dbname dbexp: Yes
found dbname 'gbmam' level: '<null>' qry->QryString: 'BC102801'
seqQueryFieldC usa 'sv' fields 'sv des key org'
seqQueryField test 'sv'
seqQueryField match 'sv'
ajSeqQueryWild id 'BC102801' acc 'BC102801' sv 'BC102801' gi '' des  
'' org '' key ''
wild (has) query Sv 'BC102801'
database type: 'N' format 'GB'
use access method 'emblcd'
Matched seqAccess[2] 'emblcd'
seqAccessEmblcd type 2
directory '\$emboss_index_dir/genbank/mam' entry 'BC102801' acc  
'BC102801' hasacc:Yes
ajFileNewIn '\$emboss_index_dir/genbank/mam/division.lkp'
Database 'gbmam' : access method 'emblcd' failed
ajSeqinClear called
++seqUsaProcess 'gbmam:BC102801' 0..0(N) '' 0
USA to test: 'gbmam:BC102801'

format regexp: No list:No
no format specified in USA

...input format not set
dbname dbexp: Yes
found dbname 'gbmam' level: '<null>' qry->QryString: 'BC102801'
seqQueryFieldC usa 'sv' fields 'sv des key org'
seqQueryField test 'sv'
seqQueryField match 'sv'
ajSeqQueryWild id 'BC102801' acc 'BC102801' sv 'BC102801' gi '' des  
'' org '' key ''
wild (has) query Sv 'BC102801'
database type: 'N' format 'GB'
use access method 'emblcd'
Matched seqAccess[2] 'emblcd'
seqAccessEmblcd type 2
directory '\$emboss_index_dir/genbank/mam' entry 'BC102801' acc  
'BC102801' hasacc:Yes
ajFileNewIn '\$emboss_index_dir/genbank/mam/division.lkp'
Database 'gbmam' : access method 'emblcd' failed
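
To narrow down which file open goes wrong, the debug log can be scanned for the paths EMBOSS tries to open. The following is a minimal Python sketch under my own assumptions: it only looks at lines of the form ajFileNewIn '...' as they appear in the log above, and suspicious_opens is a hypothetical helper name. It flags paths that still contain a literal "$" (which would suggest a variable was not expanded) and paths that do not exist on disk.

```python
import os
import re

# Matches debug lines such as: ajFileNewIn '/sw/share/EMBOSS/acd/seqret.acd'
_OPEN_RE = re.compile(r"^ajFileNewIn '(.*)'\s*$")


def suspicious_opens(log_text):
    """Return (path, reason) pairs for file opens in the log that look wrong."""
    findings = []
    seen = set()
    for line in log_text.splitlines():
        match = _OPEN_RE.match(line.strip())
        if not match:
            continue
        path = match.group(1)
        if path in seen:
            continue  # report each path only once
        seen.add(path)
        if "$" in path:
            findings.append((path, "unexpanded variable"))
        elif not os.path.exists(path):
            findings.append((path, "missing on disk"))
    return findings
```

Calling suspicious_opens(open("seqret.dbg").read()) on the log above would list the division.lkp path, because it is logged with the variable name still in it instead of /lab/data/indices.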

It seems that EMBOSS cannot find the division file, and I still don't
know what the problem is. Do you have any recommendations?

Thank you so much in advance for any help!

Kann



