[EMBOSS] seqret/entret problems using acc from ensembl-embl

David Guzman david.guzman at uniklinik-freiburg.de
Mon Nov 20 14:31:43 UTC 2006


Dear all,

I am experiencing a very strange problem using seqret and entret. I have
downloaded the database files from the Homo sapiens subsection of Ensembl
(latest version as of Nov 15th) and indexed them with dbiflat.
When I try to get one sequence with seqret, or the complete entry with
entret, using the AC code, I get the following error message:

claudia at pc-31-18-86-200:~> seqret embldnahs:chromosome:NCBI36:1:1000001:1970768:1
Reads and writes (returns) sequences
Error: Unable to read sequence 'embldnahs:chromosome:NCBI36:1:1000001:1970768:1'
Died: seqret terminated: Bad value for '-sequence' and no prompt

I think that the format Ensembl uses for IDs and ACCs is causing the
problem. For example, here is the start of the first entry in the flat
file:

claudia at pc-31-18-86-200:~> head /local/bioinfo/db/ensembl/embl/Homo_sapiens.0.dat
ID   1    standard; DNA; HTG; 970768 BP.
XX
AC   chromosome:NCBI36:1:1000001:1970768:1
XX
SV   chromosome:NCBI36:1:1000001:1970768:1
XX
DT   5-OCT-2006
XX
DE   Homo sapiens chromosome 1 NCBI36 partial sequence 1000001..1970768
DE   annotated by Ensembl

I tried replacing the ":" characters in the AC line with "_" using sed,
but after re-indexing I get the same error message with seqret and
entret. Is there any length limit for IDs or ACCs in EMBOSS? Is there
any workaround for this problem?
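
The exact sed command is not recorded here. A minimal sketch of such a
substitution, assuming it should touch only lines starting with "AC" and
leave everything else (including the SV line) alone, could look like
this. The scratch directory and the names sample.dat and sample.fixed.dat
are made up for illustration; the sample content is copied from the
entry shown above.

```shell
# Hypothetical scratch copy of the first lines of one entry
# (the real flat files live under /local/bioinfo/db/ensembl/embl).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.dat" <<'EOF'
ID   1    standard; DNA; HTG; 970768 BP.
XX
AC   chromosome:NCBI36:1:1000001:1970768:1
XX
SV   chromosome:NCBI36:1:1000001:1970768:1
EOF

# Replace every ':' with '_' on AC lines only; all other lines
# are written out unchanged.
sed '/^AC /s/:/_/g' "$tmpdir/sample.dat" > "$tmpdir/sample.fixed.dat"

# Show the AC and SV lines of the rewritten file for comparison.
grep -E '^(AC|SV) ' "$tmpdir/sample.fixed.dat"
```

With this pattern the ID and SV lines keep their original values, so the
index would have to be rebuilt from the rewritten files and the query
would have to use the new "_" form of the accession.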

Thanks

System:
EMBOSS 4.0.0
SUSE 9.3

showdb:
# Name         Type  ID  Qry All Comment
# ============ ==== ==  === === =======
embldnahs      N    OK  OK  OK  Ensembl EMBL DNA H.sapiens

emboss.default:

DB embldnahs [
type: N
dir: /local/bioinfo/db/ensembl/embl
method: emblcd
format: embl
file: *.dat
comment: "Ensembl EMBL DNA H.sapiens"]
