[EMBOSS] Indexing the ID field of EMBL-formatted databases.

pmr at ebi.ac.uk
Fri Jun 1 07:13:47 UTC 2007


Dear Charles,

> I tried to index RepBase today, and I ran into another problem: not only
> the IDs are lower case, but also it does not provide accession numbers.
> I used the following for indexing:
>
> dbxflat -dbname repbase\
> 	-dbresource embl\
> 	-idformat EMBL\
> 	-filenames '*ref'\
> 	-directory .\
> 	-fields id,org,key,des\
> 	-release 12.04
>
> DB repbase [
>   type: N
>   format: embl
>   method: emboss
>   directory: /home/charles/databases/RepBase12.04.embl
>   file: *.ref
>   fields: "id key des org"
>   comment: "Repeats"
> ]
>
> However, seqret still complains that the AC field is not indexed:
>
> gslc12『RepBase12.04.embl』$ seqret repbase:RLTR19_MM
> Reads and writes (returns) sequences
>
>    EMBOSS An error in ajindex.c at line 3027:
> Cannot open param file
> /home/charles/databases/RepBase12.04.embl/repbase.pxac

We have part of the solution. Database definitions can have the extra
attribute

hasaccession: "N"

but at present this is only used when accessing SRS servers (the original
problem was with sequences from the PDB database, which has no accessions).
The fields attribute defines additional fields, but cannot "turn off" the
accession index.
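
For illustration only: assuming the emboss access method is changed to
honour the attribute, your repbase definition would then look something
like the sketch below. With the current code it will make no difference,
since only the SRS access code checks hasaccession.

```
# Sketch: hasaccession is currently ignored by "method: emboss";
# this shows how the definition would read once that is fixed.
DB repbase [
  type: N
  format: embl
  method: emboss
  directory: /home/charles/databases/RepBase12.04.embl
  file: *.ref
  fields: "id key des org"
  hasaccession: "N"
  comment: "Repeats"
]
```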

We need to test for this attribute in the other access methods in ajseqdb.c.

I will do that today. Thanks for the bug (feature) report.

regards,

Peter



