[EMBOSS] can't access databases indexed by dbifasta
Peter Rice
pmr at ebi.ac.uk
Mon Nov 8 13:39:25 UTC 2004
An old bug report that was fixed at the time (by using a different database
name) ... now we know the real reason.
The failed database name was ecoli.nt
Database names can only contain letters, numbers or underscores.
The '.' in the database name means that EMBOSS fails to read it as a database
name. It can read it as a file - this is why it appears to work if you are in
the same directory as the database - because there is a file called ecoli.nt
in that directory, and this is what EMBOSS is reading.
In the next release, database names with '.' will cause a warning message when
the emboss.default and .embossrc files are read.
Also in the next release, database names must be at least 2 characters long
(so we can use Windows filenaming conventions on Windows systems). I doubt
whether anyone is using single letter database names, but there may be some
"test databases" defined which will cause a warning message in the next release.
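
If you want to check your own definitions before upgrading, the two naming
rules above can be sketched in a few lines of Python. This is only an
illustration, not EMBOSS code: the function names check_db_name and
check_config are made up for this example, and the pattern for spotting a
definition simply assumes a line of the form "DB name [" as in the
emboss.default excerpt quoted below.

```python
import re

# Assumption: a database definition opens with a line like "DB <name> [",
# as in the emboss.default excerpt quoted in this thread.
DB_LINE = re.compile(r"^\s*DB\s+(\S+)\s*\[")


def check_db_name(name):
    """Return a list of problems with a database name (empty if it is fine)."""
    problems = []
    # Rule 1: only letters, numbers or underscores are allowed.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        problems.append(
            "contains characters other than letters, numbers or underscores"
        )
    # Rule 2 (next release): at least 2 characters long.
    if len(name) < 2:
        problems.append("shorter than 2 characters")
    return problems


def check_config(text):
    """Return (name, problems) for every DB definition with a bad name."""
    flagged = []
    for line in text.splitlines():
        match = DB_LINE.match(line)
        if match:
            problems = check_db_name(match.group(1))
            if problems:
                flagged.append((match.group(1), problems))
    return flagged
```

Calling check_config() on the text of an emboss.default or .embossrc file
would list a name such as ecoli.nt, while a name such as ecoli_nt would pass.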
Marcus Claesson wrote:
> Hello,
>
> I have a silly little problem indexing databases in Emboss-2.8.0. After
> running dbifasta and adding DB entries in emboss.default I can only
> access the database when being in the same directory as the fasta file.
> Here is what I did:
>
> [blast_db]$ uname -a
> Linux neo.ucc.ie 2.4.9-e.35enterprise #1 SMP Tue Dec 23 00:06:16 EST
> 2003 i686 unknown
>
> [blast_db]$ pwd
> /var/data/blast_db
>
> [blast_db]$ ll ecoli.nt
> -rw-r--r-- 1 marcus bioinfo 4763013 Jan 15 01:38 ecoli.nt
>
> [blast_db]$ dbifasta
> Index a fasta database
> simple : >ID
> idacc : >ID ACC
> gcgid : >db:ID
> gcgidacc : >db:ID ACC
> dbid : >db ID
> ncbi : | formats
> ID line format [idacc]:
> Database directory [.]: /var/data/blast_db
> Wildcard database filename [*.dat]: ecoli.nt
> Database name: ecoli.nt
> Release number [0.0]:
> Index date [00/00/00]:
>
> [blast_db]$ ll entrynam.idx division.lkp acnum.*
> -rw-rw-r-- 1 marcus bioinfo 300 Feb 17 10:39 acnum.hit
> -rw-rw-r-- 1 marcus bioinfo 300 Feb 17 10:39 acnum.trg
> -rw-rw-r-- 1 marcus bioinfo 330 Feb 17 10:39 division.lkp
> -rw-rw-r-- 1 marcus bioinfo 300 Feb 17 10:39 entrynam.idx
>
> Added these lines in /usr/local/EMBOSS-2.8.0/emboss/emboss.default:
>
> DB ecoli.nt [
> type: "N"
> format: "fasta"
> method: "emblcd"
> dir: "/var/data/blast_db/"
> ]
>
> [blast_db]$ showdb
> Displays information on the currently available databases
> # Name Type ID Qry All Comment
> # ==== ==== == === === =======
> ecoli.nt N OK OK OK -
>
> [blast_db]$ cd ~
>
> [marcus]$ seqret ecoli.nt
> Reads and writes (returns) sequences
> Error: failed to open filename 'ecoli.nt'
> Error: Unable to read sequence 'ecoli.nt'
> Died: seqret terminated: Bad value for '-sequence' and no prompt
>
> But it works when I'm the same directory as ecoli.nt:
>
> [blast_db]$ seqret ecoli.nt
> Reads and writes (returns) sequences
> Output sequence [ae000111.fasta]:
> etc...
>
> Clearly it must be possible to access ecoli.nt from other directories?
>
>
> Extremly grateful for any help on this!
>
> Regards,
> Marcus
>
>
>