[EMBOSS] problems installing/using TrEMBL

Simon Andrews simon.andrews at bbsrc.ac.uk
Wed Oct 3 07:37:53 UTC 2007


On 2 Oct 2007, at 18:54, Fernan Aguero wrote:

> Hi,
>
> I've installed TrEMBL in EMBOSS and it seems like I'm having some
> problems ...
>
> I've run dbiflat as follows:
[snip]
>
> Now, when using seqret, it seems like I'm not getting the
> records I expect, for example if I search for the first ID
> in the example above (A0B532), I get A0BDZ0 instead:

I suspect your problem is that your trembl file is larger than 2 GB.
Above this size dbiflat won't work properly and will return the wrong
records, such as the one you've shown.  This won't be a problem with
uniprot_sprot.dat, as that file is still only about 1.1 GB.
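
A quick way to confirm whether a given flat file is over the limit is
to compare its size against the threshold.  This is only a sketch: I'm
assuming the cut-off is the largest signed 32-bit offset (2**31 - 1
bytes), which is a guess at what "2 GB" means here, and the function
name is made up for illustration.

```python
import os

# Assumption: the limit is the largest offset that fits in a signed
# 32-bit integer. The exact threshold is not confirmed anywhere above.
SIGNED_32BIT_MAX = 2**31 - 1


def exceeds_offset_limit(path, limit=SIGNED_32BIT_MAX):
    """Return True if the file at `path` is larger than `limit` bytes."""
    return os.path.getsize(path) > limit
```

Calling exceeds_offset_limit() on the trembl flat file and on
uniprot_sprot.dat shows which of them needs different handling.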

Your choices are therefore:

1) You could split your trembl file into multiple files, each smaller
than 2 GB.  This ends up being a complete pain, and you probably don't
want to do it this way.

2) Use the newer dbx* family of indexing programs, which can cope with
larger file sizes.  In your case you'd use dbxflat instead of
dbiflat.  There are some configuration differences between the two, so
you should read 'tfm dbxflat' first, but they work pretty much the
same as the old versions.  We use the dbx programs for all of our
databases and they work fine.
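
If you do end up going with option 1, the split has to fall on record
boundaries, otherwise an entry gets cut in half.  Here is a rough
sketch of how that could be done, relying on the fact that each entry
in a UniProt flat file ends with a line containing only "//".  The
function name, the output naming scheme and the default size limit are
my own choices for illustration, not anything EMBOSS provides.

```python
def split_flatfile(path, out_prefix, max_bytes=2**31 - 1):
    """Split a flat file into numbered parts of at most max_bytes each,
    cutting only after a record terminator line ("//").

    Returns the list of part file names that were written.
    """
    parts = []
    out = None
    written = 0

    def emit(record_lines):
        # Write one complete record, starting a new part if it won't fit.
        nonlocal out, written
        size = sum(len(line) for line in record_lines)
        if size > max_bytes:
            raise ValueError("a single record is larger than max_bytes")
        if out is None or written + size > max_bytes:
            if out is not None:
                out.close()
            name = f"{out_prefix}.{len(parts) + 1:03d}.dat"
            out = open(name, "wb")
            parts.append(name)
            written = 0
        out.writelines(record_lines)
        written += size

    record = []
    try:
        # Binary mode keeps the bytes unchanged and the size count exact.
        with open(path, "rb") as src:
            for line in src:
                record.append(line)
                if line.rstrip() == b"//":
                    emit(record)
                    record = []
            if record:
                # Trailing lines with no terminator are kept, not dropped.
                emit(record)
    finally:
        if out is not None:
            out.close()
    return parts
```

Each resulting part would then need to be indexed along with the
others, which is a large part of why this route is a pain.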

Hope this helps

Simon.



