[EMBOSS] Memory problem with extractseq
Peter Rice
pmr at ebi.ac.uk
Thu Mar 18 12:39:28 UTC 2010
On 18/03/10 09:11, michael watson (IAH-C) wrote:
> Hi
>
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
>
> I find it strange that extractseq reports a memory problem:
>
> -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
> Extract regions from a sequence
> Uncaught exception: Allocation failed, insufficient memory available, raised at ajstr.c:2406
>
> Whereas if I write a Bioperl script using SeqIO and the trunc() function, it works perfectly.
>
> I'd have thought EMBOSS would be more streamlined and memory efficient than Bioperl?
The problem appears to be in the buffering of input that is used to detect the format.
While we work on improving the performance, you can simply specify the format:
-sformat fasta
to turn off the file input buffering.
Reading an unknown format requires a lot of input to be buffered, in
case a GCG ".." checksum line appears.
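For the command in the report above, that would look something like the following. This is only a sketch: it reuses the install path, file names and region from the report, and the check for the binary is an illustrative addition, so adjust it for your own installation.

```shell
# Same extraction as in the original report, with the input format stated
# explicitly via -sformat so that the format does not have to be detected.
# The path below is the one from the report; change it if EMBOSS lives elsewhere.
EXTRACTSEQ=/usr/local/EMBOSS-6.1.0/bin/extractseq

if [ -x "$EXTRACTSEQ" ]; then
    "$EXTRACTSEQ" -sequence chr1.fasta -sformat fasta \
        -outseq chr1_.1.fasta -regions '34415690-34415711'
else
    # Illustrative guard only: report a missing binary instead of failing obscurely.
    echo "extractseq not found at $EXTRACTSEQ" >&2
fi
```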
Hope that helps
Peter