[Bioperl-l] Processing large fasta sequences through SeqIO

Josep Francesc Abril Ferrando jabril@imim.es
Fri, 31 Aug 2001 15:11:07 +0200


Hi Jason,

> > Error in tempdir() using /tmp/XXXXXXXXXX: Could not create directory
> > /tmp/Z0gD8R0rlB: Too many links at
> > /usr/lib/perl5/site_perl/5.005//Bio/Root/IO.pm line 457
>
> Is your tmp dir really full of files/directories or have not enough space
> for the collection of all the sequence data?  This seems like a system
> problem.

Currently, "/tmp" is only ~150 MB and I have more than 1 GB of free hard disk space (on a PC box with
386 MB of RAM, Red Hat 6.2 with kernel version 2.2.14, and perl 5.6.1). It could be a
permissions issue.
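
"Too many links" is the system message for EMLINK, which mkdir returns when the parent directory
cannot accept another subdirectory, so the count of directories under "/tmp" may matter more than
free space. A minimal sketch for checking that count follows. It assumes "/tmp" sits on an ext2
filesystem, where the per-inode link limit is 32000; the function name count_subdirs and the
default path are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the immediate subdirectories of a directory (symlinks are not counted).
sub count_subdirs {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my $count = 0;
    while (defined(my $entry = readdir($dh))) {
        next if $entry eq '.' or $entry eq '..';
        my $path = "$dir/$entry";
        $count++ if -d $path && !-l $path;
    }
    closedir($dh);
    return $count;
}

# Assumption: ext2 caps the link count of one inode at 32000, and every
# subdirectory adds one link to its parent.
my $assumed_limit = 32000;
my $tmp = @ARGV ? $ARGV[0] : '/tmp';

if (-d $tmp) {
    printf "%s holds %d subdirectories (assumed limit near %d)\n",
        $tmp, count_subdirs($tmp), $assumed_limit;
} else {
    warn "$tmp is not a directory\n";
}
```

If the reported count is close to the assumed limit, leftover temporary directories from earlier
runs are a likely cause, and removing them should let tempdir() succeed again.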

> Do you have File::Temp installed?  There is a known bug in 0.7 release
> that if you do not have File::Temp installed the application will not
> cleanup its tempdirs/tempfiles cleanly.  Installing File::Temp will take
> care of that.

It is installed, and it is version 0.12. Do I have to include the corresponding "use File::Temp;" in
the script? Maybe I have to ask our sysadmin to update both File::Temp and BioPerl.
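
One way to confirm which File::Temp the interpreter really picks up is to print its version and
load path. This is worth doing here because the error message points at site_perl/5.005 while the
reported perl is 5.6.1. The sketch below also creates a scratch directory with CLEANUP => 1, which
asks File::Temp to remove it at program exit. The template name "largeseqXXXXXX" is an arbitrary
example, and nothing here shows how Bio::Root::IO itself calls File::Temp.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Report the File::Temp version and file that this perl interpreter loaded.
print "File::Temp version: $File::Temp::VERSION\n";
print "Loaded from: $INC{'File/Temp.pm'}\n";

# Create a scratch directory under the system temporary directory.
# CLEANUP => 1 requests removal of the directory when the program exits.
my $scratch = tempdir('largeseqXXXXXX', TMPDIR => 1, CLEANUP => 1);
print "Scratch directory: $scratch\n";
```

Running the same lines with the perl binary named in the script's "#!" line shows whether that
interpreter sees the same installation.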

> > If I look at the saved file, the sequence is OK (do not have more or
> > less nucleotides than expected and they are in the correct ordering)
> > but the file contains a lot of empty lines (or just having '>') after
> > the finished sequence. Any idea of what should be wrong in the
> > following script:
>
> Nothing obvious is jumping out right now by looking at your code -
> How large are your files?

At the moment I am working with sequences of around 50 Mbp, but I would like to be able to scale up
to 250 Mbp.

> > Is that the right way to use "Bio::SeqIO" for processing large fasta
> > files. Do I have to include "Bio::Seq::LargeSeq" and, if yes, how can
> > I do that ?
>
> you could add the line
> use Bio::Seq::LargeSeq;
> just below --> use Bio::SeqIO <--
> if you wanted, but it is included by the largefasta modules so it is
> optional.

Well, I've run some tests, first adding "use Bio::Seq::LargeSeq" and then also "use File::Temp",
and I got the same results (the same error/warning, with only the name of the temporary directory
that cannot be created changing, and the same trailing extra lines).
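
To put a number on the trailing extra lines, a small check like the following can be run on the
saved file. It is only a diagnostic sketch in plain Perl, independent of Bio::SeqIO; the function
name count_trailing_junk is hypothetical, and "junk" here means a line that is empty or contains
nothing but '>'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return how many lines at the end of a file are empty or hold only '>'.
sub count_trailing_junk {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Cannot open $path: $!";
    my $junk = 0;
    while (my $line = <$in>) {
        chomp $line;
        if ($line =~ /^>?\s*$/) {
            $junk++;      # another candidate junk line
        } else {
            $junk = 0;    # real content seen, so earlier candidates were not trailing
        }
    }
    close($in);
    return $junk;
}

# Usage: perl this_script.pl saved_output.fa
if (@ARGV) {
    printf "%s: %d trailing junk line(s)\n", $ARGV[0], count_trailing_junk($ARGV[0]);
}
```

The file is read line by line, so memory use stays small even for sequences of several hundred Mbp.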

Thanks again... Josep F.

________________________________________

    Josep Francesc ABRIL FERRANDO

RESEARCH GROUP on BIOMEDICAL INFORMATICS
        GENOME INFORMATICS LAB
              IMIM - UPF
          C/ Dr. Aiguader 80
       08003 - Barcelona  (SPAIN)

    Ph:  +34 93 2211009 ext 2016
    Fax: +34 93 2213237

    http://www1.imim.es/~jabril/