[Bioperl-l] Processing large fasta sequences through SeqIO
Josep Francesc Abril Ferrando
Fri, 31 Aug 2001 15:11:07 +0200
> > Error in tempdir() using /tmp/XXXXXXXXXX: Could not create directory
> > /tmp/Z0gD8R0rlB: Too many links at
> > /usr/lib/perl5/site_perl/5.005//Bio/Root/IO.pm line 457
> Is your tmp dir really full of files/directories or have not enough space
> for the collection of all the sequence data? This seems like a system
Currently, "/tmp" is only ~150 MB and I have more than 1 GB of free hard disk space (on a PC with
386 MB of RAM, Red Hat 6.2 with kernel version 2.2.14, and perl 5.6.1). Maybe it could be a
> Do you have File::Temp installed? There is a known bug in 0.7 release
> that if you do not have File::Temp installed the application will not
> cleanup its tempdirs/tempfiles cleanly. Installing File::Temp will take
> care of that.
It is installed, and it is version 0.12. Do I have to include the corresponding "use File::Temp;" in
the script?
Maybe I have to ask our sysadmin to update both File::Temp and BioPerl.
> > If I look at the saved file, the sequence is OK (do not have more or
> > less nucleotides than expected and they are in the correct ordering)
> > but the file contains a lot of empty lines (or just having '>') after
> > the finished sequence. Any idea of what should be wrong in the
> > following script:
> Nothing obvious is jumping out right now by looking at your code -
> How large are your files?
At the moment I am working with sequences around 50 Mbp long, but I would like to be able to scale up
> > Is that the right way to use "Bio::SeqIO" for processing large fasta
> > files. Do I have to include "Bio::Seq::LargeSeq" and, if yes, how can
> > I do that ?
> you could add the line
> use Bio::Seq::LargeSeq;
> just below --> use Bio::SeqIO <--
> if you wanted, but it is included by the largefasta modules so it is
Well, I've run some tests, first adding "use Bio::Seq::LargeSeq" and then also "use
File::Temp", and I got the same results: the same error/warning (only the name of the temporary
directory that cannot be created changes) and the same trailing extra lines.
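One thing that could be checked for the "Too many links" message: mkdir reports that error (EMLINK) when the parent directory has reached the filesystem's limit on links, which has nothing to do with free disk space. On ext2 that limit is on the order of 32000 subdirectories. Below is a minimal sketch for counting the subdirectories in "/tmp", assuming "/tmp" is on ext2 (not confirmed here). The name count_subdirs is a hypothetical helper, and only core Perl is used:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count the immediate subdirectories of $dir.
# Symbolic links to directories are not counted, because they do not
# add a hard link to the parent directory.
sub count_subdirs {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my $count = 0;
    while (defined(my $entry = readdir $dh)) {
        next if $entry eq '.' or $entry eq '..';
        my $path = "$dir/$entry";
        $count++ if -d $path && !-l $path;
    }
    closedir $dh;
    return $count;
}

# When run as a script, report on the directory given as the first
# argument, or on /tmp by default.
unless (caller) {
    my $dir = @ARGV ? $ARGV[0] : '/tmp';
    printf "%s contains %d subdirectories\n", $dir, count_subdirs($dir);
}
```

If the count is close to the limit, leftover temporary directories from earlier runs would be the likely cause, and removing them should let tempdir() succeed again.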
Thanks again... Josep F.
Josep Francesc ABRIL FERRANDO
RESEARCH GROUP on BIOMEDICAL INFORMATICS
GENOME INFORMATICS LAB
IMIM - UPF
C/ Dr. Aiguader 80
08003 - Barcelona (SPAIN)
Ph: +34 93 2211009 ext 2016
Fax: +34 93 2213237