[Bioperl-l] Trouble with Bio::DB::Fasta and large files

Allen Day allenday@ucla.edu
Wed, 6 Nov 2002 02:45:31 -0800 (PST)


Tyler,

Are you perchance using tcsh?  It could simply be a problem with your 
shell.  This came up on the bioclusters mailing list a while ago:

http://bioinformatics.org/pipermail/bioclusters/2002-May/000220.html

I ran into the same problem last week: gzip appeared not to work when I 
tried to load a big (human) file into Bio::DB::GFF.  After I recompiled 
the shell it was fine.
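
Before recompiling anything, it may be worth checking whether your perl was 
built with large file support and which of your files actually cross the 
2 GB boundary.  Here is a minimal sketch, assuming a standard Perl install 
with the core Config module.  The helper names perl_has_largefiles and 
exceeds_2gb and the LIMIT_2GB constant are made up for illustration; they 
are not part of Bio::DB::Fasta.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Config;    # core module exposing the build-time configuration of this perl

# Largest offset a signed 32-bit file offset can hold (2**31 - 1 bytes).
use constant LIMIT_2GB => 2**31 - 1;

# Hypothetical helper: true if this perl was compiled with large file support.
sub perl_has_largefiles {
    return ( defined $Config{uselargefiles}
          && $Config{uselargefiles} eq 'define' ) ? 1 : 0;
}

# Hypothetical helper: true if a size in bytes is past the 2 GB boundary.
sub exceeds_2gb {
    my ($bytes) = @_;
    return $bytes > LIMIT_2GB ? 1 : 0;
}

printf "perl built with large file support: %s\n",
    perl_has_largefiles() ? 'yes' : 'no';

# Report the size of each file named on the command line.
for my $file (@ARGV) {
    my $size = ( stat $file )[7];    # undef if stat fails
    if ( !defined $size ) {
        print "$file: stat failed ($!)\n";
        next;
    }
    printf "%s: %s bytes%s\n", $file, $size,
        exceeds_2gb($size) ? ' (over 2 GB)' : '';
}
```

Run it with the fasta files as arguments.  If stat fails on the mouse and 
human files but not on zebrafish or fugu, that would be consistent with a 
perl (or libc) without large file support, though it does not prove that 
this is the cause of the exception.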

-Allen



On Tue, 5 Nov 2002, Lincoln Stein wrote:

> I believe you are hitting the 2 GB file limit on some Unix systems.  In 
> general, you will have to do three things:
> 
> 	1) make sure that your kernel supports large files (> 2 GB).
> 	Recompile the kernel if not.
> 
> 	2) make sure that you have a recent version of the C library,
> 	libc, that supports large files.  Install a new one if not (good luck!)
> 
> 	3) make sure that you have a version of Perl that was compiled
> 	with large file support.  Recompile with large file support turned
> 	on if not.
> 
> It's a big pain.  We just had to do this for one of our servers when we 
> experienced a similar problem.
> 
> Lincoln
> 
> On Tuesday 05 November 2002 07:32 pm, Tyler wrote:
> > I have been using Bio::DB::Fasta to extract sequences from fasta BLAST
> > databases for zebrafish and fugu with no problems. I've used both the
> > tied hash and object oriented implementations and they work great with
> > these databases. Thanks Lincoln.
> >
> > However, when trying to use Bio::DB::Fasta on local mouse or human
> > genome databases (ensembl raw data) they throw the "Invalid file or
> > dirname" exception. The mouse fasta file is 2.7GB and the human one is
> > 3.2GB, as opposed to 1.2GB for zebrafish and 340MB for fugu. All
> > scripts are the same except for the name of the database file. All
> > databases work fine with standalone blast (both the web interface and
> > the bioperl interface).
> >
> > Is there a workaround for dealing with these extremely large files?
> >
> > -Tyler
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-l