[BioRuby] problem while handling large fasta files

Ben Woodcroft donttrustben at gmail.com
Fri Sep 5 13:12:18 UTC 2008


Or you could use the RUBYLIB environment variable: set it to your bioruby
lib/ directory and then you don't have to modify your scripts at all. The
advantage is that your scripts don't depend on which bioruby version
(gem or github) you use, so switching versions later is much easier.
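
To illustrate the RUBYLIB idea, here is a minimal sketch. It does not use
bioruby itself: the temporary directory and the file fake_bio.rb are
hypothetical stand-ins for your real bioruby lib/ directory, so that the
example runs anywhere. The point is only that the child script never
mentions the library path.

```ruby
require 'rbconfig'
require 'tmpdir'

message = nil
Dir.mktmpdir("fake-bioruby") do |lib_dir|
  # Hypothetical stand-in for the real bioruby lib/ directory.
  File.write(File.join(lib_dir, "fake_bio.rb"),
             "FAKE_BIO_SOURCE = 'loaded via RUBYLIB'\n")

  # Run a child Ruby with RUBYLIB set in its environment.
  # The child script contains no path; it only requires the library by name.
  child_env = { "RUBYLIB" => lib_dir }
  child_cmd = [RbConfig.ruby, "-e", "require 'fake_bio'; puts FAKE_BIO_SOURCE"]
  message = IO.popen(child_env, child_cmd, &:read).strip
end

puts message
```

In everyday use you would normally set RUBYLIB once in your shell startup
file rather than from Ruby, and point it at the extracted bioruby lib/
directory.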

2008/9/5 Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp>

> On Thu, 4 Sep 2008 15:32:27 +0200 (CEST)
> "K. Patil" <kpatil at science.uva.nl> wrote:
>
> > Oops, sorry for incomplete information. Here it is;
> >
> > Ruby: 1.8
> > Bioruby: 1.0.0
> > OS/CPU: 2.6.24.2.1.amd64-smp #1 SMP Mon Feb 11 12:43:21 UTC 2008 x86_64
> > GNU/Linux
>
> BioRuby 1.0.0 is too old!
>
> The only thing I can say is the problem may not occur
> in the latest version of BioRuby, at least 1.2.1.
>
> > Also I cannot upgrade Ruby/Bioruby easily as I don't have appropriate
> > permissions (all packages are installed by the administrator on request).
>
> BioRuby (and also Ruby) can be installed in your home directory,
> without root (administrator) permission.
>
> The simplest way is:
>
>  % cd somewhere
>  % wget http://bioruby.open-bio.org/archive/bioruby-1.2.1.tar.gz
>  % tar zxvf bioruby-1.2.1.tar.gz
>
> And then, when running your script,
>
>  % ruby -I /full/path/to/somewhere/bioruby-1.2.1/lib example.rb
>  ("/full/path/to/somewhere" is the directory where you extracted
>  the bioruby archive.)
>
> If you want to use irb,
>
>  % irb -I /full/path/to/somewhere/bioruby-1.2.1/lib -r bio
>
> Alternatively, put
>
>  $LOAD_PATH.unshift("/full/path/to/somewhere/bioruby-1.2.1/lib")
>
> before the require 'bio' in your script.
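
A slightly more defensive variant of the $LOAD_PATH approach, as a sketch
only: the helper name prefer_library_dir is made up for this example, and
the path is the placeholder from the message above, not a real location.
It expands the path and avoids adding the same directory twice.

```ruby
# Prepend a directory to Ruby's load path unless it is already listed.
# Returns the expanded path that was checked.
# (prefer_library_dir is a hypothetical helper, not part of BioRuby.)
def prefer_library_dir(dir)
  full = File.expand_path(dir)
  $LOAD_PATH.unshift(full) unless $LOAD_PATH.include?(full)
  full
end

# Placeholder path from the message above; replace it with the real location.
bioruby_lib = prefer_library_dir("/full/path/to/somewhere/bioruby-1.2.1/lib")
puts $LOAD_PATH.first

# In a real script, the next line would be:
#   require 'bio'
```

Because the directory is placed at the front of the load path, it is
searched before any system-wide installation such as the one under
/usr/lib/ruby/1.8.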
>
>
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
>
>
> >
> > thanks and regards,
> > kaustubh
> >
> >
> > > Hi,
> > >
> > > Please show which BioRuby version, Ruby version, OS,
> > > architecture (type of CPU) you are using.
> > >
> > > Is the Ruby and/or BioRuby version older?
> > >
> > > Naohisa Goto
> > > ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
> > >
> > > On Thu, 4 Sep 2008 14:02:19 +0200 (CEST)
> > > "K. Patil" <kpatil at science.uva.nl> wrote:
> > >
> > >> Hi,
> > >>
> > > >> I am trying to do some simple processing on fasta files. It works fine
> > > >> for small files (up to several MB). But as soon as I move to very large
> > > >> files (e.g. 2.2 GB) the program crashes. Any help/suggestions highly
> > > >> appreciated.
> > >>
> > >> Best regards,
> > >> Kaustubh Patil
> > >>
> > >> I am pasting a very simple example below (the file is 2.2GB);
> > >>
> > >> irb(main):021:0> fasta = Bio::FastaFormat.open("9606.2.fna")
> > >> => #<Bio::FlatFile:0x2b2484e9c4a0
> > >> @splitter=#<Bio::FlatFile::Splitter::Default:0x2b2484e9a420
> > >> @stream=#<Bio::FlatFile::BufferedInputStream:0x2b2484e9c158
> > >> @io=#<Bio::FlatFile::BufferedInputStream:0x2b2484e9c3b0
> > >> @io=#<File:9606.2.fna>, @buffer="", @path="9606.2.fna">,
> > >>
> @buffer=">9606.2.fna\ntaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaac\n",
> > >> @path="9606.2.fna">, @header=nil, @delimiter="\n>",
> > >> @delimiter_overrun=1>,
> > >> @firsttime_flag=true,
> > >> @stream=#<Bio::FlatFile::BufferedInputStream:0x2b2484e9c158
> > >> @io=#<Bio::FlatFile::BufferedInputStream:0x2b2484e9c3b0
> > >> @io=#<File:9606.2.fna>, @buffer="", @path="9606.2.fna">,
> > >>
> @buffer=">9606.2.fna\ntaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaac\n",
> > >> @path="9606.2.fna">, @skip_leader_mode=:firsttime, @raw=false,
> > >> @dbclass=Bio::FastaFormat>
> > >> irb(main):022:0> fasta.each do |seq|
> > >> irb(main):023:1* print seq.data
> > >> irb(main):024:1> end
> > >> NoMethodError: private method `sub' called for nil:NilClass
> > >>         from /usr/lib/ruby/1.8/bio/db/fasta.rb:156:in `initialize'
> > >>         from /usr/lib/ruby/1.8/bio/io/flatfile.rb:579:in `new'
> > >>         from /usr/lib/ruby/1.8/bio/io/flatfile.rb:579:in `next_entry'
> > >>         from /usr/lib/ruby/1.8/bio/io/flatfile.rb:609:in `each'
> > >>         from (irb):22
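
If upgrading is not possible right away, simple processing can also be
done without the library. The following is a rough sketch in plain Ruby,
not BioRuby's parser and not a fix for the error above; the method name
each_fasta_record and the sample data are invented for illustration. It
reads one line at a time and keeps only the current record in memory.

```ruby
require 'stringio'

# Yield [definition, sequence] for each FASTA record in an IO object,
# reading line by line. (Hypothetical helper, not part of BioRuby.)
def each_fasta_record(io)
  definition = nil
  sequence = "".dup
  io.each_line do |line|
    line = line.chomp
    if line.start_with?(">")
      # A new header line closes the previous record, if there was one.
      yield definition, sequence unless definition.nil?
      definition = line[1..-1]
      sequence = "".dup
    elsif definition
      sequence << line.strip
    end
  end
  # Emit the last record once the input is exhausted.
  yield definition, sequence unless definition.nil?
end

# Invented sample data standing in for a real file.
sample = StringIO.new(">seq1 example\ntaaccc\ntaaccc\n>seq2\nacgt\n")
records = []
each_fasta_record(sample) { |definition, seq| records << [definition, seq] }
records.each { |definition, seq| puts "#{definition}: #{seq.length} bases" }
```

For a real file, pass an open File instead of the StringIO, for example
inside File.open("9606.2.fna") { |f| ... }. Note that a single very
large record still has to fit in memory with this approach.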
> > >>
> > >>
> > >> _______________________________________________
> > >> BioRuby mailing list
> > >> BioRuby at lists.open-bio.org
> > >> http://lists.open-bio.org/mailman/listinfo/bioruby
> > >
> > >
> > >
> >
> >
>



-- 
FYI: My email addresses at unimelb, uq and gmail all redirect to the same
place.


