[BioRuby] problem while handling large fasta files

K. Patil kpatil at science.uva.nl
Thu Sep 4 12:02:19 UTC 2008


Hi,

I am trying to do some simple processing on fasta files. It works fine for
small files (up to several MB), but as soon as I move to very large files
(e.g. 2.2 GB) the program crashes. Any help or suggestions would be highly
appreciated.

Best regards,
Kaustubh Patil

I am pasting a very simple example below (the file is 2.2 GB):

irb(main):021:0> fasta = Bio::FastaFormat.open("9606.2.fna")
=> #<Bio::FlatFile:0x2b2484e9c4a0
@splitter=#<Bio::FlatFile::Splitter::Default:0x2b2484e9a420
@stream=#<Bio::FlatFile::BufferedInputStream:0x2b2484e9c158
@io=#<Bio::FlatFile::BufferedInputStream:0x2b2484e9c3b0
@io=#<File:9606.2.fna>, @buffer="", @path="9606.2.fna">,
@buffer=">9606.2.fna\ntaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaac\n",
@path="9606.2.fna">, @header=nil, @delimiter="\n>", @delimiter_overrun=1>,
@firsttime_flag=true,
@stream=#<Bio::FlatFile::BufferedInputStream:0x2b2484e9c158
@io=#<Bio::FlatFile::BufferedInputStream:0x2b2484e9c3b0
@io=#<File:9606.2.fna>, @buffer="", @path="9606.2.fna">,
@buffer=">9606.2.fna\ntaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaaccctaac\n",
@path="9606.2.fna">, @skip_leader_mode=:firsttime, @raw=false,
@dbclass=Bio::FastaFormat>
irb(main):022:0> fasta.each do |seq|
irb(main):023:1* print seq.data
irb(main):024:1> end
NoMethodError: private method `sub' called for nil:NilClass
        from /usr/lib/ruby/1.8/bio/db/fasta.rb:156:in `initialize'
        from /usr/lib/ruby/1.8/bio/io/flatfile.rb:579:in `new'
        from /usr/lib/ruby/1.8/bio/io/flatfile.rb:579:in `next_entry'
        from /usr/lib/ruby/1.8/bio/io/flatfile.rb:609:in `each'
        from (irb):22
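
For comparison, here is a minimal plain-Ruby sketch that streams FASTA input line by line without going through Bio::FlatFile. It is not BioRuby's implementation and does not explain the error above; it may only help check whether the same file can be read record by record. The helper name each_fasta_record and the sample data are hypothetical, introduced purely for illustration.

```ruby
require "stringio"

# Hypothetical helper (not part of BioRuby): reads FASTA text from any IO
# object line by line and yields [definition, sequence] for each record.
def each_fasta_record(io)
  definition = nil
  chunks = []
  io.each_line do |line|
    line = line.chomp
    if line.start_with?(">")
      # A new header closes the previous record, if there was one.
      yield definition, chunks.join unless definition.nil?
      definition = line[1..-1]
      chunks = []
    elsif !definition.nil?
      chunks << line
    end
  end
  # Emit the last record once the input is exhausted.
  yield definition, chunks.join unless definition.nil?
end

# Small in-memory sample standing in for a real file (made-up data).
sample = StringIO.new(">seq1 test\ntaaccc\ntaaccc\n>seq2\nacgt\n")
records = []
each_fasta_record(sample) { |defn, seq| records << [defn, seq.length] }
```

For a real file, the IO object would come from File.open instead of StringIO. Note that this sketch still holds one complete record in memory at a time, so a file containing a single very long sequence would still need memory proportional to that sequence.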




