[BioRuby] Benchmarking FASTA file parsing

Tomoaki NISHIYAMA tomoakin at kenroku.kanazawa-u.ac.jp
Sun Aug 15 06:19:03 UTC 2010


Hi,

> 1. Adjustment of file position.
> The separator used to read a fasta entry is "\n>", but the ">"
> should be belonging to the next entry. To adjust this, the last
> ">" is stored to @entry_overrun. The Bio::FlatFile wrapper will
> use the content of @entry_overrun in the next time of reading.

I thought so at first, but I could not find the code that actually
uses it. Could you point out where it is used?

I could find only places that define it.
Maybe the Bio::FlatFile buffering was reworked to use ungets
instead of entry_overrun?
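
To make the question concrete, here is a minimal sketch of how I would
expect a consumer of the overrun to look. It is not the actual
Bio::FlatFile code. The class name OverrunReader, the method next_entry,
and the instance variable @overrun (standing in for @entry_overrun) are
all hypothetical names chosen for illustration.

```ruby
require 'stringio'

# Hypothetical reader, NOT BioRuby's Bio::FlatFile implementation.
# It reads entries with the "\n>" separator and carries the trailing
# ">" over to the next entry, as described in the quoted text.
class OverrunReader
  DELIMITER = "\n>"

  def initialize(io)
    @io = io
    @overrun = ''  # plays the role of @entry_overrun
  end

  # Returns the next entry as a String, or nil at end of input.
  def next_entry
    chunk = @io.gets(DELIMITER)
    return nil if chunk.nil? && @overrun.empty?

    # Prepend whatever the previous read consumed past its own entry.
    entry = @overrun + chunk.to_s
    @overrun = ''

    if entry.end_with?(DELIMITER)
      # The final ">" belongs to the next entry: cut it off and keep it.
      entry = entry[0...-1]
      @overrun = '>'
    end
    entry
  end
end

reader = OverrunReader.new(StringIO.new(">a\nACGT\n>b\nTTTT\n"))
while (entry = reader.next_entry)
  puts entry.inspect
end
```

The step that prepends @overrun is the part I could not find in the
library. If the buffering layer pushes the ">" back into its own buffer
instead, then that step would be unnecessary and the attribute would
only ever be written.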

#at bioruby/lib/bio/
$ grep entry_overrun * */* */*/* */*/*/*
db/fasta.rb:#    attr_reader :entry_overrun
db/fasta.rb:#      @entry_overrun = $&
db/fastq.rb:  # entry_overrun
db/fastq.rb:  attr_reader :entry_overrun
db/fastq.rb:    @entry_overrun = sc.rest
db/nbrf.rb:      @entry_overrun = $&
db/nbrf.rb:    attr_reader :entry_overrun
db/newick.rb:      @entry_overrun = $1
db/newick.rb:    attr_reader :entry_overrun
appl/blast/format0.rb:          @entry_overrun = $1
appl/blast/format0.rb:        attr_reader :entry_overrun
appl/blast/rpsblast.rb:        @entry_overrun = $1
appl/fasta/format10.rb:      @entry_overrun = overruns.join('')
appl/fasta/format10.rb:  attr_reader :entry_overrun
appl/spidey/report.rb:        @entry_overrun = $1
appl/spidey/report.rb:      attr_reader :entry_overrun

-- 
Tomoaki NISHIYAMA

Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi,
Kanazawa, 920-0934, Japan



