[BioRuby] Read/write of simple fasta file increases the identifier.

Tomoaki NISHIYAMA tomoakin at kenroku.kanazawa-u.ac.jp
Tue Apr 21 08:29:26 UTC 2009


With bioruby-1.3.0, reading a FASTA file, converting each entry to
Bio::Sequence with to_seq, and then writing it with output(:fasta)
makes the definition line longer each time the file is processed...

Is there a better interface that keeps the definition line as in the
original? Perhaps this happens because some formats have clearly
separate entry_id and definition fields, while the distinction is
ambiguous in FASTA format. Still, it would be better to be able to
recover the original definition line easily.
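
The suspected cause can be sketched in plain Ruby without BioRuby. This
is only a toy model: parse_header and format_header are hypothetical
helpers, not BioRuby code. It assumes that the reader takes the first
word of the header as entry_id and the whole header as definition, and
that the writer emits "entry_id definition" by default. Under those
assumptions a header of ">abc" grows by one word per round trip.

```ruby
# Toy model only: these helpers are hypothetical and are not BioRuby code.

# Split a FASTA header line into [entry_id, definition].
# Assumption: entry_id is the first word, definition is everything after '>'.
def parse_header(line)
  definition = line.chomp.sub(/\A>/, '')
  entry_id = definition.split.first
  [entry_id, definition]
end

# Build a header line.
# Assumption: the default header is "entry_id definition";
# an explicit header, when given, replaces that default.
def format_header(entry_id, definition, header = nil)
  header ||= "#{entry_id} #{definition}"
  ">#{header}"
end

# Two read/write round trips starting from a one-word header.
line = '>abc'
2.times do
  entry_id, definition = parse_header(line)
  line = format_header(entry_id, definition)
  puts line
end
# prints ">abc abc", then ">abc abc abc"

# Passing the parsed definition as the explicit header keeps the line unchanged.
entry_id, definition = parse_header('>abc')
puts format_header(entry_id, definition, definition)   # prints ">abc"
```

If the FASTA formatter in the installed BioRuby version accepts a
:header option, then seq.output(:fasta, :header => fe.definition) would
be the analogous workaround. That call is not tested here and should be
checked against the installed version.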

The following is a simple example.
It is simple enough that the entry would not need to be converted
to Bio::Sequence, but in practice I would like to manipulate the
sequence, for example by taking a subsequence, appending sequence,
or translating.

% cat > simple_fasta
% cat fastacat
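require 'bio'
# Read FASTA entries from the files named on the command line, or from stdin.
ff = Bio::FlatFile.open(Bio::FastaFormat, ARGF)
while fe = ff.next_entry
  # Convert the entry to Bio::Sequence, then write it back out as FASTA.
  seq = fe.to_seq
  puts seq.output(:fasta)
end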
% ruby fastacat simple_fasta
 >abc abc
% ruby fastacat simple_fasta | ruby fastacat
 >abc abc abc

Sincerely yours,


Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi,
Kanazawa, 920-0934, Japan
