[BioRuby] Read/write of simple fasta file increases the identifier.

Tomoaki NISHIYAMA tomoakin at kenroku.kanazawa-u.ac.jp
Tue Apr 21 08:29:26 UTC 2009


Hi,

With bioruby-1.3.0, reading a FASTA file, converting each entry to
Bio::Sequence with to_seq, and then writing it with output(:fasta)
makes the definition line longer each time the file is processed.

Is there a better interface that keeps the definition line as in the
original? Perhaps this happens because some formats have clearly
separate entry_id and definition fields, while the distinction is
ambiguous in FASTA format. However, it would be better to be able to
recover the original definition line easily.

The following is a simple example case.
Admittedly, it is so simple that the entry need not be converted
to Bio::Sequence, but in practice I would like to manipulate the
sequence, for example by taking a subsequence, appending sequence,
or translating it.

% cat > simple_fasta
>abc
acgttgac
% cat fastacat
#!/usr/local/bin/ruby
require 'bio'
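# Read FASTA entries from the files named on the command line, or stdin.
ff = Bio::FlatFile.open(Bio::FastaFormat, ARGF)
while fe = ff.next_entry
  seq = fe.to_seq          # convert the entry to Bio::Sequence
  puts seq.output(:fasta)  # write it back out in FASTA format
end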
% ruby fastacat simple_fasta
>abc abc
acgttgac
% ruby fastacat simple_fasta | ruby fastacat
>abc abc abc
acgttgac
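
The growth shown above can be reproduced without BioRuby. The following
is only a plain-Ruby sketch of what I suspect happens, assuming that the
FASTA writer builds the header as "entry_id definition" while the parser
keeps the whole line as the definition and takes its first word as the
entry_id. FastaEntry, SeqRecord and parse_fasta are hypothetical
stand-ins for illustration, not BioRuby classes or methods.

```ruby
# Plain-Ruby model of the suspected behaviour; it does not use BioRuby.
# FastaEntry and SeqRecord are hypothetical stand-ins, not BioRuby classes.

FastaEntry = Struct.new(:definition, :residues) do
  # Assumption: the entry ID is the first word of the definition line.
  def entry_id
    definition.split(/\s+/).first.to_s
  end

  # Assumption: the whole line is carried over as the definition.
  def to_record
    SeqRecord.new(entry_id, definition, residues)
  end
end

SeqRecord = Struct.new(:entry_id, :definition, :residues) do
  # Assumption: without an explicit header, the writer joins
  # entry_id and definition, which repeats the first word.
  def to_fasta(header = nil)
    header ||= "#{entry_id} #{definition}"
    ">#{header}\n#{residues}\n"
  end
end

# Parse a single-entry FASTA string into a FastaEntry.
def parse_fasta(text)
  lines = text.lines.map(&:chomp)
  FastaEntry.new(lines.first.sub(/\A>/, ""), lines.drop(1).join)
end

original = ">abc\nacgttgac\n"

# Each read/write round trip prepends the entry ID once more.
once  = parse_fasta(original).to_record.to_fasta
twice = parse_fasta(once).to_record.to_fasta

# Passing the original definition as the header keeps the line unchanged.
entry = parse_fasta(original)
kept  = entry.to_record.to_fasta(entry.definition)

puts once, twice, kept
```

If this assumption is right, supplying the original definition
explicitly as the header would keep the line unchanged; whether
output(:fasta) accepts such an option should be checked in the BioRuby
documentation.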

Sincerely yours,

-- 
Tomoaki NISHIYAMA

Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi,
Kanazawa, 920-0934, Japan




