[BioRuby] newick gsub(' ','_')

Naohisa Goto ngoto at gen-info.osaka-u.ac.jp
Mon Jul 20 07:50:09 UTC 2009


Hi,

> Hi,
> 
> According to the specification of NEWICK at
> 
> http://evolution.genetics.washington.edu/phylip/newick_doc.html
> 
> SPACE in quoted string and underscore are regarded to be
> identical.
> 
> In the note it reads
> "Underscore characters in unquoted labels are converted to blanks. "
> 
> OTU label in MacClade and PAUP behaves similarly.
> So, surrounding with single quote or replacing space with underscore
> are both conforming representation.

The Newick formatter in BioRuby converts spaces in a label to
underscores if the label can be written as an "unquoted label",
i.e. it consists only of letters, digits, and/or spaces.
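
As a rough illustration of that output rule, here is a minimal sketch
in plain Ruby. The method name newick_label is hypothetical and this is
not the actual BioRuby code; it only assumes that labels which cannot be
written unquoted are wrapped in single quotes, with an embedded single
quote doubled as the Newick document describes.

```ruby
# Hypothetical helper, NOT the BioRuby implementation: a sketch of the
# output rule described above.
def newick_label(label)
  if label.match?(/\A[A-Za-z0-9 ]+\z/)
    # Only letters, digits and spaces: write it unquoted,
    # with each space replaced by an underscore.
    label.tr(' ', '_')
  else
    # Anything else is quoted (assumption); a single quote inside
    # the label is written as two single quotes.
    "'" + label.gsub("'", "''") + "'"
  end
end

puts newick_label('Homo sapiens')   # Homo_sapiens
puts newick_label('Homo_sapiens')   # 'Homo_sapiens'
```

In this sketch a label that already contains an underscore is quoted,
so the underscore is not read back as a blank.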

I believe this behavior is correct, although I know some software
ignores the underscore rule.  When parsing Newick format, passing the
:parser => :naive option to Bio::Newick.new() prevents any label
character conversion, but there is no such option for output, because
I think generating a broken format is generally a bad thing.
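
The difference between the two parsing modes for an unquoted label can
be sketched as follows. This is plain Ruby with a hypothetical method
name (read_unquoted_label) and a hypothetical naive: keyword; it is not
the Bio::Newick API, only an illustration of the conversion that the
naive mode skips.

```ruby
# Hypothetical helper, NOT the Bio::Newick API: how an unquoted label
# is read with and without the underscore rule.
def read_unquoted_label(raw, naive: false)
  if naive
    # Naive mode: keep every character of the label as written.
    raw.dup
  else
    # Newick rule: underscores in unquoted labels become blanks.
    raw.tr('_', ' ')
  end
end

puts read_unquoted_label('Homo_sapiens')               # Homo sapiens
puts read_unquoted_label('Homo_sapiens', naive: true)  # Homo_sapiens
```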

Note that this behavior was changed in BioRuby 1.2.0.
Versions up to 1.1.x did not treat label characters specially.


> -- 
> Tomoaki NISHIYAMA
> 
> Advanced Science Research Center,
> Kanazawa University,
> 13-1 Takara-machi,
> Kanazawa, 920-0934, Japan
> 

Thank you.


Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org




