[Bioperl-l] Root::IO handle Mac and Win32 LF

Allen Day allenday at ucla.edu
Tue Dec 16 12:36:23 EST 2003


> Adding this to newick.pm after the record is slurped in takes
> care of the problem:
>  s/[\n\r]+//g
> 
> As any sort of newline needs to be stripped out as that is what is
> getting converted to spaces.  It really wasn't a windows problem but
> a problem with Allen's changes to the newick parsing code to replace WS
> with _ but not handling LF separately.
> 
> From the log:
> 
> revision 1.22
> date: 2003/08/15 17:07:27;  author: allenday;  state: Exp;  lines: +3 -2
> removed unnecessary escap char in space removing regex.  added regex to
> remove quotes and leading/trailing spaces
> from node labels as necessary.
> ----------------------------
> revision 1.21
> date: 2003/08/15 08:31:46;  author: allenday;  state: Exp;  lines: +5 -2
> fixing over-zealous whitespace removal from node labels.  we do this by
> not tampering with " quoted strings.  i'm not sure if newick allows " to
> be escaped within these labels... if so, there may be a bug here.
> ----------------------------
> 
> My original code stripped all whitespace and thus we never had this
> problem because there shouldn't be any in the node names in Newick
> http://evolution.genetics.washington.edu/phylip/newicktree.html
>  "A name can be any string of printable characters except --->blanks<---,
>  colons, semicolons, parentheses, and square brackets."
> 
> but apparently he wants to support this for his purposes.

Yes, I have had to parse Newick files that do contain spaces in node
names, and I'd like to preserve them on input.  When writing a tree, I
think it would be a good idea to throw an error and/or remove any illegal
characters (blanks, colons, semicolons, etc.), but when I made these
changes I didn't have to deal with writing trees.
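
As a rough sketch of what that write-side check could look like: the
helper name clean_newick_label and its 'strict'/'replace' modes are
hypothetical and not part of Bio::TreeIO::newick, and the character set
is just the list from the Newick description quoted above.

```perl
use strict;
use warnings;

# Characters disallowed in unquoted names per the description quoted
# above: blanks, colons, semicolons, parentheses, square brackets.
# Assumption: extend this set as needed (e.g. commas).
my $ILLEGAL = qr/[\s:;()\[\]]/;

# Hypothetical helper, for illustration only.
#   mode 'strict'  : die if the label contains an illegal character
#   mode 'replace' : (default) replace each illegal character with '_'
sub clean_newick_label {
    my ($label, $mode) = @_;
    $mode = 'replace' unless defined $mode;

    if ($mode eq 'strict') {
        die "illegal character in Newick label '$label'\n"
            if $label =~ $ILLEGAL;
        return $label;
    }

    (my $clean = $label) =~ s/$ILLEGAL/_/g;
    return $clean;
}

print clean_newick_label('Homo sapiens'), "\n";   # prints Homo_sapiens
```

Replacing with '_' seems a reasonable default for blanks, since Newick
readers are expected to turn underscores in unquoted labels back into
blanks, but it is lossy for the other characters, so the strict mode
may be the safer choice there.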

-allen


