[Bioperl-l] Root::IO handle Mac and Win32 LF
Allen Day
allenday at ucla.edu
Tue Dec 16 12:36:23 EST 2003
> Adding this to newick.pm after the record is slurped in takes
> care of the problem:
> s/[\n\r]+//g
>
> Any sort of newline needs to be stripped out, since newlines are what is
> getting converted to spaces. It really wasn't a Windows problem but
> a problem with Allen's changes to the newick parsing code, which replace
> WS with _ but do not handle LF separately.
>
> From the log:
>
> revision 1.22
> date: 2003/08/15 17:07:27; author: allenday; state: Exp; lines: +3 -2
> removed unnecessary escap char in space removing regex. added regex to
> remove quotes and leading/trailing spaces
> from node labels as necessary.
> ----------------------------
> revision 1.21
> date: 2003/08/15 08:31:46; author: allenday; state: Exp; lines: +5 -2
> fixing over-zealous whitespace removal from node labels. we do this by
> not tampering with " quoted strings. i'm not sure if newick allows " to
> be escaped within these labels... if so, there may be a bug here.
> ----------------------------
>
> My original code stripped all whitespace and thus we never had this
> problem because there shouldn't be any in the node names in Newick
> http://evolution.genetics.washington.edu/phylip/newicktree.html
> "A name can be any string of printable characters except --->blanks<---,
> colons, semicolons, parentheses, and square brackets."
>
> but apparently he wants to support this for his purposes.
Yes, I have had to parse newick files that do contain spaces in node
names. I'd like to preserve these in the input. I think it would be a
good idea when writing a tree to throw an error and/or remove any illegal
characters (blanks, colons, semicolons, etc), but at the time of
modification I didn't have to deal with writing trees.
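
A minimal sketch of both ideas, for illustration only. The two helper
names below are hypothetical and are not part of Bio::TreeIO::newick; the
first applies the newline-stripping regex quoted above without touching
spaces inside labels, and the second assumes that "illegal" means the
characters listed in the Newick description (blanks, colons, semicolons,
parentheses, square brackets).

```perl
use strict;
use warnings;

# Hypothetical helper: drop line endings from a slurped record while
# leaving spaces inside node labels untouched.
sub strip_line_endings {
    my ($record) = @_;
    $record =~ s/[\n\r]+//g;    # covers \n, \r\n and bare \r
    return $record;
}

# Hypothetical helper: check a node label before writing a tree.
# With a true $strict flag it dies on an illegal character; otherwise
# it removes the offending characters and returns the cleaned label.
sub clean_label_for_write {
    my ($label, $strict) = @_;
    my $illegal = qr/[\s:;()\[\]]/;
    if ($label =~ $illegal) {
        die "illegal character in Newick label '$label'\n" if $strict;
        $label =~ s/$illegal//g;
    }
    return $label;
}

# Spaces in the label survive reading; only the line endings go away.
print strip_line_endings("(Homo sapiens:0.1,\r\nPan:0.2);\n"), "\n";

# On writing, the non-strict mode removes the blank from the label.
print clean_label_for_write("Homo sapiens"), "\n";
```

Whether to die or silently clean on write is a design choice; dying is
safer when round-tripping labels matters, since removing characters can
make two distinct labels collide.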
-allen