[Biojava-dev] newLine is not consistent across platforms

Amr AL-Hossary amr_alhossary at hotmail.com
Fri Sep 16 13:05:38 UTC 2011


Well,
looking at org.biojava.bio.structure.TestSECalignment:
parsing with AFPChainXMLParser.fromXML and then serializing with
AFPChainXMLConverter.toXML doesn't reproduce the same XML as far as the
end-of-line delimiter is concerned. The first implementation validated XML
equality using assertEquals(String,String), which doesn't tolerate a
difference in even a single character.

I agree with Dr. Andreas that we should end lines using
printStream.println(), leaving the choice of line delimiter to the system.
The drawback of this approach is that we can't know on which OS the XML was
produced and on which OS it will be consumed, so we can't be sure whether the
delimiter is \n or \r\n.
So we need a utility function that asserts equality of XML (if there is not
one already present in the assertXXX() suite).
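
To illustrate the problem described above, here is a minimal sketch (not code
from the BioJava tree; the class name PrintlnSeparatorDemo and the method
render are made up for this example). It writes lines with
PrintStream.println(), so the output carries whatever line separator the
running platform uses:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical demo class, for illustration only.
final class PrintlnSeparatorDemo {
    private PrintlnSeparatorDemo() {}

    /** Writes each line with println(), so the platform line separator is used. */
    static String render(String... lines) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer);
        for (String line : lines) {
            out.println(line);
        }
        out.flush();
        // Default charset on both sides; fine for the ASCII text used here.
        return buffer.toString();
    }
}
```

A strict comparison such as render("a").equals("a\n") would then hold on
Unix-like systems but not on Windows, which is exactly why a plain
assertEquals(String,String) is fragile here.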

So, as a first solution, I wrote a utility method that compares Strings line
by line, ignoring the end-of-line delimiter. I used the standard Java class
java.util.Scanner because it tolerates every type of line delimiter.
Here is the relevant line of the Scanner source code:
private static final String LINE_SEPARATOR_PATTERN = 
"\r\n|[\n\r\u2028\u2029\u0085]";

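The idea of comparing line by line with java.util.Scanner can be sketched
roughly as follows. This is only an illustration of the approach, not the
code committed to SVN; the class name LineEndingAgnostic and the method
sameLines are hypothetical:

```java
import java.util.Scanner;

// Hypothetical helper, for illustration only.
final class LineEndingAgnostic {
    private LineEndingAgnostic() {}

    /**
     * Returns true if both strings consist of the same sequence of lines,
     * regardless of which line terminator separates them.
     */
    static boolean sameLines(String expected, String actual) {
        try (Scanner e = new Scanner(expected); Scanner a = new Scanner(actual)) {
            // Walk both inputs in lockstep, one line at a time.
            while (e.hasNextLine() && a.hasNextLine()) {
                if (!e.nextLine().equals(a.nextLine())) {
                    return false;
                }
            }
            // Equal only if both inputs ran out of lines together.
            return !e.hasNextLine() && !a.hasNextLine();
        }
    }
}
```

A test could then call assertTrue(LineEndingAgnostic.sameLines(expectedXml,
actualXml)) instead of assertEquals(expectedXml, actualXml). Note that this
only ignores line terminators; other whitespace differences inside a line
still count as a mismatch.
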

Please see SVN revision 9232 for the full picture. I welcome all
comments.


Amr
I hope this mail is delivered to the group this time :(

------------------

--------------------------------------------------
From: "Andreas Prlic" <andreas at sdsc.edu>
Sent: Thursday, September 15, 2011 8:57 PM
To: "Scooter Willis" <HWillis at scripps.edu>
Cc: "Amr AL-Hossary" <amr_alhossary at hotmail.com>;
<biojava-dev at lists.open-bio.org>
Subject: Re: [Biojava-dev] newLine is not consistent across platforms

> ok, that sounds like a bug with the genome browser who shall not be
> named. I still think the correct default behaviour for an API is to
> use the system property. The GFF3 export method could allow to work
> around this by getting a flag "use unix style newline". If people
> think this is a problem at more places, we can provide a central
> utility method which could allow to switch the newline across the
> API..
>
> A
>
> On Thu, Sep 15, 2011 at 10:59 AM, Scooter Willis <HWillis at scripps.edu>
> wrote:
>> Andreas
>>
>> You can't win that way either. As an example I think GFF3 file format
>> when used with an unmentioned open source genome browser will only work
>> if \n is the line terminator.
>>
>> Scooter
>>
>> On 9/15/11 1:20 PM, "Andreas Prlic" <andreas at sdsc.edu> wrote:
>>
>>>Hi Amr,
>>>
>>>yea, the newline should never be written as \n or \r\n but requested
>>>from System.getProperty("line.separator"); Did we have many instances
>>>of this? I thought we were pretty consistent in avoiding this.. I am
>>>not sure if we need a central place for this. Perhaps all we need is
>>>to remind all developers to avoid hard coding this and using the
>>>System property instead.
>>>
>>>Andreas
>>>
>>>
>>>2011/9/15 Amr AL-Hossary <amr_alhossary at hotmail.com>:
>>>> Here is another assertion exception:
>>>> The end line delimiter is different across platforms.
>>>> So, I created a new helper class for common String assertion
>>>> manipulation tasks.
>>>>
>>>> Please feel free to use it in all common String manipulation tasks.
>>>>
>>>> Well, my question is: where could it be put (a common place) to be
>>>> used by all test classes?
>>>>
>>>> Amr
>>>> ------------
>>>> This mail is sent for the 5th time
>>>>
>>>>
>>
>>
>
>
>
> -- 
> -----------------------------------------------------------------------
> Dr. Andreas Prlic
> Senior Scientist, RCSB PDB Protein Data Bank
> University of California, San Diego
> (+1) 858.246.0526
> -----------------------------------------------------------------------
> 