[Biojava-dev] newLine is not consistent across platforms

Andreas Prlic andreas at sdsc.edu
Fri Sep 16 23:31:47 UTC 2011


Hi,

Just to wrap this up:

Amr, I believe the use of Scanner is appropriate here, and as a result
your utility follows the XML guidelines, so this is a good fix.
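
For anyone reading this thread later, here is a minimal sketch of a
line-by-line comparison of that kind. It is not the code from revision
9232; the class and method names are hypothetical, and it relies only on
the documented hasNextLine()/nextLine() behaviour of java.util.Scanner.

```java
import java.util.Scanner;

/** Hypothetical helper for illustration; not the class committed to SVN. */
final class LineEndingAssert {

    private LineEndingAssert() {}

    /**
     * Returns true if both strings contain the same sequence of lines,
     * regardless of which line delimiter (\n, \r\n, \r, ...) each one uses.
     */
    static boolean equalIgnoringLineEndings(String expected, String actual) {
        try (Scanner e = new Scanner(expected); Scanner a = new Scanner(actual)) {
            while (e.hasNextLine() && a.hasNextLine()) {
                // nextLine() strips the delimiter, so only line content is compared.
                if (!e.nextLine().equals(a.nextLine())) {
                    return false;
                }
            }
            // Equal only if both inputs ran out of lines at the same time.
            return !e.hasNextLine() && !a.hasNextLine();
        }
    }
}
```

A test could then call assertTrue(LineEndingAssert.equalIgnoringLineEndings(xmlIn, xmlOut))
instead of assertEquals(xmlIn, xmlOut).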

Thanks,
Andreas



On Fri, Sep 16, 2011 at 7:27 AM, Andreas Prlic <andreas at sdsc.edu> wrote:
> I am just taking a look at the XML spec and what they recommend about
> this issue. I am sure we are not the first to encounter this...
>
> http://www.w3.org/TR/REC-xml/#sec-line-ends
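
As I read that section, an XML processor normalizes line breaks on input
by translating every \r\n pair, and every \r that is not followed by \n,
into a single \n. A rough sketch of that rule in Java follows; the class
and method names are hypothetical and only illustrate the idea.

```java
/** Hypothetical illustration of XML 1.0 end-of-line normalization. */
final class XmlLineEnds {

    private XmlLineEnds() {}

    /** Translates CRLF pairs and lone CR characters to a single LF. */
    static String normalize(String text) {
        // Replace CRLF first so that its CR is not converted on its own.
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }
}
```

Two documents that differ only in their line endings become identical
strings after this step.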
>
> I haven't looked at your commit yet. I'll comment on this after I have
> thought about this a bit more.
>
> Andreas
>
>
>
> On Fri, Sep 16, 2011 at 6:05 AM, Amr AL-Hossary
> <amr_alhossary at hotmail.com> wrote:
>> Well,
>> looking at org.biojava.bio.structure.TestSECalignment:
>> parsing with AFPChainXMLParser.fromXML and then writing with
>> AFPChainXMLConverter.toXML doesn't reproduce the same XML, because the
>> end-of-line delimiter differs. The first implementation validated XML
>> equality using assertEquals(String,String), which doesn't tolerate a
>> difference in even a single character.
>>
>> I agree with Dr. Andreas that we should end lines using
>> printStream.println(), leaving the choice of line delimiter to the
>> system.
>> The drawback of this approach is that we can't know on which OS the XML
>> was produced or where it will be consumed, so we can't be sure whether
>> the delimiter is \n or \r\n.
>> So, we need a utility function that asserts equality of XML (if there is
>> not one already present in the assertXXX() suite).
>>
>> So, as a first solution, I made a utility method that compares Strings
>> line by line, ignoring the end-of-line delimiter. I used the standard Java
>> class java.util.Scanner because it tolerates every type of line delimiter.
>> Here is a line of the Scanner source code:
>> private static final String LINE_SEPARATOR_PATTERN =
>> "\r\n|[\n\r\u2028\u2029\u0085]";
>>
>>
>> Please inspect SVN revision 9232 to see the full picture. I welcome all
>> comments.
>>
>>
>> Amr
>> I hope this mail is delivered to the group this time :(
>>
>> ------------------
>>
>> --------------------------------------------------
>> From: "Andreas Prlic" <andreas at sdsc.edu>
>> Sent: Thursday, September 15, 2011 8:57 PM
>> To: "Scooter Willis" <HWillis at scripps.edu>
>> Cc: "Amr AL-Hossary" <amr_alhossary at hotmail.com>;
>> <biojava-dev at lists.open-bio.org>
>> Subject: Re: [Biojava-dev] newLine is not consistent across platforms
>>
>>> ok, that sounds like a bug in the genome browser who shall not be
>>> named. I still think the correct default behaviour for an API is to
>>> use the system property. The GFF3 export method could work around
>>> this by accepting a flag "use unix style newline". If people
>>> think this is a problem in more places, we can provide a central
>>> utility method that switches the newline across the
>>> API.
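
A minimal sketch of what such a flag could look like. Everything here
(class name, method name, parameter) is hypothetical and is not the
existing GFF3 export API; it only illustrates defaulting to the system
property with an opt-in for \n.

```java
import java.util.List;

/** Hypothetical writer sketch; not part of the BioJava API. */
final class LineWriterSketch {

    private LineWriterSketch() {}

    /**
     * Joins the given lines, terminating each one with either "\n"
     * (when useUnixNewline is true) or the platform line separator.
     */
    static String write(List<String> lines, boolean useUnixNewline) {
        String newline = useUnixNewline
                ? "\n"
                : System.getProperty("line.separator");
        StringBuilder out = new StringBuilder();
        for (String line : lines) {
            out.append(line).append(newline);
        }
        return out.toString();
    }
}
```

Callers that feed a consumer which insists on \n would pass true; everyone
else keeps the platform default.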
>>>
>>> A
>>>
>>> On Thu, Sep 15, 2011 at 10:59 AM, Scooter Willis <HWillis at scripps.edu>
>>> wrote:
>>>>
>>>> Andreas
>>>>
>>>> You can't win that way either. As an example, I think the GFF3 file
>>>> format, when used with an unmentioned open source genome browser, will
>>>> only work if \n is the line terminator.
>>>>
>>>> Scooter
>>>>
>>>> On 9/15/11 1:20 PM, "Andreas Prlic" <andreas at sdsc.edu> wrote:
>>>>
>>>>> Hi Amr,
>>>>>
>>>>> yea, the newline should never be written as \n or \r\n but requested
>>>>> from System.getProperty("line.separator"). Did we have many instances
>>>>> of this? I thought we were pretty consistent in avoiding this. I am
>>>>> not sure if we need a central place for this. Perhaps all we need is
>>>>> to remind all developers to avoid hard-coding this and to use the
>>>>> system property instead.
>>>>>
>>>>> Andreas
>>>>>
>>>>>
>>>>> 2011/9/15 Amr AL-Hossary <amr_alhossary at hotmail.com>:
>>>>>>
>>>>>> Here is another assertion exception:
>>>>>> The end line delimiter is different across platforms.
>>>>>> So, I created a new helper class for common String assertion
>>>>>> manipulation tasks.
>>>>>>
>>>>>> Please feel free to use it in all common String manipulation tasks.
>>>>>>
>>>>>> Well, my question is: where could it be put (a common place) to be
>>>>>> used by all test classes?
>>>>>>
>>>>>> Amr
>>>>>> ------------
>>>>>> This mail is sent for the 5th time
>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> biojava-dev mailing list
>>>>> biojava-dev at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>>>
>>>>
>>>
>>>
>>>
>


