[Biojava-dev] newLine is not consistent across platforms

Amr AL-Hossary amr_alhossary at hotmail.com
Sat Sep 17 06:41:18 UTC 2011


Thanks, Dr. Andreas.
What I care about now is: where should the utility class be put so that
all other classes can use it? And what other commonly used methods
should be added to it?

Amr

--------------------------------------------------
From: "Andreas Prlic" <andreas at sdsc.edu>
Sent: Saturday, September 17, 2011 1:31 AM
To: "Amr AL-Hossary" <amr_alhossary at hotmail.com>
Cc: <biojava-dev at lists.open-bio.org>; "Scooter Willis" <HWillis at scripps.edu>
Subject: Re: [Biojava-dev] newLine is not consistent across platforms

> Hi,
>
> Just to wrap this up:
>
> Amr I believe the use of Scanner is appropriate here and as a result
> your utility follows the XML guidelines, so this is a good fix.
>
> Thanks,
> Andreas
>
>
>
> On Fri, Sep 16, 2011 at 7:27 AM, Andreas Prlic <andreas at sdsc.edu> wrote:
>> I am just taking a look at the XML spec and what they recommend about
>> this issue. I am sure we are not the first to encounter this...
>>
>> http://www.w3.org/TR/REC-xml/#sec-line-ends
>>
>> I haven't looked at your commit yet. I'll comment on this after I have
>> thought about this a bit more.
>>
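For reference, the linked section of the XML 1.0 spec asks a parser to turn every \r\n pair, and every \r not followed by \n, into a single \n before processing. A minimal sketch of that rule follows; the class and method names are hypothetical and are not part of BioJava.

```java
// Hypothetical helper illustrating XML 1.0 end-of-line normalization;
// the names are assumptions, not BioJava code.
final class XmlLineEnds {
    private XmlLineEnds() {}

    /** Replaces every \r\n pair and every lone \r with a single \n. */
    static String normalize(String text) {
        // Collapse CRLF first so the lone-CR pass cannot double a line break.
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }
}
```

Comparing two documents after passing both through such a method would make the comparison independent of the platform that wrote them.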
>> Andreas
>>
>>
>>
>> On Fri, Sep 16, 2011 at 6:05 AM, Amr AL-Hossary
>> <amr_alhossary at hotmail.com> wrote:
>>> Well,
>>> looking at org.biojava.bio.structure.TestSECalignment:
>>> using AFPChainXMLParser.fromXML and then AFPChainXMLConverter.toXML
>>> doesn't generate the same XML with respect to the end-of-line
>>> delimiter. The first implementation validated XML equality using
>>> assertEquals(String,String), which does not tolerate a difference
>>> in even a single character.
>>>
>>> I agree with Dr. Andreas that we should end the lines using
>>> printStream.println(), leaving the choice of line delimiter to the
>>> system.
>>> The drawback of this approach is that we can't know on which OS the
>>> XML was produced and on which it will be consumed, so we can't be
>>> sure whether the delimiter is \n or \r\n.
>>> So we need a utility function that asserts equality of XML (if there
>>> is not one already present in the assertXXX() suite).
>>>
>>> So, as a first solution, I made a utility method that compares
>>> Strings line by line, ignoring the end-of-line delimiter. I used the
>>> standard Java class java.util.Scanner because it tolerates every
>>> type of line delimiter.
>>> Here is a line of the Scanner source code:
>>> private static final String LINE_SEPARATOR_PATTERN =
>>> "\r\n|[\n\r\u2028\u2029\u0085]";
>>>
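The line-by-line comparison described above might look roughly like the sketch below. It is an illustration only, not the code committed in the SVN revision mentioned in this thread, and the class and method names are hypothetical.

```java
import java.util.Scanner;

// Hypothetical helper; the names are assumptions, not the committed BioJava code.
final class LineEndingInsensitive {
    private LineEndingInsensitive() {}

    /** Returns true if both strings hold the same lines, whatever terminator each uses. */
    static boolean sameLines(String expected, String actual) {
        try (Scanner a = new Scanner(expected); Scanner b = new Scanner(actual)) {
            // Scanner.nextLine() strips the terminator, so only line content is compared.
            while (a.hasNextLine() && b.hasNextLine()) {
                if (!a.nextLine().equals(b.nextLine())) {
                    return false;
                }
            }
            // Equal only if both inputs ran out of lines at the same time.
            return !a.hasNextLine() && !b.hasNextLine();
        }
    }
}
```

A test could then call something like assertTrue(LineEndingInsensitive.sameLines(expectedXml, actualXml)) instead of assertEquals(String,String). Note that a comparison written this way also ignores a single trailing line terminator at the end of either input.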
>>>
>>> Please inspect SVN revision 9232 to get the full picture. I welcome all
>>> comments.
>>>
>>>
>>> Amr
>>> I hope this mail is delivered to the group this time :(
>>>
>>> ------------------
>>>
>>> --------------------------------------------------
>>> From: "Andreas Prlic" <andreas at sdsc.edu>
>>> Sent: Thursday, September 15, 2011 8:57 PM
>>> To: "Scooter Willis" <HWillis at scripps.edu>
>>> Cc: "Amr AL-Hossary" <amr_alhossary at hotmail.com>;
>>> <biojava-dev at lists.open-bio.org>
>>> Subject: Re: [Biojava-dev] newLine is not consistent across platforms
>>>
>>>> ok, that sounds like a bug in the genome browser that shall not be
>>>> named. I still think the correct default behaviour for an API is to
>>>> use the system property. The GFF3 export method could work around
>>>> this by accepting a flag, "use unix style newline". If people
>>>> think this is a problem in more places, we can provide a central
>>>> utility method that allows switching the newline across the
>>>> API.
>>>>
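One way to picture the suggested flag is the sketch below: the platform separator stays the default, and the caller can request "\n" explicitly. The exporter class, its method, and the parameter name are all hypothetical; this is not the BioJava GFF3 API.

```java
import java.util.List;

// Hypothetical exporter; an illustration of the flag idea, not the BioJava GFF3 writer.
final class RecordExporter {
    private RecordExporter() {}

    /** Joins records, ending each with "\n" if requested, else with the platform separator. */
    static String export(List<String> records, boolean useUnixNewline) {
        String newline = useUnixNewline ? "\n" : System.getProperty("line.separator");
        StringBuilder out = new StringBuilder();
        for (String record : records) {
            out.append(record).append(newline);
        }
        return out.toString();
    }
}
```

Keeping the flag on the export call leaves the default behaviour untouched for every other caller, while a consumer that only accepts \n can still be served.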
>>>> A
>>>>
>>>> On Thu, Sep 15, 2011 at 10:59 AM, Scooter Willis <HWillis at scripps.edu>
>>>> wrote:
>>>>>
>>>>> Andreas
>>>>>
>>>>> You can't win that way either. As an example, I think the GFF3 file
>>>>> format, when used with an unmentioned open source genome browser,
>>>>> will only work if \n is the line terminator.
>>>>>
>>>>> Scooter
>>>>>
>>>>> On 9/15/11 1:20 PM, "Andreas Prlic" <andreas at sdsc.edu> wrote:
>>>>>
>>>>>> Hi Amr,
>>>>>>
>>>>>> yea, the newline should never be written as \n or \r\n but requested
>>>>>> from System.getProperty("line.separator"). Did we have many instances
>>>>>> of this? I thought we were pretty consistent in avoiding this. I am
>>>>>> not sure if we need a central place for this. Perhaps all we need is
>>>>>> to remind all developers to avoid hard-coding this and to use the
>>>>>> System property instead.
>>>>>>
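As a small illustration of the advice above, the sketch below builds output with the separator taken from the system property rather than a hard-coded "\n" or "\r\n". The class and method names are hypothetical.

```java
// Hypothetical helper showing the system-property approach; not BioJava code.
final class PlatformNewline {
    private PlatformNewline() {}

    /** Joins the given lines, terminating each with the platform line separator. */
    static String join(String... lines) {
        // Ask the JVM for the separator instead of hard-coding "\n" or "\r\n".
        String newline = System.getProperty("line.separator");
        StringBuilder out = new StringBuilder();
        for (String line : lines) {
            out.append(line).append(newline);
        }
        return out.toString();
    }
}
```

On Java 7 and later, System.lineSeparator() is a shorter way to obtain the same value.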
>>>>>> Andreas
>>>>>>
>>>>>>
>>>>>> 2011/9/15 Amr AL-Hossary <amr_alhossary at hotmail.com>:
>>>>>>>
>>>>>>> Here is another assertion exception:
>>>>>>> The end line delimiter is different across platforms.
>>>>>>> So, I created a new helper class for common String assertion
>>>>>>> manipulation tasks.
>>>>>>>
>>>>>>> Please feel free to use it in all common String manipulation tasks.
>>>>>>>
>>>>>>> Well, my question is: where could it be put (a common place) to be
>>>>>>> used by all test classes?
>>>>>>>
>>>>>>> Amr
>>>>>>> ------------
>>>>>>> This mail is sent for the 5th time
>>>>>>>
>>>>>>>
>>>>>> _______________________________________________
>>>>>> biojava-dev mailing list
>>>>>> biojava-dev at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>


