[Biojava-dev] newLine is not consistent across platforms

Andreas Prlic andreas at sdsc.edu
Sun Sep 18 23:43:18 UTC 2011


Hi Amr,

org.biojava3.core.util

is a good place for utility methods...

Andreas

On Fri, Sep 16, 2011 at 11:41 PM, Amr AL-Hossary
<amr_alhossary at hotmail.com> wrote:
> Thanks Dr. Andreas.
> What I care about now is:
> where should the utility class be put so that it can be used by all other
> classes, and what other methods (of common use) should be added?
>
> Amr
>
> --------------------------------------------------
> From: "Andreas Prlic" <andreas at sdsc.edu>
> Sent: Saturday, September 17, 2011 1:31 AM
> To: "Amr AL-Hossary" <amr_alhossary at hotmail.com>
> Cc: <biojava-dev at lists.open-bio.org>; "Scooter Willis" <HWillis at scripps.edu>
> Subject: Re: [Biojava-dev] newLine is not consistent across platforms
>
>> Hi,
>>
>> Just to wrap this up:
>>
>> Amr, I believe the use of Scanner is appropriate here, and as a result
>> your utility follows the XML guidelines, so this is a good fix.
>>
>> Thanks,
>> Andreas
>>
>>
>>
>> On Fri, Sep 16, 2011 at 7:27 AM, Andreas Prlic <andreas at sdsc.edu> wrote:
>>>
>>> I am just taking a look at the XML spec and what they recommend about
>>> this issue. I am sure we are not the first to encounter this...
>>>
>>> http://www.w3.org/TR/REC-xml/#sec-line-ends
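
For reference, that section of the XML 1.0 spec says a parser normalizes line breaks on input by translating both the two-character sequence CR LF and any CR not followed by LF into a single LF. A minimal sketch of that rule, assuming a hypothetical helper named XmlLineEnds (not part of BioJava):

```java
/** Hypothetical helper illustrating XML 1.0 end-of-line normalization. */
final class XmlLineEnds {

    private XmlLineEnds() {}

    /**
     * Translates CR LF pairs and lone CR characters into a single LF,
     * which is what an XML processor does before handing text to the application.
     */
    static String normalize(String text) {
        // Replace the two-character sequence first so that no CR LF pair
        // is turned into two line feeds by the second replacement.
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }
}
```

Two XML strings that differ only in their line endings become identical after normalize(), so a plain assertEquals on the normalized strings would pass on any platform.
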
>>>
>>> I haven't looked at your commit yet. I'll comment on this after I have
>>> thought about this a bit more.
>>>
>>> Andreas
>>>
>>>
>>>
>>> On Fri, Sep 16, 2011 at 6:05 AM, Amr AL-Hossary
>>> <amr_alhossary at hotmail.com> wrote:
>>>>
>>>> Well,
>>>> looking at org.biojava.bio.structure.TestSECalignment:
>>>> Using AFPChainXMLParser.fromXML and then AFPChainXMLConverter.toXML doesn't
>>>> reproduce the same XML as far as the line-ending delimiter is concerned.
>>>> The first implementation validated XML equality using
>>>> assertEquals(String,String), which does not tolerate a difference in even
>>>> a single character.
>>>>
>>>> I agree with Dr. Andreas that we should end the lines using
>>>> printStream.println(), leaving the choice of line delimiter to the
>>>> system.
>>>> The drawback of this approach is that we can't know on which OS the XML
>>>> was produced and on which OS it will be consumed, so we can't be sure
>>>> whether the delimiter is \n or \r\n.
>>>> So, we need a utility function that asserts equality of XML (if there is
>>>> not one already present in the assertXXX() suite).
>>>>
>>>> So, as a first solution, I made a utility method that compares Strings
>>>> line by line, ignoring the line-ending delimiter. I used the standard
>>>> Java class java.util.Scanner because it tolerates every type of line
>>>> delimiter.
>>>> Here is a line of the Scanner source code:
>>>> private static final String LINE_SEPARATOR_PATTERN =
>>>> "\r\n|[\n\r\u2028\u2029\u0085]";
>>>>
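
The line-by-line comparison described above could look roughly like the following. This is only a sketch under assumptions: the class and method names are hypothetical and this is not the code committed in the SVN revision mentioned below.

```java
import java.util.Scanner;

/** Hypothetical helper; name and signature are illustrative only. */
final class LineEndingAssert {

    private LineEndingAssert() {}

    /**
     * Returns true when both strings contain the same lines in the same
     * order, regardless of which line terminator separates them.
     */
    static boolean equalsIgnoringLineEndings(String expected, String actual) {
        try (Scanner a = new Scanner(expected); Scanner b = new Scanner(actual)) {
            while (a.hasNextLine() && b.hasNextLine()) {
                // nextLine() strips the terminator, so only content is compared.
                if (!a.nextLine().equals(b.nextLine())) {
                    return false;
                }
            }
            // Equal only if both inputs ran out of lines at the same time.
            return !a.hasNextLine() && !b.hasNextLine();
        }
    }
}
```

A test could then call assertTrue(LineEndingAssert.equalsIgnoringLineEndings(expectedXml, actualXml)) in place of assertEquals. Note that with this approach a single trailing line terminator is also ignored, which may or may not be what a given test wants.
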
>>>>
>>>> Please inspect SVN revision 9232 to get the full picture. I welcome all
>>>> comments.
>>>>
>>>>
>>>> Amr
>>>> I hope this mail is delivered to the group this time :(
>>>>
>>>> ------------------
>>>>
>>>> --------------------------------------------------
>>>> From: "Andreas Prlic" <andreas at sdsc.edu>
>>>> Sent: Thursday, September 15, 2011 8:57 PM
>>>> To: "Scooter Willis" <HWillis at scripps.edu>
>>>> Cc: "Amr AL-Hossary" <amr_alhossary at hotmail.com>;
>>>> <biojava-dev at lists.open-bio.org>
>>>> Subject: Re: [Biojava-dev] newLine is not consistent across platforms
>>>>
>>>>> ok, that sounds like a bug in the genome browser that shall not be
>>>>> named. I still think the correct default behaviour for an API is to
>>>>> use the system property. The GFF3 export method could work around
>>>>> this by accepting a flag "use unix style newline". If people
>>>>> think this is a problem in more places, we can provide a central
>>>>> utility method which would allow switching the newline across the
>>>>> API.
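
A minimal sketch of the flag idea above, assuming a hypothetical writer class and a boolean parameter named useUnixNewline. Neither is the actual BioJava GFF3 export API; the records are plain strings purely for illustration.

```java
import java.util.List;

/** Hypothetical sketch; not the BioJava GFF3 writer. */
final class RecordWriterSketch {

    private RecordWriterSketch() {}

    /**
     * Joins records into one string, terminating each record with either
     * a Unix line feed (when useUnixNewline is true) or the platform
     * line separator (the default behaviour argued for above).
     */
    static String write(List<String> records, boolean useUnixNewline) {
        String newline = useUnixNewline ? "\n" : System.getProperty("line.separator");
        StringBuilder out = new StringBuilder();
        for (String record : records) {
            out.append(record).append(newline);
        }
        return out.toString();
    }
}
```

Callers that feed a consumer requiring \n would pass true; everyone else gets the system default without having to think about it.
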
>>>>>
>>>>> A
>>>>>
>>>>> On Thu, Sep 15, 2011 at 10:59 AM, Scooter Willis <HWillis at scripps.edu>
>>>>> wrote:
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>> You can't win that way either. As an example, I think the GFF3 file
>>>>>> format, when used with an unmentioned open source genome browser, will
>>>>>> only work if \n is the line terminator.
>>>>>>
>>>>>> Scooter
>>>>>>
>>>>>> On 9/15/11 1:20 PM, "Andreas Prlic" <andreas at sdsc.edu> wrote:
>>>>>>
>>>>>>> Hi Amr,
>>>>>>>
>>>>>>> yea, the newline should never be hard-coded as \n or \r\n but requested
>>>>>>> from System.getProperty("line.separator"). Did we have many instances
>>>>>>> of this? I thought we were pretty consistent in avoiding this. I am
>>>>>>> not sure if we need a central place for this. Perhaps all we need is
>>>>>>> to remind all developers to avoid hard-coding this and to use the
>>>>>>> System property instead.
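
To illustrate the point about the system property, here is a small sketch (class and method names are hypothetical) showing two ways of producing platform-appropriate line endings: appending the value of the line.separator property explicitly, or letting println() do it.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical demo class; names are illustrative only. */
final class PlatformNewlineDemo {

    /** The platform line separator, looked up once instead of hard-coding it. */
    static final String NEWLINE = System.getProperty("line.separator");

    private PlatformNewlineDemo() {}

    /** Terminates every line with the explicit system property value. */
    static String joinLines(String... lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append(NEWLINE);
        }
        return sb.toString();
    }

    /** Lets println() pick the platform line separator. */
    static String printLines(String... lines) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        for (String line : lines) {
            out.println(line);
        }
        out.flush();
        return buffer.toString();
    }
}
```

Both methods should produce the same text on a given platform, as long as the line.separator property has not been changed at runtime.
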
>>>>>>>
>>>>>>> Andreas
>>>>>>>
>>>>>>>
>>>>>>> 2011/9/15 Amr AL-Hossary <amr_alhossary at hotmail.com>:
>>>>>>>>
>>>>>>>> Here is another assertion exception:
>>>>>>>> The end-of-line delimiter is different across platforms.
>>>>>>>> So, I created a new helper class for common String assertion and
>>>>>>>> manipulation tasks.
>>>>>>>>
>>>>>>>> Please feel free to use it in all common String manipulation tasks.
>>>>>>>>
>>>>>>>> Well, my question is: where could it be put (a common place) to be
>>>>>>>> used by all test classes?
>>>>>>>>
>>>>>>>> Amr
>>>>>>>> ------------
>>>>>>>> This mail is sent for the 5th time
>>>>>>>>
>>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> biojava-dev mailing list
>>>>>>> biojava-dev at lists.open-bio.org
>>>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>


