[Biojava-l] seeking comments on proposed changes

Scott Markel smarkel@netgenics.com
Wed, 29 Nov 2000 18:29:21 -0800


We'd like to propose some changes and would like to get the group's
feedback.

  * Location.empty.equals(Location.empty) evaluates to false.  The
    problem is that EmptyLocation returns Integer.MIN_VALUE from the
    getMax() method and the LocationComparator determines the distance
    between the max of two Locations using subtraction.  When
    Location.empty is compared to itself, the max values are both
    maximally negative, so subtracting does not result in 0.  We'd like
    to change EmptyLocation's equals() method.

  * FastaFormat doesn't use Java-like facilities such as reading lines
    as Strings from a BufferedReader.  We tripped over this while
    tracking down a bug regarding DOS-formatted end-of-line characters
    in a FASTA file.  We have a fix to the DOS format bug that could be
    checked in, but we're wondering if using BufferedReader's readLine()
    method might be a safer approach that avoids that kind of problem.

  * We also noticed that when FastaFormat processes a sequence file, a
    new String object is instantiated for each character in the sequence
    so that it can be parsed and added to the SymbolList.  We've seen
    a big performance hit for large sequences (100K - 10M bp).

    We'd like to do one of the following.

    - Add a method that mimics parseToken(), but takes a primitive char.
      This new method might live in either SymbolParser or a derived
      interface.  Change the implementation of TokenParser's parse()
      method to not use substring(), which causes more Strings to be
      instantiated.

    - Change FastaFormat to use the current interface but instantiate a
      String per symbol in the alphabet and reuse them rather than
      creating a String per sequence character.
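
To make the readLine() and String-reuse ideas above concrete, here is a
minimal sketch in plain Java.  It is not BioJava code: the class
FastaReadLineSketch, the methods parseChar() and readFirstSequence(),
and the symbol names are hypothetical stand-ins, and a real version
would presumably hand back Symbol objects from the alphabet rather than
Strings.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; none of these names are BioJava API.
final class FastaReadLineSketch {

    // One shared String per DNA symbol, indexed by character code, so that
    // parsing a residue is a table lookup instead of a new String allocation.
    private static final String[] SYMBOL_FOR_CHAR = new String[128];
    static {
        char[] codes = {'a', 'c', 'g', 't'};
        String[] names = {"adenine", "cytosine", "guanine", "thymine"};
        for (int i = 0; i < codes.length; i++) {
            SYMBOL_FOR_CHAR[codes[i]] = names[i];
            SYMBOL_FOR_CHAR[Character.toUpperCase(codes[i])] = names[i];
        }
    }

    // Char-based counterpart to a String-based token parser (assumed name).
    static String parseChar(char c) {
        String symbol = c < SYMBOL_FOR_CHAR.length ? SYMBOL_FOR_CHAR[c] : null;
        if (symbol == null) {
            throw new IllegalArgumentException("Unknown symbol character: " + c);
        }
        return symbol;
    }

    // Reads the residues of the first FASTA record. readLine() strips "\n",
    // "\r", and "\r\n" alike, so DOS line endings need no special handling.
    static List<String> readFirstSequence(Reader in) {
        List<String> symbols = new ArrayList<>();
        try {
            BufferedReader reader = new BufferedReader(in);
            boolean seenHeader = false;
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(">")) {
                    if (seenHeader) {
                        break;              // start of the next record
                    }
                    seenHeader = true;      // description line of the first record
                    continue;
                }
                if (!seenHeader) {
                    continue;               // ignore anything before the first header
                }
                for (int i = 0; i < line.length(); i++) {
                    char c = line.charAt(i);
                    if (!Character.isWhitespace(c)) {
                        symbols.add(parseChar(c));
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return symbols;
    }
}
```

With this approach a file using "\r\n" line endings and one using "\n"
yield the same list, and every occurrence of a given residue refers to
the same shared String instance.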

Comments?

Scott

-- 
Scott Markel, Ph.D.       NetGenics, Inc.
smarkel@netgenics.com     4350 Executive Drive
Tel: 858 455 5223         Suite 260
FAX: 858 455 1388         San Diego, CA  92121