[Biojava-dev] GenbankFormat (biojavax) and comments with leading whitespace

Mark Schreiber markjschreiber at gmail.com
Sat Sep 30 12:29:41 UTC 2006


I think this should be fine to commit as long as biojava can still
read in the file again (and other files).

You should probably also comment the code to say that VNTI needs this
and, to be doubly certain, put in a unit test.
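
A minimal sketch of the kind of round-trip check such a unit test could
make, in plain Java. It does not use the biojavax parser: the class and
method names below are hypothetical, and the only assumption about the
format is that the section keyword fills the first 12 columns of a line,
with the value starting at column 13. It compares a value that keeps its
leading whitespace with one that has been trimmed, and checks whether
the line written back out matches the line that was read in.

```java
public class CommentWhitespaceSketch {
    // Assumption: the keyword (e.g. COMMENT) fills columns 1-12 and the
    // section value starts at column 13.
    static final int VALUE_COLUMN = 12;

    /** Hypothetical helper: the section value with its indentation intact. */
    static String valuePreservingIndent(String line) {
        return line.length() <= VALUE_COLUMN ? "" : line.substring(VALUE_COLUMN);
    }

    /** Hypothetical helper: the same value as a trimming parser would see it. */
    static String valueTrimmed(String line) {
        return valuePreservingIndent(line).trim();
    }

    /** Hypothetical helper: write a value back out under a 12-column keyword. */
    static String writeLine(String keyword, String value) {
        return String.format("%-12s", keyword) + value;
    }

    public static void main(String[] args) {
        // Build an indented comment line rather than hard-coding the spacing.
        String original = writeLine("COMMENT", "   (CObList) (CObList) -1)");

        String kept = writeLine("COMMENT", valuePreservingIndent(original));
        String munged = writeLine("COMMENT", valueTrimmed(original));

        // The preserved value round-trips; the trimmed one loses its indent.
        System.out.println("preserved round-trips: " + kept.equals(original));
        System.out.println("trimmed round-trips:   " + munged.equals(original));
    }
}
```

A real test would read and write the record through the actual parser
and compare the COMMENT lines in the same way.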

- Mark

On 9/30/06, Bubba Puryear <bubba.puryear at gmail.com> wrote:
> Hey all,
>
>   I've been using biojava for some time now on my project for reading
> genbank flat files, but until recently I haven't been writing any.
> Our client makes extensive use of VectorNTI (version 9, I think) and I
> was doing some edits to genbank files (via biojavax) and noticed that
> comment values get their whitespace trimmed.
>
>   Turns out VNTI splats a load of state that it needs into the comment
> section in a fairly lispish looking syntax... but indentation appears
> to be important. In particular, VNTI won't read the files I've edited
> that have had their whitespace munged. I have some local changes to
> the parser that preserve leading/trailing whitespace in the values of
> top level sections.
>
>   I've run the tests locally (and added one for testing indented
> comments) and run this against ~ 3000 files I have locally. I wanted
> to get some feedback on this before I committed, though.
>
>   As an example of the kind of thing that currently gets munged:
>
> COMMENT     Vector_NTI_Display_Data_(Do_Not_Edit!)
> COMMENT     (SXF
> COMMENT      (CGexDoc "11460" 0 6359
> COMMENT       (CDBMol 0 0 1 1 1 0 0 1633772385 0 "" "" 0 0 0 0 (CObList) (CObList)
> COMMENT        (CObList) (CObList) -1)
> COMMENT       (CDocSetData 1 0 0 0 0 0 "MAIN" 1 1 1 1 0 0 1 1 0 1 10 5 40 50 0 1 0
> ....
>
>    The level of indentation can get quite deep.
>
> Thanks,
> Bubba


