[Biojava-l] Questions

Thomas Down td2@sanger.ac.uk
Tue, 16 Oct 2001 15:44:21 +0100


On Tue, Oct 16, 2001 at 10:29:28AM -0400, Ron Kuhn wrote:
> Is there a forum related to the BIOJAVA software??? That is really where I
> should be posting this email.

Yes, this mailing list is where most discussion happens.  Please
post any questions/bug reports/suggestions/patches here.

> I am trying to use your software to parse Genbank and SwissProt flat files.
> I made 2 fixes (1 for each) and all seems to work fine. I am successfully
> parsing the files and interpreting sequences. My only problem is that the
> information about the references to the sequences should be hierarchical (in
> SwissProt - the RP, RX, RA, RT and RL fields relate to a specific RN) but
> the object used to hold the reference information (Annotation) is NOT (XML
> is though). If multiple RNs exist and 0 or more RTs exist per RN, then how
> are we supposed to relate the RTs to the correct RN???

BioJava's currently not very good at handling the metadata associated
with sequence files -- any improvements would be welcome.

When the Annotation objects were originally designed, there
was a general suggestion that they should be used as hierarchies
to model this kind of thing.  So you could have a structure like:


   Top-level annotation
          |
          |      references
          |----------------------> ArrayList
                                      |------->Reference annotation
                                      |           |
                                      |           +----> journal
                                      |           +----> author
                                      |           +----> title
                                      |
                                      +-------> Another reference
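
As a rough sketch of the structure in the diagram, here is the same
nesting written with plain java.util collections. These maps are only
stand-ins for Annotation objects, not the BioJava API itself; the class
name, the helper methods, the property keys and the sample values are
all made up for illustration. The point is that each RN gets its own
nested annotation, so an RT (if present) stays attached to the right
reference.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for BioJava Annotation objects.
class ReferenceAnnotationSketch {

    // Build one nested "reference annotation"; keys mirror the diagram.
    static Map<String, Object> reference(String journal, String author, String title) {
        Map<String, Object> ref = new LinkedHashMap<>();
        ref.put("journal", journal);
        ref.put("author", author);
        if (title != null) {
            // An RN may have no RT line, so the title is optional.
            ref.put("title", title);
        }
        return ref;
    }

    // Top-level annotation holding an ordered list of references,
    // one entry per RN, in file order. Sample values are placeholders.
    static Map<String, Object> buildTopLevel() {
        List<Map<String, Object>> references = new ArrayList<>();
        references.add(reference("Journal A", "Author A", "Title A"));
        references.add(reference("Journal B", "Author B", null));

        Map<String, Object> top = new LinkedHashMap<>();
        top.put("references", references);
        return top;
    }

    // Look up the title belonging to a given RN (1-based, as in SwissProt).
    // Returns null when that reference has no title.
    @SuppressWarnings("unchecked")
    static String titleOf(Map<String, Object> top, int rn) {
        List<Map<String, Object>> refs =
            (List<Map<String, Object>>) top.get("references");
        return (String) refs.get(rn - 1).get("title");
    }

    public static void main(String[] args) {
        Map<String, Object> top = buildTopLevel();
        System.out.println("RN 1 title: " + titleOf(top, 1));
        System.out.println("RN 2 title: " + titleOf(top, 2));
    }
}
```

Because the list keeps the references in order and each one carries its
own properties, a parser would only need to start a new nested entry
whenever it sees an RN line and attach the following RP/RX/RA/RT/RL
values to that entry.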

Actually, it might be even better to have a dedicated object
model for bibliographic references, but nobody has been particularly
enthusiastic about working on this.


     Thomas.