[Biojava-l] problem with AnnotationBuilder?

Francois Pepin fpepin at cs.mcgill.ca
Wed Dec 3 22:55:52 EST 2003


Hi everyone,

I think there may be a problem with AnnotationBuilder (I am parsing
KEGG data, for the curious).

With the following parser:
      LineSplitParser tvp = new LineSplitParser();
      tvp.setEndOfRecord("///");
      tvp.setSplitOffset(12);
      tvp.setContinueOnEmptyTag(true);
      tvp.setTrimTag(true);
      tvp.setTrimValue(false);
      tvp.setMergeSameTag(true);

The simplest possible AnnotationBuilder:
      AnnotationBuilder tvl = new AnnotationBuilder(AnnotationType.ANY);

and the following text (among others):
NAME        aldehyde dehydrogenase (NAD)
            CoA-independent aldehyde dehydrogenase
            m-methylbenzaldehyde dehydrogenase
            NAD-aldehyde dehydrogenase
            NAD-dependent 4-hydroxynonenal dehydrogenase
            NAD-dependent aldehyde dehydrogenase
            NAD-linked aldehyde dehydrogenase
            propionaldehyde dehydrogenase

I end up with the following value when printing the Annotation:
NAME=propionaldehyde dehydrogenase.

Echo() shows that everything is being read properly:
1  NAME {
2    aldehyde dehydrogenase (NAD)
2    CoA-independent aldehyde dehydrogenase
2    m-methylbenzaldehyde dehydrogenase
2    NAD-aldehyde dehydrogenase
2    NAD-dependent 4-hydroxynonenal dehydrogenase
2    NAD-dependent aldehyde dehydrogenase
2    NAD-linked aldehyde dehydrogenase
2    propionaldehyde dehydrogenase
1  }
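
For readers unfamiliar with this record layout, here is a minimal sketch of the kind of fixed-offset splitting that the settings above describe: the first 12 columns hold the tag, the rest is the value, and a blank tag continues the previous one. It uses only java.util and is not BioJava's LineSplitParser; the class name FixedOffsetSplit and the method parse are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not BioJava's LineSplitParser.
class FixedOffsetSplit {

    // Splits each line at a fixed column. Text before the offset is the tag
    // (trimmed); text from the offset onward is the value (left untrimmed).
    // A line whose tag is blank is treated as a continuation of the last tag.
    static Map<String, List<String>> parse(List<String> lines, int offset) {
        Map<String, List<String>> record = new LinkedHashMap<>();
        String currentTag = null;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String tag = line.length() >= offset
                    ? line.substring(0, offset).trim()
                    : line.trim();
            String value = line.length() > offset ? line.substring(offset) : "";
            if (!tag.isEmpty()) {
                currentTag = tag;
            }
            if (currentTag == null) {
                continue; // continuation line before any tag was seen
            }
            record.computeIfAbsent(currentTag, k -> new ArrayList<>()).add(value);
        }
        return record;
    }

    public static void main(String[] args) {
        // "%-12s" pads the tag column to exactly 12 characters.
        List<String> lines = Arrays.asList(
                String.format("%-12s%s", "NAME", "aldehyde dehydrogenase (NAD)"),
                String.format("%-12s%s", "", "CoA-independent aldehyde dehydrogenase"),
                String.format("%-12s%s", "", "propionaldehyde dehydrogenase"));
        System.out.println(parse(lines, 12));
    }
}
```

In this sketch all three values end up grouped under the single NAME tag, which matches the nesting that Echo() reports.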

To check whether the values really are being overwritten, I added the
following code (the lines marked *) to AnnotationBuilder, from line 133:

  public void value(TagValueContext ctxt, Object value) {
    try {
      Frame top = peek(annotationStack);

*      if (top.annotation.containsProperty(top.tag))
*          System.out.println("replacing " +
*              top.annotation.getProperty(top.tag) + " by " + value);

      top.type.setProperty(top.annotation, top.tag, value);
    } catch (ChangeVetoException cve) {
      throw new AssertionFailure(cve);
    }
  }

This gives us the very interesting output:
replacing aldehyde dehydrogenase (NAD) by CoA-independent aldehyde dehydrogenase
replacing CoA-independent aldehyde dehydrogenase by m-methylbenzaldehyde dehydrogenase
replacing m-methylbenzaldehyde dehydrogenase by NAD-aldehyde dehydrogenase
replacing NAD-aldehyde dehydrogenase by NAD-dependent 4-hydroxynonenal dehydrogenase
replacing NAD-dependent 4-hydroxynonenal dehydrogenase by NAD-dependent aldehyde dehydrogenase
replacing NAD-dependent aldehyde dehydrogenase by NAD-linked aldehyde dehydrogenase
replacing NAD-linked aldehyde dehydrogenase by propionaldehyde dehydrogenase

Basically, each value overwrites the previous one, so only the last
one remains at the end.
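
The difference between the observed behaviour and the expected one can be sketched with plain java.util collections. This is only an illustration under the assumption that the property store behaves like a map with replace-on-set semantics; it does not reproduce BioJava's Annotation or AnnotationType internals, and the names OverwriteVsAccumulate, setEach and addEach are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not BioJava's Annotation implementation.
class OverwriteVsAccumulate {

    // Replace-on-set: every call to put() discards the previous value,
    // so only the last value for the tag survives.
    static Object setEach(String tag, List<String> values) {
        Map<String, Object> props = new HashMap<>();
        for (String v : values) {
            props.put(tag, v);
        }
        return props.get(tag);
    }

    // Accumulate: values for the same tag are collected into a list,
    // so every line of the record is kept.
    static List<String> addEach(String tag, List<String> values) {
        Map<String, List<String>> props = new HashMap<>();
        for (String v : values) {
            props.computeIfAbsent(tag, k -> new ArrayList<>()).add(v);
        }
        return props.getOrDefault(tag, new ArrayList<>());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "aldehyde dehydrogenase (NAD)",
                "CoA-independent aldehyde dehydrogenase",
                "propionaldehyde dehydrogenase");
        System.out.println("set: " + setEach("NAME", names));
        System.out.println("add: " + addEach("NAME", names));
    }
}
```

With replace-on-set semantics only the final name is left, as in the printed Annotation; collecting into a list keeps all of them, which is what one would want for a multi-valued tag such as NAME.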

Any ideas how this could be fixed, or did I do something stupid
somewhere?

Thanks,

Francois



