[Biopython-dev] WIT and KEGG

Cayte katel at worldpath.net
Sun Aug 12 01:52:22 EDT 2001


----- Original Message -----
From: "Tarjei S Mikkelsen" <tarjei at genome.wi.mit.edu>
>  I'm not too fond of adding this to the format file. HTML markup isn't
> part of the KEGG format description, so this seems a bit ad hoc.
>
>  Instead I suggest that you either run the input through
> File.SGMLHandle or File.SGMLStripper before you pass the
> WIT record to KEGG.Enzyme.Parser OR write a separate Parser
> class in your WIT module that wraps a ParserSupport.SGMLStrippingConsumer
> around KEGG.Enzyme._Consumer.
>
  The problem is that I'm experimenting with a filter to strip out junk (not
necessarily HTML) between records.
The motivation is that I've had Martel fail on just an extraneous line feed.
Somehow the idea of chaining two filters together sets off a watch-for-bugs
alarm in my mind.
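
To make the idea concrete, here is a rough sketch of the kind of filter I mean.
It is not Biopython code: the function name is made up for illustration, and it
assumes each record starts with a line beginning with ENTRY and ends with a
line containing only "///". Everything outside those boundaries is dropped,
whether it is HTML or just a stray blank line.

```python
def strip_between_records(lines, start="ENTRY", end="///"):
    """Yield only lines inside records; drop whatever sits between them.

    Hypothetical helper. Assumes a record opens with a line starting with
    `start` and closes with a line that is exactly `end` once stripped.
    """
    in_record = False
    for line in lines:
        if not in_record:
            if line.startswith(start):
                in_record = True
                yield line
            # Any other line here is junk between records: skip it.
        else:
            yield line
            if line.strip() == end:
                in_record = False


sample = [
    "\n",
    "<hr>\n",
    "ENTRY       EC 1.1.1.1\n",
    "NAME        Alcohol dehydrogenase\n",
    "///\n",
    "\n",
    "stray text\n",
    "ENTRY       EC 1.1.1.2\n",
    "///\n",
]
cleaned = list(strip_between_records(sample))
```

Lines inside a record pass through untouched, so a single pass like this could
sit in front of the parser without a second filter behind it.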

> >   The format failed halfway through the file.  I think the problem is the
> > order of entries.  The format specifies GENES before MOTIF, but
> > this order is
> > reversed in the test file.  Maybe the format should be less sensitive to
> > order where it doesn't convey information.
>
>  Yeah, the entries are supposed to come in a specified order, but even
> the KEGG people don't follow that rule. I've committed a change to
> KEGG.Enzyme.enzyme_format.py that assumes very little about entry
> ordering. If that's the error, it should work for you now.
>

Now it's stopping on files with db links like this example:

            PIR: B49338  B49935  E64239  KIECAA

These are quibbles, but the computer doesn't understand quibbles :).
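
For what it's worth, a tolerant reading of that line could look like the sketch
below. The function name is hypothetical and this is plain Python, not the
Martel format or anything in KEGG.Enzyme; it only assumes a database name,
a colon, and then any number of whitespace-separated identifiers.

```python
def parse_dblink(line):
    """Split a db link line into (database, [identifiers]).

    Hypothetical helper: assumes 'NAME: id id id ...' with arbitrary
    whitespace around the name and between the identifiers.
    """
    db, _, rest = line.strip().partition(":")
    return db.strip(), rest.split()


db, ids = parse_dblink("            PIR: B49338  B49935  E64239  KIECAA")
```

Splitting on whitespace rather than expecting a fixed count means one
identifier or four are handled the same way.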

                                                                 Cayte
>  Tarjei
>
>



