[Biojava-dev] Proposed change to RichFormat interface
mark.schreiber at novartis.com
Wed Jun 7 06:02:51 UTC 2006
That might be a more elegant solution.
We could even make the InputStream implement RichSeqIOListener, so that it
would both send data to the RichFormat and listen to what the RichFormat
makes of that data.
The InputStreamIOListener could note when the RichFormat emits a
startXXX() event, record the line number, and start buffering all the data
sent as the readLine() requests are made (while also passing it on to the
RichFormat). When the RichFormat emits the corresponding endXXX() event,
the buffer can be cleared and the process starts again.
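
A minimal sketch of the buffering idea above, using only java.io so that it
does not depend on the real RichFormat or RichSeqIOListener signatures. The
names BufferingLineReader, onStart(), onEnd(), startLineNumber() and
bufferedLines() are hypothetical; in a real listener the onStart()/onEnd()
calls would be driven by the startXXX()/endXXX() events. Keeping the line
that triggered the start event in the buffer is an assumption, not something
settled in this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BioJava: tracks the current line and
// buffers every line read between a start event and the matching end event.
class BufferingLineReader extends BufferedReader {
    private final List<String> buffer = new ArrayList<>();
    private String currentLine;        // last line handed to the parser
    private int currentLineNumber;     // 1-based; 0 before the first read
    private int startLineNumber = -1;  // line number at the last start event
    private boolean buffering;

    BufferingLineReader(Reader in) {
        super(in);
    }

    @Override
    public String readLine() throws IOException {
        String line = super.readLine();
        if (line != null) {
            currentLine = line;
            currentLineNumber++;
            if (buffering) {
                buffer.add(line);
            }
        }
        return line;
    }

    String currentLine() { return currentLine; }

    int currentLineNumber() { return currentLineNumber; }

    int startLineNumber() { return startLineNumber; }

    // Call when the format emits a start event: remember where we are and
    // begin buffering. Assumption: the triggering line is kept as well.
    void onStart() {
        buffer.clear();
        startLineNumber = currentLineNumber;
        if (currentLine != null) {
            buffer.add(currentLine);
        }
        buffering = true;
    }

    // Call when the format emits the matching end event: drop the buffer.
    void onEnd() {
        buffering = false;
        buffer.clear();
    }

    // Defensive copy of the lines read since the last start event.
    List<String> bufferedLines() { return new ArrayList<>(buffer); }
}

class LineTrackingDemo {
    public static void main(String[] args) throws IOException {
        String data = "LOCUS A\nDEFINITION x\n//\n";
        BufferingLineReader in = new BufferingLineReader(new StringReader(data));
        String line;
        while ((line = in.readLine()) != null) {
            if (line.startsWith("LOCUS")) {
                in.onStart();   // stands in for a startXXX() event
            } else if (line.equals("//")) {
                in.onEnd();     // stands in for the matching endXXX() event
            } else {
                System.out.println(in.currentLineNumber() + ": " + in.currentLine()
                        + " (buffered lines: " + in.bufferedLines().size() + ")");
            }
        }
    }
}
```

If parsing fails between the two events, a logging listener could dump
startLineNumber() and bufferedLines() into a bug report. Lines read while no
start event is pending are not buffered in this sketch.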
The only problem might be what to do when the RichFormat consumes data in
between emitting events (which is allowed).
- Mark
Michael Heuer <heuermh at acm.org>
Sent by: Michael Heuer <heuermh at shell3.shore.net>
06/07/2006 01:51 PM
To: mark.schreiber at novartis.com
cc: biojava-dev at biojava.org
Subject: Re: [Biojava-dev] Proposed change to RichFormat interface
Mark Schreiber wrote:
> Hi all -
>
> I would like to propose a change to the RichFormat interface. I think we
> should do this now as we haven't done a stable biojavax roll out yet so
> interface changes should still be allowed. The additional methods would be:
>
> public String currentLine();
> public int currentLineNumber();
>
> This would make debugging a lot easier, it would also make construction of
> a RichSeqIOListener that logs and debugs much easier. I was trying to do
> this a while back. I started a background process that parsed 6GB of
> genbank records looking for records that failed. It worked ok but would be
> much better with the ability to query the RichFormat in the above way. We
> might even be able to make it a utility that people could run on suspect
> files and generate standard bug reports to make it easier for us to debug
> the parser code.
>
> What do people think??
Another possibility would be to leave this sort of progress tracking up
to the client, in that they could wrap the InputStream in something like
a CountingInputStream before passing it to the parser(s):
http://jakarta.apache.org/commons/io/api-release/org/apache/commons/io/input/CountingInputStream.html
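
A rough sketch of the client-side wrapping, written against plain java.io so
that it stands alone. ByteCountingInputStream and getByteCount() are
hypothetical names for a hand-rolled stand-in, not the Commons IO class
linked above, and no BioJava parser call is shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a counting stream: counts the bytes that pass
// through it. mark()/reset() are not accounted for in this sketch.
class ByteCountingInputStream extends FilterInputStream {
    private long count;

    ByteCountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        count += skipped;
        return skipped;
    }

    // Number of bytes read or skipped so far.
    long getByteCount() {
        return count;
    }
}

class CountingDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = "LOCUS A\n//\n".getBytes(StandardCharsets.US_ASCII);
        ByteCountingInputStream in =
                new ByteCountingInputStream(new ByteArrayInputStream(data));
        // The client would hand 'in' to the parser here; this loop just
        // drains the stream in its place.
        while (in.read() != -1) {
            // discard
        }
        System.out.println("bytes consumed: " + in.getByteCount());
    }
}
```

One caveat with this approach: if the parser wraps the stream in a
BufferedReader, the reader reads ahead, so the count reflects bytes pulled
into the buffer rather than the position of the line being parsed.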
michael