[Biojava-dev] Sequence interface - exceptions

jake at researchtogether.com jake at researchtogether.com
Tue Jul 20 09:46:39 UTC 2010


See comments in line.

Thanks,
Jake

On Tue, Jul 20, 2010 at 10:20:10AM +0800, Mark Schreiber wrote:
> I don't think it is a great idea to hide IO exceptions but you can
> reduce the burden of them.

I would normally agree with you but, as I point out below, this will have a lot of knock-on effects for the interface, which may not be desirable.

> 
> You can copy the Groovy model which handles a lot of the
> try/catch/finally boiler plate code for you. Basically you make a
> helper class with methods to perform common IO operations and which
> will do its very best to connect, read/write and clean up.
> 
> You can also think about what might actually cause an error. If you
> are reading from a local disk cache where the file address is known
> (such as a temp file) you can very nearly guarantee that the IO
> operation will succeed. So much so that you could rethrow an IO
> Exception as an error because there is very little that can be done
> about it (other than improving the cache code or getting more reliable
> hard-drives).

And this is the issue: the Sequence interface is used by a lot of different readers. Some read from disk, others from a database, and in my particular case I am reading from a URL. I may also run into a lot of exceptions around XML parsing (of the data from the URL) as well as HTTP errors (page not found, service unavailable, etc.).

Normally I would want to handle some of these errors myself and only log them. For example, on a 503 I might retry a few times, and if the XML is malformed I might try fetching it again.

However, I don't fully understand how the caller will expect these SequenceReaders to behave, which is why I asked the question :) An IOException on a file is probably fatal, but an IOException on a network call is possibly recoverable, or at least worth retrying.
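
To make the "worth retrying" case concrete, here is a rough sketch of a retry wrapper for a network call. IoCall and Retry are hypothetical names for illustration only, not existing BioJava classes, and the fixed delay between attempts is an assumption; a real reader might want back-off or a check on the HTTP status first.

```java
import java.io.IOException;

/** Hypothetical callback type for one network attempt; not a BioJava interface. */
interface IoCall<T> {
    T run() throws IOException;
}

/** Hypothetical helper: retries a call whose IOException may be transient. */
final class Retry {
    private Retry() {}

    static <T> T withRetries(IoCall<T> call, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.run();
            } catch (IOException e) {
                last = e; // remember the failure; it may be transient (e.g. a 503)
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // fixed pause before the next attempt
                }
            }
        }
        throw last; // every attempt failed, so surface the last error to the caller
    }
}
```

The last IOException still reaches the caller once the attempts run out, so nothing is hidden; the reader only absorbs the failures it managed to recover from.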

As for what can cause errors:
1. Invalid URL
2. Page(s) unavailable (4xx, 5xx)
3. Invalid/unexpected data returned (XML badly formed, FASTA invalid)
4. Change to service (if the service has changed and the parser is effectively broken)
5. Network interruption (e.g. a network timeout)
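
One way the five causes above could be reported to a caller is a single exception type that carries a category. This is only a sketch: FailureKind and SequenceReadException are hypothetical names (not the proposed SequenceException and not anything in BioJava), and which categories count as retryable is my assumption based on the 503 and bad-XML cases mentioned earlier.

```java
import java.io.IOException;

/** Hypothetical categories mirroring the five causes listed above. */
enum FailureKind {
    INVALID_URL,
    HTTP_STATUS,
    INVALID_DATA,
    SERVICE_CHANGED,
    NETWORK_TIMEOUT
}

/** Hypothetical exception type; extends IOException so existing handlers still catch it. */
class SequenceReadException extends IOException {
    private static final long serialVersionUID = 1L;

    private final FailureKind kind;
    private final int httpStatus; // -1 when no HTTP status applies

    SequenceReadException(FailureKind kind, int httpStatus, String message, Throwable cause) {
        super(message, cause);
        this.kind = kind;
        this.httpStatus = httpStatus;
    }

    FailureKind getKind() {
        return kind;
    }

    /** Assumption: timeouts, bad payloads and 5xx responses are worth another attempt. */
    boolean isRetryable() {
        switch (kind) {
            case NETWORK_TIMEOUT:
            case INVALID_DATA:
                return true;
            case HTTP_STATUS:
                return httpStatus >= 500;
            default:
                return false;
        }
    }
}
```

A caller that does not care about the distinction can still just catch IOException, while one that does can check getKind() or isRetryable().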

> 
> Reading a file from disk? The most likely problem is an incorrect file
> name. Other problems can probably be turned into runtime exceptions
> cause other problems are probably disk errors.
> 
> Reading from a URL, lots of things can go wrong here so you probably
> need to expose all the possible exceptions.

I will work on this assumption and change the interface accordingly, though I expect that the decision will be re-visited.
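
As a rough illustration of what that change might look like, the sketch below declares the checked exception on a reader method and lets a caller that trusts its source (such as a local temp-file cache) convert it to an unchecked one. SequenceSource and Sources are hypothetical names for illustration; this is not the actual Sequence interface, and java.io.UncheckedIOException assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical reader contract; NOT the real BioJava Sequence interface. */
interface SequenceSource {
    String readSequence() throws IOException; // remote readers expose their failures
}

/** Hypothetical helper for callers whose source is assumed to be reliable. */
final class Sources {
    private Sources() {}

    /** Reads from a trusted source, turning a checked failure into an unchecked one. */
    static String readOrFail(SequenceSource trusted) {
        try {
            return trusted.readSequence();
        } catch (IOException e) {
            // Little the caller can do about a failed local read, so do not force a catch.
            throw new UncheckedIOException("read from trusted source failed", e);
        }
    }
}
```

URL-backed callers would call readSequence() directly and handle the IOException, while disk-cache callers could go through readOrFail and avoid the try/catch.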

> 
> Reading from SQL? Kind of depends on the expected DB availability and
> latency. Also, if the query code (or JPA query) is coming from the
> BioJava source then an error is appropriate (the developer can't do
> much about the mistake). If the code is coming from the app developer
> then you should notify them of SQL errors.
> 
> - Mark
> 
> On Mon, Jul 19, 2010 at 11:02 PM, Richard Holland
> <holland at eaglegenomics.com> wrote:
> >
> > I often wonder what the best way of handling multiple possible internal exceptions is - particularly in cases like this when you've got HTTP and IO and many other types of exceptions which could be thrown.
> >
> > SequenceException maybe if there's something wrong with the sequence itself - but possibly otherwise a form of IOException may be more appropriate? Trouble is that then almost every BioJava3 method would throw it, as all of them potentially have IO exposure.
> >
> > I don't know. There must be experts on this in the list who can help!
> >
> > cheers,
> > Richard
> >
> >
> > On 19 Jul 2010, at 14:49, jake at researchtogether.com wrote:
> >
> > > Hi All,
> > >
> > > I've been drawing up a design for the work I have done on the NCBI SequenceReader and I've talked through some things with Scooter which I have put on the wiki at: http://www.biojava.org/wiki/BioJava3_NCBISequenceReader_Design#Design_Overview
> > >
> > > One thing I would like to throw open for discussion is the possibility of changing the Sequence interface so that the methods can throw a new exception - SequenceException.
> > >
> > > Any opinions? :)
> > >
> > > Cheers,
> > > Jake
> > > _______________________________________________
> > > biojava-dev mailing list
> > > biojava-dev at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/biojava-dev
> >
> > --
> > Richard Holland, BSc MBCS
> > Operations and Delivery Director, Eagle Genomics Ltd
> > T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
> > http://www.eaglegenomics.com/
> >
> >


