[Biojava-l] Bad PDB files and batch processing with PDBFileReader

Daniel Asarnow dasarnow at gmail.com
Wed Oct 27 04:59:56 UTC 2010


Glad to hear it; who doesn't like support or clean interfaces?  No
offense intended, by the way, with respect to PDB errors - obviously
the PDB is an indispensable resource for all protein scientists.

I am looking at many (fixed-length) pieces of protein chains and doin'
stuff with 'em.  My current code has a pair of nested while loops; the
outer iterates over PDB entries (locally rsync'd copy), parsing them,
and the inner iterates over the pieces from each.  When
StructureExceptions come out of my PDBFileReader object I want to
continue the outer loop, moving on to the next set of files without
executing any of the code that depends on correct StructureImpl
objects from the reader (database updates, the inner loop).
Since the reader's methods have their own try-catch blocks, a thrown
StructureException is stopped there and never reaches my own error
handling.  I just need to know when those errors occur so I can skip
those proteins - I am presuming that the correct entries will outweigh
the problem ones by a significant factor and the overall data won't be
seriously impacted.
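
To make the control flow concrete, here is a minimal, self-contained
sketch of the loop structure described above.  None of the types below
are BioJava classes: QuietParser, ParsedEntry, and BadEntryException are
hypothetical stand-ins, and the assumption that a swallowed error shows
up as a null or empty result is mine, not documented library behaviour.
The idea is only that a thin wrapper turns a "quiet" failure back into
an exception, so the outer loop can skip the entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchSkipSketch {

    /** Hypothetical stand-in for a parsed entry; not a BioJava type. */
    public static final class ParsedEntry {
        final String id;
        final List<String> pieces;

        public ParsedEntry(String id, List<String> pieces) {
            this.id = id;
            this.pieces = pieces;
        }
    }

    /** Hypothetical exception signalling that an entry should be skipped. */
    public static class BadEntryException extends Exception {
        public BadEntryException(String message) {
            super(message);
        }
    }

    /** Hypothetical parser that swallows its own errors (assumed: returns null on failure). */
    public interface QuietParser {
        ParsedEntry parse(String path);
    }

    /** Re-raises a quiet failure so that the caller can react to it. */
    public static ParsedEntry parseOrThrow(QuietParser parser, String path)
            throws BadEntryException {
        ParsedEntry entry = parser.parse(path);
        if (entry == null || entry.pieces.isEmpty()) {
            throw new BadEntryException("unusable entry: " + path);
        }
        return entry;
    }

    /** Outer loop over entries, inner loop over pieces; bad entries are recorded and skipped. */
    public static List<String> processAll(QuietParser parser, List<String> paths,
                                          List<String> skipped) {
        List<String> processed = new ArrayList<>();
        for (String path : paths) {
            ParsedEntry entry;
            try {
                entry = parseOrThrow(parser, path);
            } catch (BadEntryException e) {
                skipped.add(path);
                continue; // no database update, no inner loop for this entry
            }
            for (String piece : entry.pieces) {
                processed.add(entry.id + ":" + piece);
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Toy parser: any path starting with "bad" fails quietly.
        QuietParser parser = path -> path.startsWith("bad")
                ? null
                : new ParsedEntry(path, Arrays.asList("p1", "p2"));
        List<String> skipped = new ArrayList<>();
        List<String> done = processAll(parser, Arrays.asList("1abc", "bad1", "2xyz"), skipped);
        System.out.println("processed: " + done);
        System.out.println("skipped:   " + skipped);
    }
}
```

With the real reader, the check inside parseOrThrow would have to be
replaced by whatever test actually distinguishes a broken structure from
a good one, which is exactly the part I don't know how to write yet.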

-da

On Tue, Oct 26, 2010 at 21:11, Andreas Prlic <andreas at sdsc.edu> wrote:
> Hi Daniel,
>
> can you explain a bit more what you are doing, in particular what
> errors you would like to deal with on your end?  You should not need
> to worry too much about exception handling. Are there any special
> cases you are interested in?  In this case we should support you with
> a clean interface rather than exception handling from your end...
>
> Andreas
>
>
>
> On Tue, Oct 26, 2010 at 8:54 PM, Daniel Asarnow <dasarnow at gmail.com> wrote:
>> Hi all,
>> Let me first say thanks to all the BioJava community members for
>> delivering such a useful set of libraries, and that I'm still a newbie
>> when it comes to BioJava (and Java) so forgive me if my question is
>> too trivial.
>>
>> I am doing work on lots (at least thousands) of PDB files from RCSB.
>> As is commonly known, these are often rife with errors which can lead
>> to exceptions during parsing with PDBFileParser.  Because
>> PDBFileParser's methods contain their own try-catch blocks, exception
>> propagation stops there and my code proceeds blindly along regardless
>> of any error checking I do.  I would like to catch the exceptions up
>> in my code where the parser is called, so that I can branch to a
>> continue statement and have my batch processing loops move on to the
>> next file.
>> Should I edit out the try-catch blocks and compile my own version of
>> the library?  Or should I test the returned StructureImpl objects for
>> possession of the fields in question?  In that case, I'm not sure
>> which properties will give the most general success information...and
>> I'd rather not have to check for /every/ property being correct.
>>
>> If there is some great way to check if an exception was caught down a
>> series of nested method calls, please hit me over the head with it.
>>
>> Thanks!
>>
>> -da
>> _______________________________________________
>> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>
>
>
> --
> -----------------------------------------------------------------------
> Dr. Andreas Prlic
> Senior Scientist, RCSB PDB Protein Data Bank
> University of California, San Diego
> (+1) 858.246.0526
> -----------------------------------------------------------------------
>



