[Biojava-dev] fetching obsolete/superseding files

Andreas Prlic andreas at sdsc.edu
Wed Mar 2 05:19:33 UTC 2011


Hi Amr,

(no need to address me by title)

>
> I was using a PDB files set, mentioned in an old paper, published in 1994.
> the paper is called
> Enlarged representative set of protein structures
> by
> UWE HOBOHM AND CHRIS SANDER
> European Molecular Biology Laboratory, 69012 Heidelberg, Germany
> (RECEIVED September 16, 1993; ACCEPTED December 23, 1993)
> published in
> Protein Science (1994), 3:522-524. Cambridge University Press. Printed in the
> USA.
>
> It describes a representative standard set of protein structures that
> doesn't have any redundancy.
> This set was cited by a paper that talks about Cation-pi interactions as
> their representative set; and I was revisiting the same set to use it as my
> positive control in my research.

There is significantly more data in the PDB nowadays; I am not sure how
this affects this data set or your work...


> Generally, I agree with you in not letting the parser be aware of versions,
> but I believe it should be at least aware of revisions of the file up to the
> point the local copy was created, and let the user be notified that this
> data is up to the date this file was created and could be outdated; in
> addition to mentioning it explicitly in the documentation.

Yes, I guess it can't hurt to get another field in the PDBHeader class
that can list this data, if it is in the file.
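
As a rough illustration of what such a field could hold: PDB-format files
keep their revision history in REVDAT records, so a minimal sketch might
collect the modification number and date from each one. The class and
method names below (RevdatSketch, Revision, parseRevisions) are
hypothetical and are not part of BioJava or PDBHeader; the column
positions assume the fixed-width REVDAT layout of the PDB file format,
and the sample ID in the usage note is made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a BioJava class): collects REVDAT entries from PDB-format lines. */
final class RevdatSketch {

    /** One revision entry, with the date kept exactly as written in the file. */
    static final class Revision {
        final int modNum;      // modification number, columns 8-10
        final String modDate;  // modification date, columns 14-22, e.g. "15-OCT-99"

        Revision(int modNum, String modDate) {
            this.modNum = modNum;
            this.modDate = modDate;
        }
    }

    /** Returns the revisions in file order; non-REVDAT and malformed lines are skipped. */
    static List<Revision> parseRevisions(List<String> lines) {
        List<Revision> revisions = new ArrayList<>();
        for (String line : lines) {
            if (!line.startsWith("REVDAT") || line.length() < 22) {
                continue;
            }
            // Columns 11-12 are non-blank on continuation lines, which repeat the same revision.
            if (!line.substring(10, 12).trim().isEmpty()) {
                continue;
            }
            try {
                int modNum = Integer.parseInt(line.substring(7, 10).trim());
                revisions.add(new Revision(modNum, line.substring(13, 22)));
            } catch (NumberFormatException e) {
                // Assumption: an unreadable modification number means the record is ignored.
            }
        }
        return revisions;
    }
}
```

Called on the lines of a local file, parseRevisions would give the list of
revisions known at the time that copy was written, for example modNum 2
and date "15-OCT-99" for a line such as
"REVDAT   2   15-OCT-99 1ABC    1       REMARK". It cannot say whether
the entry has been revised or superseded since.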

>
> Well,
> Another point to think about:
> How to fight redundancy among several files?

Hm, is this important? I would filter redundancy based on sequence or
structure similarity, not based on IDs.
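
To make the sequence-based idea concrete, here is a minimal sketch of a
greedy filter: walk through the entries in order and keep one only if it
is not too similar to anything already kept. Everything named here
(RedundancyFilterSketch, selectRepresentatives, toyIdentity, the cutoff)
is hypothetical and not a BioJava API. In particular, toyIdentity only
compares residues position by position without any alignment, so it is a
stand-in for a real alignment-based identity measure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

/** Hypothetical sketch (not a BioJava class): greedy redundancy filter on sequences. */
final class RedundancyFilterSketch {

    /**
     * Toy similarity: fraction of identical residues at the same position,
     * relative to the longer sequence. No alignment is performed.
     */
    static double toyIdentity(String a, String b) {
        int shorter = Math.min(a.length(), b.length());
        int longer = Math.max(a.length(), b.length());
        if (longer == 0) {
            return 1.0;
        }
        int same = 0;
        for (int i = 0; i < shorter; i++) {
            if (a.charAt(i) == b.charAt(i)) {
                same++;
            }
        }
        return (double) same / longer;
    }

    /**
     * Keeps an entry only if its similarity to every previously kept entry
     * is at or below the cutoff. The result depends on the iteration order of the map.
     */
    static List<String> selectRepresentatives(Map<String, String> sequencesById,
                                              ToDoubleBiFunction<String, String> identity,
                                              double cutoff) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, String> candidate : sequencesById.entrySet()) {
            boolean redundant = false;
            for (String keptId : kept) {
                double score = identity.applyAsDouble(candidate.getValue(), sequencesById.get(keptId));
                if (score > cutoff) {
                    redundant = true;
                    break;
                }
            }
            if (!redundant) {
                kept.add(candidate.getKey());
            }
        }
        return kept;
    }
}
```

With an ordered map, the entries that come first win, so sorting the
input (by resolution, say) before filtering decides which member of a
redundant group is kept. The IDs themselves play no role in the
decision.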


>
> How to counteract the redundancy in 2HHB, 3HHB, as long as 4HHB is already
> there !

They are all valid IDs. Based on what criteria would you like to filter?

Andreas


-- 
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------


