[Biojava-dev] fetching obsolete/superseding files

Amr AL-Hossary amr_alhossary at hotmail.com
Wed Apr 27 06:35:03 UTC 2011


It's working well now.
<idStatus>
<record structureId="1HHB" status="OBSOLETE" replacedBy="4HHB 3HHB 2HHB"/>
<record structureId="2HHB" status="CURRENT" replaces="1HHB"/>
<record structureId="3HHB" status="CURRENT" replaces="1HHB"/>
<record structureId="4HHB" status="CURRENT" replaces="1HHB"/>
<record structureId="1CAT" status="OBSOLETE" replacedBy="3CAT"/>
<record structureId="3CAT" status="OBSOLETE" replaces="1CAT" 
replacedBy="8CAT 7CAT"/>
<record structureId="7CAT" status="CURRENT" replaces="3CAT"/>
<record structureId="8CAT" status="CURRENT" replaces="3CAT"/>
<record structureId="1KSA" status="OBSOLETE" replacedBy="3ENI"/>
<record structureId="3ENI" status="CURRENT" replaces="1M50 1KSA"/>
<record structureId="1M50" status="OBSOLETE" replacedBy="3ENI"/>
</idStatus>
Thank you for the update.

Amr
--------------------------------------------------
From: "Spencer Bliven" <sbliven at ucsd.edu>
Sent: Wednesday, April 27, 2011 3:58 AM
To: "Amr AL-Hossary" <amr_alhossary at hotmail.com>
Cc: <biojava-dev at lists.open-bio.org>
Subject: Re: [Biojava-dev] fetching obsolete/superseding files

> Amr,
>
> Try checking idStatus again now. The latest PDB website version just went
> into production this afternoon. I currently see
> <record structureId="3CAT" status="OBSOLETE" replaces="1CAT"
> replacedBy="8CAT 7CAT"/>
>
> I merged in the code you sent me a few days ago for PDBFileReader and for
> the caching in PDBStatus. I didn't switch PDBStatus from SAX to DOM because
> I had already fixed that bug in another way by the time I got your code
> (thanks for pointing it out). I also added methods to AtomCache to match the
> setFetch* methods in PDBFileReader. I wrote some tests in TestAtomCache and
> it seems to be working great.
>
> Thanks for your contributions!
>
> -Spencer
>
> On Tue, Apr 26, 2011 at 2:55 AM, Amr AL-Hossary
> <amr_alhossary at hotmail.com> wrote:
>
>> The bug is fixed for "replaces", but "replacedBy" is not fixed yet.
>> Here is the current result:
>>
>>
>> <idStatus>
>> <record structureId="1HHB" status="OBSOLETE" replacedBy="4HHB"/>
>> <record structureId="2HHB" status="CURRENT" replaces="1HHB"/>
>> <record structureId="3HHB" status="CURRENT" replaces="1HHB"/>
>> <record structureId="4HHB" status="CURRENT" replaces="1HHB"/>
>> <record structureId="1CAT" status="OBSOLETE" replacedBy="8CAT"/>
>> <record structureId="3CAT" status="OBSOLETE" replaces="1CAT"
>> replacedBy="8CAT"/>
>>
>> <record structureId="7CAT" status="CURRENT" replaces="3CAT"/>
>> <record structureId="8CAT" status="CURRENT" replaces="3CAT"/>
>> <record structureId="1KSA" status="OBSOLETE" replacedBy="3ENI"/>
>> <record structureId="3ENI" status="CURRENT" replaces="1M50 1KSA"/>
>> <record structureId="1M50" status="OBSOLETE" replacedBy="3ENI"/>
>> </idStatus>
>>
>> Did you receive my previous mail, Dr. Andreas?
>>
>> Amr
>>
>> --------------------------------------------------
>> From: "Amr AL-Hossary" <amr_alhossary at hotmail.com>
>> Sent: Tuesday, April 26, 2011 5:03 AM
>> To: "Spencer Bliven" <sbliven at ucsd.edu>; "Andreas Prlic" <andreas at sdsc.edu>
>> Cc: <biojava-dev at lists.open-bio.org>
>>
>> Subject: Re: [Biojava-dev] fetching obsolete/superseding files
>>
>>  Thanks Spencer,
>>> This explains a lot.
>>> So the implementation you provided is correct, and so is the
>>> recursion flag.
>>>
>>> No, I don't have write access yet, but Dr. Andreas promised to grant
>>> me write access after my second contribution.
>>>
>>>> the list of status messages comes from looking at the internals of the
>>>> PDB website
>>>>
>>> Do you have access to the Webservice implementation?
>>>
>>> Amr
>>>
>>>
>>>  From: Spencer Bliven
>>>  Sent: Tuesday, April 26, 2011 1:53 AM
>>>  To: Andreas Prlic
>>>  Cc: Amr AL-Hossary ; biojava-dev at lists.open-bio.org
>>>  Subject: Re: [Biojava-dev] fetching obsolete/superseding files
>>>
>>>
>>>  Hey all,
>>>
>>>  I think we are converging on a consistent model of PDB precedence. This
>>> was obscured previously by the bug in how the idStatus page listed only a
>>> single 'replacedBy' entry. Andreas has fixed this and it should go live
>>> tomorrow. I'll write some unit tests and update biojava at the same
>>> time. Here is how things will work:
>>>
>>>  PDB supersessions form a directed acyclic graph, where edges point from
>>> an obsolete ID to the entry that directly superseded it. Each record
>>> returned by idStatus can carry a "replaces" attribute, a space-delimited
>>> list of incoming edges, and a "replacedBy" attribute, a space-delimited
>>> list of outgoing edges. Two examples:
>>>
>>>  <idStatus>
>>>  <record structureId="1CAT" status="OBSOLETE" replacedBy="3CAT"/>
>>>  <record structureId="3CAT" status="OBSOLETE" replaces="1CAT"
>>> replacedBy="8CAT 7CAT"/>
>>>  <record structureId="7CAT" status="CURRENT" replaces="3CAT"/>
>>>  <record structureId="8CAT" status="CURRENT" replaces="3CAT"/>
>>>
>>>  <record structureId="1KSA" status="OBSOLETE" replacedBy="3ENI"/>
>>>  <record structureId="3ENI" status="CURRENT" replaces="1M50 1KSA"/>
>>>  <record structureId="1M50" status="OBSOLETE" replacedBy="3ENI"/>
>>>  </idStatus>
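To make the record format above concrete, here is a minimal sketch of reading the "replacedBy" attribute of each record into a map of outgoing edges. It uses the JDK's DOM parser purely for brevity; the class name IdStatusParser and its methods are hypothetical and are not part of BioJava, whose PDBStatus uses SAX according to this thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not BioJava code: turns an idStatus document
// into a map from each structureId to its outgoing "replacedBy" edges.
final class IdStatusParser {

    // Splits a space-delimited attribute value into IDs.
    // An absent attribute arrives as "" and yields an empty list.
    static List<String> splitIds(String attr) {
        String trimmed = attr.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Parses the XML text and returns structureId -> list of replacedBy IDs,
    // in document order. Parse failures are rethrown unchecked.
    static Map<String, List<String>> parseReplacedBy(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList records = doc.getElementsByTagName("record");
            Map<String, List<String>> edges = new LinkedHashMap<>();
            for (int i = 0; i < records.getLength(); i++) {
                Element record = (Element) records.item(i);
                edges.put(record.getAttribute("structureId"),
                          splitIds(record.getAttribute("replacedBy")));
            }
            return edges;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse idStatus XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<idStatus>"
                + "<record structureId=\"1CAT\" status=\"OBSOLETE\" replacedBy=\"3CAT\"/>"
                + "<record structureId=\"3CAT\" status=\"OBSOLETE\" replaces=\"1CAT\" replacedBy=\"8CAT 7CAT\"/>"
                + "</idStatus>";
        System.out.println(parseReplacedBy(xml));
    }
}
```

The same loop could collect the "replaces" attribute to obtain the incoming edges.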
>>>
>>>  The non-recursive versions of getReplaces/getReplacement just get the
>>> incoming/outgoing edges for a single node and require only a single REST
>>> query. The recursive versions will do a depth-first search up/down the
>>> graph and return a list of all nodes reached.
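The difference between the non-recursive and recursive lookups can be sketched as follows. This is an illustration over an in-memory map of "replacedBy" edges, not the BioJava implementation: the class SupersessionGraph and the exact signature are assumptions, and the real methods issue REST queries rather than reading a map. Whether the real recursive version also returns obsolete intermediate IDs is not stated in the thread; this sketch returns every node reached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not BioJava code.
final class SupersessionGraph {

    // Non-recursive: only the direct outgoing edges of id.
    // Recursive: every node reachable from id, in depth-first order.
    static List<String> getReplacement(Map<String, List<String>> replacedBy,
                                       String id, boolean recurse) {
        if (!recurse) {
            return new ArrayList<>(replacedBy.getOrDefault(id, Collections.emptyList()));
        }
        Set<String> visited = new LinkedHashSet<>();
        dfs(replacedBy, id, visited);
        return new ArrayList<>(visited);
    }

    // Depth-first walk along outgoing edges; the visited set ensures a node
    // reachable by two paths is reported once.
    private static void dfs(Map<String, List<String>> replacedBy,
                            String id, Set<String> visited) {
        for (String next : replacedBy.getOrDefault(id, Collections.emptyList())) {
            if (visited.add(next)) {
                dfs(replacedBy, next, visited);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> replacedBy = new HashMap<>();
        replacedBy.put("1CAT", Arrays.asList("3CAT"));
        replacedBy.put("3CAT", Arrays.asList("8CAT", "7CAT"));
        System.out.println(getReplacement(replacedBy, "1CAT", false));
        System.out.println(getReplacement(replacedBy, "1CAT", true));
    }
}
```

Walking the "replaces" edges with the same routine would give the recursive getReplaces direction.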
>>>
>>>  Finally, the getCurrent() method should consistently return a single PDB
>>> ID from among the results of recursive-getReplacement. To be consistent
>>> with the old REST implementation, this will be the PDB ID that occurs last
>>> alphabetically. Thus getCurrent(1HHB) will give 4HHB rather than 2HHB or
>>> 3HHB, getCurrent(1CAT) will give 8CAT, and getCurrent(7CAT) will give
>>> 7CAT.
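One way to read the getCurrent() rule is sketched below. It assumes that an entry with no outgoing "replacedBy" edge is the CURRENT one, and that an ID that is already current is returned unchanged; both assumptions match the three examples given but are not confirmed beyond them. The class CurrentIdResolver is hypothetical and works on an in-memory edge map rather than the REST service.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not BioJava code.
final class CurrentIdResolver {

    // Walks outgoing edges from id and returns the alphabetically last node
    // that has no further replacement (assumed here to mean CURRENT).
    static String getCurrent(Map<String, List<String>> replacedBy, String id) {
        Deque<String> stack = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        stack.push(id);
        String best = null;
        while (!stack.isEmpty()) {
            String node = stack.pop();
            if (!visited.add(node)) {
                continue; // already handled via another path
            }
            List<String> next = replacedBy.getOrDefault(node, Collections.emptyList());
            if (next.isEmpty()) {
                // Leaf of the supersession graph: candidate result.
                if (best == null || node.compareTo(best) > 0) {
                    best = node;
                }
            } else {
                for (String n : next) {
                    stack.push(n);
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, List<String>> replacedBy = new HashMap<>();
        replacedBy.put("1HHB", Arrays.asList("4HHB", "3HHB", "2HHB"));
        replacedBy.put("1CAT", Arrays.asList("3CAT"));
        replacedBy.put("3CAT", Arrays.asList("8CAT", "7CAT"));
        System.out.println(getCurrent(replacedBy, "1HHB"));
        System.out.println(getCurrent(replacedBy, "1CAT"));
        System.out.println(getCurrent(replacedBy, "7CAT"));
    }
}
```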
>>>
>>>  Amr, I understand what you were thinking with the getNewestCurrent
>>> method. It is appealing to think of 4HHB as the representative for all
>>> four structures. However, there is a good reason that 2HHB and 3HHB are
>>> still marked as current, and I think it is misleading to include a method
>>> that favors 4HHB over other current IDs because it is alphabetically
>>> higher. We should probably leave this method out of biojava.
>>>
>>>
>>>  Does anything seem wrong about this model of supersession? In
>>> particular, does this address your question about the need for the
>>> recursion flag, Amr? My plan is to commit the biojava changes shortly. Amr,
>>> do you mind if I merge in your patch with the caching and PDBFileReader
>>> updates? (Do you have write access to SVN?) Great code there!
>>>
>>>  Finally, the list of status messages comes from looking at the internals
>>> of the PDB website. I haven't come across any examples of them myself to
>>> test with. Many seem to be temporary statuses, for publication holds and
>>> the like. I'm content to ignore them until someone requests something
>>> specific.
>>>
>>>  -Spencer
>>>
>>>
>>>
>>>  On Mon, Apr 25, 2011 at 2:22 PM, Andreas Prlic <andreas at sdsc.edu> wrote:
>>>
>>>   Hi Amr,
>>>
>>>
>>>   > And anyway, the web service returns at most ONE PDB ID per record
>>>   > (please inspect the result returned by this query:
>>>   > http://www.rcsb.org/pdb/rest/idStatus?structureId=1HHB,2HHB,3HHB,4HHB).
>>>
>>>
>>>   I believe that is a bug. I just fixed it, and the fix should become
>>>   available with tomorrow's website update (around 00:00 UTC).
>>>
>>>
>>>   > This way, I believe the best way to get the most recent ID is to read
>>>   > the isReplacedBy attribute of the superseded record (e.g. from 3HHB to
>>>   > 1HHB and then from 1HHB to 4HHB).
>>>
>>>
>>>   I hope this will be simpler with the updated URL response ...
>>>
>>>
>>>   Andreas
>>>
>>>
>>>
>>> _______________________________________________
>>> biojava-dev mailing list
>>>
>>> biojava-dev at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>>
>>>


