[Biojava-dev] fetching obsolete/superseding files
Amr AL-Hossary
amr_alhossary at hotmail.com
Wed Apr 27 06:35:03 UTC 2011
It's working well now.
<idStatus>
<record structureId="1HHB" status="OBSOLETE" replacedBy="4HHB 3HHB 2HHB"/>
<record structureId="2HHB" status="CURRENT" replaces="1HHB"/>
<record structureId="3HHB" status="CURRENT" replaces="1HHB"/>
<record structureId="4HHB" status="CURRENT" replaces="1HHB"/>
<record structureId="1CAT" status="OBSOLETE" replacedBy="3CAT"/>
<record structureId="3CAT" status="OBSOLETE" replaces="1CAT"
replacedBy="8CAT 7CAT"/>
<record structureId="7CAT" status="CURRENT" replaces="3CAT"/>
<record structureId="8CAT" status="CURRENT" replaces="3CAT"/>
<record structureId="1KSA" status="OBSOLETE" replacedBy="3ENI"/>
<record structureId="3ENI" status="CURRENT" replaces="1M50 1KSA"/>
<record structureId="1M50" status="OBSOLETE" replacedBy="3ENI"/>
</idStatus>
Thank you for the update.
Amr
--------------------------------------------------
From: "Spencer Bliven" <sbliven at ucsd.edu>
Sent: Wednesday, April 27, 2011 3:58 AM
To: "Amr AL-Hossary" <amr_alhossary at hotmail.com>
Cc: <biojava-dev at lists.open-bio.org>
Subject: Re: [Biojava-dev] fetching obsolete/superseding files
> Amr,
>
> Try checking idStatus again now. The latest PDB website version just went
> into production this afternoon. I currently see
> <record structureId="3CAT" status="OBSOLETE" replaces="1CAT"
> replacedBy="8CAT 7CAT"/>
>
> I merged in the code you sent me a few days ago for PDBFileReader and for
> the caching in PDBStatus. I didn't switch PDBStatus from SAX to DOM because
> I had already fixed that bug in another way by the time I got your code
> (thanks for pointing it out). I also added methods to AtomCache to match the
> setFetch* methods in PDBFileReader. I wrote some tests in TestAtomCache and
> it seems to be working great.
>
> Thanks for your contributions!
>
> -Spencer
>
> On Tue, Apr 26, 2011 at 2:55 AM, Amr AL-Hossary
> <amr_alhossary at hotmail.com> wrote:
>
>> The bug is fixed for "replaces", but "replacedBy" is not fixed yet.
>> Here is the current result:
>>
>>
>> <idStatus>
>> <record structureId="1HHB" status="OBSOLETE" replacedBy="4HHB"/>
>> <record structureId="2HHB" status="CURRENT" replaces="1HHB"/>
>> <record structureId="3HHB" status="CURRENT" replaces="1HHB"/>
>> <record structureId="4HHB" status="CURRENT" replaces="1HHB"/>
>> <record structureId="1CAT" status="OBSOLETE" replacedBy="8CAT"/>
>> <record structureId="3CAT" status="OBSOLETE" replaces="1CAT"
>> replacedBy="8CAT"/>
>>
>> <record structureId="7CAT" status="CURRENT" replaces="3CAT"/>
>> <record structureId="8CAT" status="CURRENT" replaces="3CAT"/>
>> <record structureId="1KSA" status="OBSOLETE" replacedBy="3ENI"/>
>> <record structureId="3ENI" status="CURRENT" replaces="1M50 1KSA"/>
>> <record structureId="1M50" status="OBSOLETE" replacedBy="3ENI"/>
>> </idStatus>
>>
>> Did you receive my previous mail, Dr. Andreas?
>>
>> Amr
>>
>> --------------------------------------------------
>> From: "Amr AL-Hossary" <amr_alhossary at hotmail.com>
>> Sent: Tuesday, April 26, 2011 5:03 AM
>> To: "Spencer Bliven" <sbliven at ucsd.edu>; "Andreas Prlic"
>> <andreas at sdsc.edu>
>> Cc: <biojava-dev at lists.open-bio.org>
>>
>> Subject: Re: [Biojava-dev] fetching obsolete/superseding files
>>
>> Thanks Spencer,
>>> This explains a lot.
>>> This way, the current implementation you provided is right and the
>>> recursion flag is totally right.
>>>
>>> No, I don't have write access yet, but Dr. Andreas promised to grant me
>>> access after my 2nd participation.
>>>
>>>> the list of status messages come from looking at the internals of the
>>>> PDB website
>>>>
>>> Do you have access to the Webservice implementation?
>>>
>>> Amr
>>>
>>>
>>> From: Spencer Bliven
>>> Sent: Tuesday, April 26, 2011 1:53 AM
>>> To: Andreas Prlic
>>> Cc: Amr AL-Hossary ; biojava-dev at lists.open-bio.org
>>> Subject: Re: [Biojava-dev] fetching obsolete/superseding files
>>>
>>>
>>> Hey all,
>>>
>>> I think we are converging on a consistent model of PDB precedence. This
>>> was obscured previously by the bug in how the idStatus page listed only a
>>> single 'replacedBy' entry. Andreas has fixed this and it should go live
>>> tomorrow. I'll write some unit tests and update biojava at the same
>>> time. Here is how things will work:
>>>
>>> PDB supersessions form a directed acyclic graph, where edges point from
>>> an obsolete ID to the entry that directly superseded it. Each record
>>> returned by idStatus can carry a "replaces" attribute, which is a
>>> space-delimited list of incoming edges, and a "replacedBy" attribute,
>>> which is a space-delimited list of outgoing edges. Two examples:
>>>
>>> <idStatus>
>>> <record structureId="1CAT" status="OBSOLETE" replacedBy="3CAT"/>
>>> <record structureId="3CAT" status="OBSOLETE" replaces="1CAT"
>>> replacedBy="8CAT 7CAT"/>
>>> <record structureId="7CAT" status="CURRENT" replaces="3CAT"/>
>>> <record structureId="8CAT" status="CURRENT" replaces="3CAT"/>
>>>
>>> <record structureId="1KSA" status="OBSOLETE" replacedBy="3ENI"/>
>>> <record structureId="3ENI" status="CURRENT" replaces="1M50 1KSA"/>
>>> <record structureId="1M50" status="OBSOLETE" replacedBy="3ENI"/>
>>> </idStatus>
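
A minimal sketch of how records like the ones above map onto graph edges. It is written in Python rather than BioJava's Java, purely to illustrate the model. The function name `parse_id_status` and the dictionary layout are assumptions made for this example; they are not part of BioJava or of the PDB web service.

```python
import xml.etree.ElementTree as ET

# The two examples from this thread, as returned by idStatus.
ID_STATUS_XML = """
<idStatus>
<record structureId="1CAT" status="OBSOLETE" replacedBy="3CAT"/>
<record structureId="3CAT" status="OBSOLETE" replaces="1CAT"
 replacedBy="8CAT 7CAT"/>
<record structureId="7CAT" status="CURRENT" replaces="3CAT"/>
<record structureId="8CAT" status="CURRENT" replaces="3CAT"/>
<record structureId="1KSA" status="OBSOLETE" replacedBy="3ENI"/>
<record structureId="3ENI" status="CURRENT" replaces="1M50 1KSA"/>
<record structureId="1M50" status="OBSOLETE" replacedBy="3ENI"/>
</idStatus>
"""


def parse_id_status(xml_text):
    """Map each PDB ID to its status and edge lists (hypothetical layout)."""
    records = {}
    for rec in ET.fromstring(xml_text.strip()).iter("record"):
        records[rec.get("structureId")] = {
            "status": rec.get("status"),
            # A missing attribute means no edges in that direction.
            "replaces": rec.get("replaces", "").split(),
            "replaced_by": rec.get("replacedBy", "").split(),
        }
    return records


records = parse_id_status(ID_STATUS_XML)
print(records["3CAT"]["replaced_by"])  # ['8CAT', '7CAT']
```

Splitting on whitespace keeps the order in which the service lists the IDs, so nothing here decides which successor is preferred.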
>>>
>>> The non-recursive versions of getReplaces/getReplacement just get the
>>> incoming/outgoing edges for a single node and require only a single REST
>>> query. The recursive versions will do a depth-first search up/down the
>>> graph and return a list of all nodes reached.
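
The difference between the two variants can be sketched as follows, in Python and over a hard-coded edge table instead of REST queries. `get_replacement` and its `recursive` flag are illustrative stand-ins, not the BioJava method signatures. The recursive getReplaces case would be the same walk over the inverse ("replaces") table.

```python
# Outgoing "replacedBy" edges, copied from the idStatus examples in this thread.
REPLACED_BY = {
    "1CAT": ["3CAT"],
    "3CAT": ["8CAT", "7CAT"],
    "1KSA": ["3ENI"],
    "1M50": ["3ENI"],
}


def get_replacement(pdb_id, recursive=False, edges=REPLACED_BY):
    """Return direct successors, or every node reachable by depth-first search."""
    direct = edges.get(pdb_id, [])
    if not recursive:
        return list(direct)

    reached = []
    stack = list(reversed(direct))  # reversed so the first-listed ID is visited first
    while stack:
        node = stack.pop()
        if node in reached:
            continue  # a node can be reachable along more than one path
        reached.append(node)
        stack.extend(reversed(edges.get(node, [])))
    return reached


print(get_replacement("1CAT"))                  # ['3CAT']
print(get_replacement("1CAT", recursive=True))  # ['3CAT', '8CAT', '7CAT']
```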
>>>
>>> Finally, the getCurrent() method should consistently return a single PDB
>>> ID from among the results of recursive-getReplacement. To be consistent
>>> with the old REST implementation, this will be the PDB ID that occurs
>>> last alphabetically. Thus getCurrent(1HHB) will give 4HHB rather than
>>> 2HHB or 3HHB, getCurrent(1CAT) will give 8CAT, and getCurrent(7CAT) will
>>> give 7CAT.
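
A small Python sketch of that selection rule, under two assumptions that the thread does not spell out: only IDs with no outgoing "replacedBy" edge are considered as candidates, and an ID with no replacement at all is returned unchanged. `get_current` and `reachable` are hypothetical names for illustration, not the BioJava API.

```python
# Outgoing "replacedBy" edges taken from the idStatus output in this thread.
REPLACED_BY = {
    "1HHB": ["4HHB", "3HHB", "2HHB"],
    "1CAT": ["3CAT"],
    "3CAT": ["8CAT", "7CAT"],
}


def reachable(pdb_id, edges):
    """Collect every ID reachable from pdb_id by following replacedBy edges."""
    seen = set()
    stack = list(edges.get(pdb_id, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, []))
    return seen


def get_current(pdb_id, edges=REPLACED_BY):
    """Pick the alphabetically last end point, or pdb_id if nothing replaces it."""
    # Assumption: an ID without outgoing edges has not itself been superseded.
    end_points = [n for n in reachable(pdb_id, edges) if not edges.get(n)]
    return max(end_points) if end_points else pdb_id


for query in ("1HHB", "1CAT", "7CAT"):
    print(query, "->", get_current(query))
```

Restricting the choice to end points avoids returning an obsolete intermediate entry such as 3CAT if its ID happened to sort after its successors.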
>>>
>>> Amr, I understand what you were thinking with the getNewestCurrent
>>> method. It is appealing to think of 4HHB as the representative for all
>>> four structures. However, there is a good reason that 2HHB and 3HHB are
>>> still marked as current, and I think it is misleading to include a method
>>> that favors 4HHB over other current IDs because it is alphabetically
>>> higher. We should probably leave this method out of biojava.
>>>
>>>
>>> Does anything seem wrong about this model of supersession? In
>>> particular, does this address your question about the need for the
>>> recursion flag, Amr? My plan is to commit the biojava changes shortly.
>>> Amr, do you mind if I merge in your patch with the caching and
>>> PDBFileReader updates? (Do you have write access to SVN?) Great code
>>> there!
>>>
>>> Finally, the list of status messages comes from looking at the internals
>>> of the PDB website. I haven't come across any examples of them myself to
>>> test with. Many seem to be temporary statuses, for publication holds and
>>> the like. I'm content to ignore them until someone requests something
>>> specific.
>>>
>>> -Spencer
>>>
>>>
>>>
>>> On Mon, Apr 25, 2011 at 2:22 PM, Andreas Prlic <andreas at sdsc.edu>
>>> wrote:
>>>
>>> Hi Amr,
>>>
>>>
>>> > And anyway, the webservice returns only ONE PDB ID max per record
>>> > (please inspect the result returned by this query
>>> > http://www.rcsb.org/pdb/rest/idStatus?structureId=1HHB,2HHB,3HHB,4HHB).
>>>
>>>
>>> I believe that is a bug. I just fixed this and it should become
>>> available with tomorrow's web site update (around 00:00 UTC).
>>>
>>>
>>> > This way, I believe the best way to get the most recent ID is getting
>>> > the isReplacedBy attribute of the superseded record (e.g. from 3HHB to
>>> > 1HHB and then from 1HHB to 4HHB).
>>>
>>>
>>> hope this will be simpler with the updated URL response ...
>>>
>>>
>>> Andreas
>>>
>>>
>>>
>>> _______________________________________________
>>> biojava-dev mailing list
>>>
>>> biojava-dev at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>>
>>>