[BioSQL-l] Bug in loading duplicate but non-identical swissprot references
Hilmar Lapp
hlapp at gnf.org
Sun Apr 20 19:27:19 EDT 2003
On Thursday, April 17, 2003, at 01:56 AM, Elia Stupka wrote:
>
>> What needs to be done is overriding find_by_unique_key() in
>> Bio::DB::BioSQL::ReferenceAdaptor and calling the inherited method
>> with the three searches until it is found. I can do this tomorrow
>> (today that is). It shouldn't be that hard.
>
> Ok, that's much clearer, it's end of day here and beginning of
> long-weekend, but if you don't find the time to do it I'll be happy to
> give it a shot after the week-end.
>
On second thought, I decided to implement this capability in the
base adaptor's find_by_unique_key() implementation, so that it is
available to every adaptor that wants it. I've committed the changes
to both code and documentation, so hopefully it is not too obscure how
to enable the feature. Basically, get_unique_key_query() can now
return an array, and ReferenceAdaptor does exactly that. The keys are
searched for references in this order:
- Medline ID (if $reference->medline returns a value)
- PubMed ID (if $reference->pubmed returns a value)
- CRC (if at least $reference->authors returns a value).
Also, I added code so that the PubMed ID substitutes for the Medline ID
if the Medline ID is absent (i.e., the Medline ID takes precedence).
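To make the search order concrete, here is a rough sketch of the idea,
written in Python rather than Perl purely for illustration. It is not
the committed bioperl-db code: the Reference class, the lookup callback
and the zlib.crc32 checksum are all assumptions standing in for the real
objects, database query and CRC.

```python
import zlib
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Query = Dict[str, str]


@dataclass
class Reference:
    """Hypothetical stand-in for a reference object (not the Perl class)."""
    medline: Optional[str] = None
    pubmed: Optional[str] = None
    authors: Optional[str] = None
    title: Optional[str] = None
    location: Optional[str] = None


def crc_key(ref: Reference) -> str:
    # Assumption: a checksum over authors/title/location. zlib.crc32 is only
    # a placeholder for whatever CRC the real adaptor computes.
    text = "|".join([ref.authors or "", ref.title or "", ref.location or ""])
    return format(zlib.crc32(text.encode("utf-8")), "08X")


def get_unique_key_query(ref: Reference) -> List[Query]:
    """Return the candidate unique-key queries, in search order."""
    queries: List[Query] = []
    if ref.medline:
        queries.append({"medline": ref.medline})
    if ref.pubmed:
        queries.append({"pubmed": ref.pubmed})
    if ref.authors:
        queries.append({"crc": crc_key(ref)})
    return queries


def find_by_unique_key(
    ref: Reference,
    lookup: Callable[[Query], Optional[dict]],
) -> Optional[dict]:
    """Try each query in order and return the first hit, else None."""
    for query in get_unique_key_query(ref):
        hit = lookup(query)
        if hit is not None:
            return hit
    return None
```

With this arrangement, a reference whose Medline ID is unknown to the
database but whose PubMed ID matches a stored row would be found on the
second query, and the CRC is only consulted if both IDs fail to match.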
All tests pass, but that doesn't prove that the case that triggered the
problem is fixed, since none of the tests covers it yet. I'll add one
later, or Elia, you're welcome to add it to a test too.
-hilmar
BTW, this is really about bioperl-db; is bioperl-l or biosql-l supposed
to be the forum for bioperl-db? Or should it get its own list?
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------