[BioSQL-l] Bug in loading duplicate but non-identical swissprot references

Elia Stupka elia at tll.org.sg
Thu Apr 17 17:56:35 EDT 2003


> Once it's been located you can just say $reference->store(), and the 
> medline ID would be updated.
> The problem is in locating

Not sure I follow... as I said, for now I have inverted the logic in 
get_unique_key_query, and it stores the medline ID fine without 
duplicating the reference...

> , and I'd be happy to hear how the 'old' bioperl-db would have solved 
> this, given that it did not employ the solution strategy outlined in 
> the aforementioned thread under 3)

;) Sorry, I was talking about the veeeery old bioperl-db back in the 
days when we wrote it, of course it was nothing comparable and nothing 
robust, but what I meant is I could have easily put in special case 
code... don't worry I guess I am just too lost as yet and am trying to 
find my way around BioSQL.

> The problem is that one and the same reference isn't always given with 
> the same literal title/author/journal or medline ID. I.e., you do not 
> know a-priori which search is going to locate the reference at hand in 
> the database. Sometimes (in fact, very often) it is going to be the 
> medline ID, *not* the CRC (due to slight variations in authors or 
> journal between say Swissprot and genbank). It could, however, also be 
> the pubmed ID. Or indeed the CRC.

Yep, got that...

> What needs to be done is overriding find_by_unique_key() in 
> Bio::DB::BioSQL::ReferenceAdaptor and calling the inherited method 
> with the three searches until it is found. I can do this tomorrow 
> (today that is). It shouldn't be that hard.

Ok, that's much clearer. It's the end of the day here and the beginning 
of a long weekend, but if you don't find the time to do it I'll be happy 
to give it a shot after the weekend.
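
For anyone following the thread, the lookup strategy described in the 
quoted text above (try several unique-key searches in turn until one 
locates the reference) can be sketched roughly as follows. This is only 
an illustration in Python, not the bioperl-db code: the Reference and 
ReferenceStore classes, the in-memory row list, the search order 
(medline ID, then pubmed ID, then CRC) and the use of zlib.crc32 over 
authors/title/location are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional
import zlib


@dataclass
class Reference:
    """Hypothetical stand-in for a literature reference record."""
    authors: str
    title: str
    location: str
    medline: Optional[str] = None
    pubmed: Optional[str] = None

    def crc(self) -> str:
        # Assumption: a checksum over the literal text fields. Any small
        # variation in authors or journal changes this value.
        text = f"{self.authors}|{self.title}|{self.location}"
        return format(zlib.crc32(text.encode("utf-8")), "08x")


class ReferenceStore:
    """Hypothetical in-memory store standing in for the database."""

    def __init__(self) -> None:
        self.rows: List[Reference] = []

    def _find_one(self, key: str, value: Optional[str]) -> Optional[Reference]:
        # A single unique-key search; a missing value can never match.
        if value is None:
            return None
        for row in self.rows:
            stored = row.crc() if key == "crc" else getattr(row, key)
            if stored == value:
                return row
        return None

    def find_by_unique_key(self, ref: Reference) -> Optional[Reference]:
        # Try each candidate key in turn and stop at the first hit.
        searches = (
            ("medline", ref.medline),
            ("pubmed", ref.pubmed),
            ("crc", ref.crc()),
        )
        for key, value in searches:
            found = self._find_one(key, value)
            if found is not None:
                return found
        return None

    def store(self, ref: Reference) -> Reference:
        # Insert only if no search locates the reference; otherwise
        # fill in identifiers the stored row was missing.
        found = self.find_by_unique_key(ref)
        if found is None:
            self.rows.append(ref)
            return ref
        found.medline = found.medline or ref.medline
        found.pubmed = found.pubmed or ref.pubmed
        return found
```

With this arrangement, a Swissprot-style and a genbank-style rendering 
of the same paper that differ slightly in authors or journal still 
resolve to one stored row as long as they share a medline or pubmed ID, 
and the CRC only decides the matter when neither ID is present.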

> That's not really such a different angle. More or less, this is what I 
> have been using it for. The problem you've encountered is just one we 
> didn't have before Singapore (not because of the schema changes, but 
> because swissprot changed).

Ok, understood. As I said above, I am just exposing my ignorance; I 
should be back on top of things as I get into it, and I am already 
starting to make some sense of it.

Elia

---
Bioinformatics Program Manager
Temasek Life Sciences Laboratory
1, Research Link
Singapore 117604
Tel. +65 6874 4945
Fax. +65 6872 7007


