[BioSQL-l] genbank, references, and crc's

Bryan Cardillo dillo at pcbi.upenn.edu
Mon Apr 9 16:05:03 UTC 2007


        This is probably more of a BioPerl issue, but since it was
        previously discussed on this list, I'll continue the
        discussion here.  I've just run into the same issues described
        in these threads while loading some RefSeq sequences.

        http://lists.open-bio.org/pipermail/biosql-l/2006-July/001024.html
        http://lists.open-bio.org/pipermail/biosql-l/2006-August/001048.html


        I believe the bioperl-db patch below solves these issues.
        The crux of the problem is that the _crc64 code uses the
        authors, title, and location to compute a unique key.
        However, the get_unique_key_query method checks only
        authors() before adding the crc lookup, so a reference with
        a title or location but no authors never gets a crc-based
        key query.  With the fix, the crc key is used if any of
        authors, title, or location is specified.
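
        To make the mismatch concrete, here is a minimal standalone
        sketch, not the actual adaptor code.  MockReference,
        fake_doc_id, crc_query_before, and crc_query_after are
        hypothetical names invented for illustration, the field
        values are made up, and fake_doc_id uses an MD5 digest purely
        as a stand-in for _crc64; the only point is that both
        condition variants are tried on a reference with no authors.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical stand-in for a reference object; only the three
# accessors named in this post are modelled.
package MockReference {
    sub new      { my ($class, %args) = @_; return bless {%args}, $class }
    sub authors  { return $_[0]{authors} }
    sub title    { return $_[0]{title} }
    sub location { return $_[0]{location} }
}

# Stand-in for _crc64 (assumption: NOT the real checksum). It only
# mirrors the fact that the key depends on authors, title, and location.
sub fake_doc_id {
    my ($ref) = @_;
    my @fields = ($ref->authors(), $ref->title(), $ref->location());
    return md5_hex(join '|', map { defined($_) ? $_ : '' } @fields);
}

# Condition before the patch: only authors() is consulted.
sub crc_query_before {
    my ($ref) = @_;
    return $ref->authors() ? ({ doc_id => fake_doc_id($ref) }) : ();
}

# Condition after the patch: any of the three fields triggers the lookup.
sub crc_query_after {
    my ($ref) = @_;
    return ($ref->authors() || $ref->title() || $ref->location())
        ? ({ doc_id => fake_doc_id($ref) })
        : ();
}

# A reference with a title and location but no authors (made-up values).
my $ref = MockReference->new(
    title    => 'Example title',
    location => 'Example location',
);

my @before = crc_query_before($ref);
my @after  = crc_query_after($ref);

printf "before: %d crc queries, after: %d crc queries\n",
    scalar(@before), scalar(@after);
```

        With the old condition no doc_id query is produced for such a
        reference; with the patched condition exactly one is.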

        Cheers,
        Bryan Cardillo
        Penn Bioinformatics Core
        University of Pennsylvania

 ReferenceAdaptor.pm |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: ./Bio/DB/BioSQL/ReferenceAdaptor.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-db/Bio/DB/BioSQL/ReferenceAdaptor.pm,v
retrieving revision 1.24
diff -u -r1.24 ReferenceAdaptor.pm
--- ./Bio/DB/BioSQL/ReferenceAdaptor.pm	4 Jul 2006 22:23:12 -0000	1.24
+++ ./Bio/DB/BioSQL/ReferenceAdaptor.pm	9 Apr 2007 15:38:35 -0000
@@ -426,7 +426,7 @@
 	    });
 	}
     }
-    if($obj->authors()) {
+    if($obj->authors() || $obj->title() || $obj->location()) {
 	push(@ukqueries, {
 	    'doc_id' => $self->_crc64($obj),
 	});


