[BioSQL-l] Query Problem.
Hilmar Lapp
hlapp at gnf.org
Tue Feb 17 18:09:14 EST 2004
Hm, interesting.
Basically, the trouble maker is that you cannot apply NVL() to LOB
values. Biosequence.seq is of type CLOB. Normally, for every updated
column the new value is wrapped in a call to NVL() to prevent
overriding of values by incomplete objects (which would have certain
attributes undefined instead of the value that's in the database). So,
in order to protect column foo, you write the update statement as
UPDATE blah SET
...,
foo = NVL(?, foo),
...
WHERE oid = ?
If you bind undef to column foo's parameter, you get a no-change result
without ever having to cross-check every attribute.
This will work for the sequence so long as it is less than or equal to
4000 chars long, as then it will simply be treated as a VARCHAR2 and
Oracle will cast it internally to CLOB when it gets assigned to the seq
column. If the sequence length is longer than 4000 chars, you can't use
NVL() on it, so there is special case code for Oracle that will edit
the generated sql and replace the NVL() call with a direct assignment
iff any of the object's attribute is defined and longer than 4000 chars
in length. The get_persistent_slot_values() will return the sequence as
having value undef if it hasn't changed in order to prevent unnecessary
retrieval and update of a column of 10s/100s/1000s of kb if it hasn't
changed. An unchanged sequence should therefore not lead to the NVL()
clause to be edited away, because its value would be pretended as undef
to the database update function.
The unknown in this equation is what happens if in the NVL(?, foo) call
the bound parameter is undef (NULL), but foo has more than 4000 chars.
It shouldn't actually matter up to 32k chars length, as that is
Oracle's upper limit for VARCHAR2 in PL/SQL (i.e., while DBI/DBD
doesn't have its fingers in the game), and the NVL call is executed on
the server, not the DBI client.
Are you sure your threshold of misbehavior is 4000 and not 32k in
length?
If you turn on verbosity (call $db->verbose(1) after creating it), what
parameters do you see get bound for the sequence that gets its sequence
set to NULL after an update?
-hilmar
On Tuesday, February 17, 2004, at 08:23 AM, Marc Logghe wrote:
> Hi,
> I was not able to figure it out completely, but found a fix.
> In Bio::DB::BioSQL::BiosequenceAdaptor the marked (#ml) line was added:
>
> sub get_persistent_slot_values {
> my ($self,$obj,$fkobjs) = @_;
> my @vals;
> if($obj->isa("Bio::Seq::RichSeqI")) {
> @vals = ($obj->seq_version());
> } else {
> @vals = (undef);
> }
> #ml
> $obj->seq_has_changed(1) if $obj->length > 4000;
>
> push(@vals,
> $obj->length(),
> $obj->alphabet(),
> $obj->seq_has_changed() ? $obj->seq() : undef);
> return \@vals;
> }
>
> This forces an 'update' of the sequence if it is larger than 4000 bp.
>
>> -----Original Message-----
>> From: Marc Logghe
>> Sent: dinsdag 17 februari 2004 11:01
>> To: biosql-l at open-bio.org
>> Subject: RE: [BioSQL-l] Query Problem.
>>
>>
>> Hi,
>> I have been trying to track down a nasty bug, but failed :-(
>> Sorry, Hilmar.
>> A biosequence containing more than 4000 nt is fine. No
>> problem in loading that.
>> A colleague of mine (FredericP) found out that as soon as you
>> add a feature to the corresponding bioentry later, and you do
>> an update (store+commit) your biosequence ends up truncated (4000).
>> Hoped that
>> Bio::DB::BioSQL::Oracle::BiosequenceAdaptorDriver::get_sth
>> had to do something with it.
>> It appears that a different sth is chosen depending on e.g.
>> the field size (_upd_sth2 if length > 4000). I tried to force
>> this by changing the line "grep { $_ && (length($_) > 4000);"
>> to "grep { $_ && (length($_) > 4000 || $_ > 4000);" but that
>> was even worse: seq field in biosequence was null.
>> Anyhow, the sequence should not have been updated after all,
>> cos it was not changed.
>> Any clues ?
>> Thanks,
>> Marc
>>
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at open-bio.org
> http://open-bio.org/mailman/listinfo/biosql-l
>
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------
More information about the BioSQL-l
mailing list