[Bioperl-l] Bio::DB::GFF/Postgres test failures

Chris Fields cjfields at illinois.edu
Mon Apr 11 14:15:32 UTC 2011


Lincoln,

It's incorrect semantics specific to Bio::Assembly that are triggering this; I don't think anything in the SF::Store/GBrowse set triggers the problem.  So, to me this isn't a blocker for anything unless someone specifies that Bio::Assembly must use the Pg adaptor (I think it uses the memory one by default, not sure if it's hard-coded that way).

I do agree with Scott that MySQL and other adaptors are silently dealing with this data without dying, so this should be filed for tracking, as I'm sure it will pop up again at some point.  The relevant tests should be TODO'd to catch this.
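
For anyone picking this up, a TODO'd test could look roughly like the sketch below.  It assumes Test::More; fetch_by_id() is a hypothetical stand-in for the adaptor lookup (not a real BioPerl method) that dies on a non-integer id the way a strict backend would, and 'doesnotexit' is the id from the failing query quoted further down.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical stand-in for an adaptor lookup against an integer id column.
# It dies on a non-integer id, as a strict backend would, and returns
# undef when no row matches (this toy version never finds a row).
sub fetch_by_id {
    my ($id) = @_;
    die "invalid input for integer id: '$id'\n" unless $id =~ /^\d+$/;
    return;
}

ok(!defined fetch_by_id(42), 'unknown integer id returns undef');

TODO: {
    # Tests inside this block may fail without failing the whole file.
    local $TODO = 'non-integer id dies on strict backends';
    # eval returns 1 only if the lookup did not die.
    my $survived = eval { fetch_by_id('doesnotexit'); 1 };
    ok($survived, 'lookup with a non-integer id does not die');
}
```

Run under prove, the second test is reported as a TODO failure and the file still passes; if an adaptor later handles the case, it shows up as an unexpected pass.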

Since this will likely be the last release for 1.6.x, I suppose we can go ahead and leave the version number as 1.0069.

chris

On Apr 10, 2011, at 4:10 PM, Lincoln Stein wrote:

> Hi Folks,
> 
> Is this what's blocking the next bioperl release, or are there other
> blockers?
> 
> I just released GBrowse 2.27, which requires bioperl 1.0069 or higher.
> 
> Lincoln
> 
> On Fri, Apr 8, 2011 at 8:53 PM, Chris Fields <cjfields at illinois.edu> wrote:
> 
>> That was introduced here by Florent:
>> 
>> 
>> https://github.com/bioperl/bioperl-live/commit/6f65223ef5aabc3ceaa815d3cb71982f81ae6b30#t/LocalDB/SeqFeature.t
>> 
>> So, essentially the MySQL adaptor is getting this wrong.  Any way we can
>> somehow enable strict mode?
>> 
>> http://dev.mysql.com/doc/refman/5.0/en/server-sql-mode.html
>> 
>> chris
>> 
>> On Apr 8, 2011, at 4:32 PM, Scott Cain wrote:
>> 
>>> Argh!  MySQL is not an RDBMS!  Anyone who tells you otherwise is lying!
>>> 
>>> The first test for the Pg SFS adaptor is failing because it is
>>> trying to execute this query (which it inherited from the mysql
>>> adaptor, where it works just fine):
>>> 
>>> select id,object FROM bioperl_seqfeature_t_test_schema_feature where
>>> id='doesnotexit';
>>> 
>>> Of course, the id column is defined as an integer column.  MySQL must
>>> be silently casting this string to an integer value (? I guess
>>> anyway--who knows).  Anyway, PostgreSQL does the right thing and
>>> throws an error with this query.  I don't see how I can make the
>>> Postgres adaptor pass this test as written, as it is nonsensical.
>>> 
>>> Scott
>>> 
>>> 
>>> 
>>> On Fri, Apr 8, 2011 at 2:56 PM, Chris Fields <cjfields at illinois.edu> wrote:
>>>> Scott,
>>>> 
>>>> I'll try documenting the Pg error for SF::Store in the next hour.  Had my hands full with the GSoC onslaught of emails and local $job stuff.  Would like to get it fixed for the CPAN release.
>>>> 
>>>> chris
>>>> 
>>>> On Apr 8, 2011, at 12:58 PM, Scott Cain wrote:
>>>> 
>>>>> OK, I'll take it out and move on to the next problem.
>>>>> 
>>>>> Thanks,
>>>>> Scott
>>>>> 
>>>>> 
>>>>> On Fri, Apr 8, 2011 at 1:51 PM, Lincoln Stein <lincoln.stein at gmail.com> wrote:
>>>>>> Oh right. The Bio::DB::GFF adaptor has that broken behavior and it is too
>>>>>> late to change it now. (Bio::DB::SeqFeature::Store had better not!) Best to
>>>>>> remove the test altogether.
>>>>>> Lincoln
>>>>>> 
>>>>>> On Fri, Apr 8, 2011 at 1:18 PM, Scott Cain <scott at scottcain.net> wrote:
>>>>>>> 
>>>>>>> Hi Lincoln,
>>>>>>> 
>>>>>>> Yes, apparently, it does.  It does this for both the memory and the
>>>>>>> postgres adaptors.  I looked at how the data was stored in the feature
>>>>>>> object with Data::Dumper and that is how it is represented in the hash
>>>>>>> too.  Perhaps this test should be calling the "absolute" method first?
>>>>>>> 
>>>>>>> Scott
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Apr 8, 2011 at 1:10 PM, Lincoln Stein <lincoln.stein at gmail.com> wrote:
>>>>>>>> Do start() and end() flip values for minus strand features? This isn't
>>>>>>>> supposed to happen.
>>>>>>>> Lincoln
>>>>>>>> 
>>>>>>>> On Fri, Apr 8, 2011 at 11:41 AM, Scott Cain <scott at scottcain.net> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Lincoln,
>>>>>>>>> 
>>>>>>>>> I've been looking into some test failures with the postgres adaptor
>>>>>>>>> for Bio::DB::GFF and I wanted to check with you that I'm interpreting
>>>>>>>>> this correctly.  In t/LocalDB/BioDBGFF.t there are these lines:
>>>>>>>>> 
>>>>>>>>> @features = sort {$a->start<=>$b->start} @features;
>>>>>>>>> 
>>>>>>>>> is($features[0]->type,'Component:reference');
>>>>>>>>> is($features[-1]->type,'exon:confirmed');
>>>>>>>>> 
>>>>>>>>> So the features in the data set are sorted by their start values
>>>>>>>>> and the beginning and end of the list are checked.  The test refers to
>>>>>>>>> the test.gff data file, which contains, among others, these lines:
>>>>>>>>> 
>>>>>>>>> Contig1 confirmed   transcript      30001   31000   .   -   .   Transcript trans-2; Gene "xyz-2"; Note "Terribly interesting"
>>>>>>>>> Contig1 confirmed   exon    30001   30100   .   -   .   Transcript trans-2; Gene "abc-1"; Note "function unknown"
>>>>>>>>> Contig1 confirmed   exon    30701   30800   .   -   .   Transcript trans-2
>>>>>>>>> Contig1 confirmed   exon    30801   31000   .   -   .   Transcript trans-2
>>>>>>>>> 
>>>>>>>>> Since this transcript and its exons are on the minus strand, the
>>>>>>>>> values that the start and stop method return will be reversed, so that
>>>>>>>>> start for the transcript will be 31000 and stop will be 30001.  The
>>>>>>>>> problem with this test is since the last exon and the transcript share
>>>>>>>>> a start value (31000), you can't really be sure which one will be at
>>>>>>>>> the bottom of the list after sorting, right?  In the case of the
>>>>>>>>> postgres adaptor, it fails this test on my machine because the
>>>>>>>>> transcript is at the bottom of the list.  The test for the beginning
>>>>>>>>> of the list similarly could fail though it didn't in my case, as other
>>>>>>>>> features that have 1 as a start are of type "Component:clone".
>>>>>>>>> 
>>>>>>>>> So, my question is this: am I missing something, and the postgres
>>>>>>>>> adaptor is not behaving as expected, or are these tests ambiguous?
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Scott
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>> Scott Cain, Ph. D.                                   scott at scottcain dot net
>>>>>>>>> GMOD Coordinator (http://gmod.org/)                     216-392-3087
>>>>>>>>> Ontario Institute for Cancer Research
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Lincoln D. Stein
>>>>>>>> Director, Informatics and Biocomputing Platform
>>>>>>>> Ontario Institute for Cancer Research
>>>>>>>> 101 College St., Suite 800
>>>>>>>> Toronto, ON, Canada M5G0A3
>>>>>>>> 416 673-8514
>>>>>>>> Assistant: Renata Musa <Renata.Musa at oicr.on.ca>
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>> 
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 




