[Bioperl-l] Bio::DB::GFF/Postgres test failures

Lincoln Stein lincoln.stein at gmail.com
Mon Apr 11 18:57:44 UTC 2011


It now looks like Scott and I need to fix the mysql table definition
mechanism in Bio::DB::GFF and Bio::SeqFeature::Store. Some versions of mysql
are not accepting the type=MyISAM declaration.
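
For reference, a minimal sketch of the kind of rewrite involved, assuming the
failures come from newer servers that only accept ENGINE= (TYPE= was a
deprecated synonym that MySQL 5.5 removed). The helper name
normalize_engine_clause and the sample table are hypothetical and are not
taken from Bio::DB::GFF or Bio::DB::SeqFeature::Store:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): rewrite the legacy table option
# "type=<engine>" to "ENGINE=<engine>" in a CREATE TABLE statement.
sub normalize_engine_clause {
    my ($ddl) = @_;
    # Only touch a table option that follows the final closing parenthesis,
    # so a column that happens to be named "type" is left alone.
    $ddl =~ s/\)\s*type\s*=\s*(\w+)\s*;?\s*\z/) ENGINE=$1/i;
    return $ddl;
}

# Illustrative table definition, made up for this example.
my $legacy = "CREATE TABLE fdata (fid int not null auto_increment, "
           . "primary key(fid)) type=MyISAM";

print normalize_engine_clause($legacy), "\n";
```

A statement that already uses ENGINE= passes through unchanged, so the same
definition strings could be run through it regardless of server version.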

Lincoln

On Mon, Apr 11, 2011 at 1:35 PM, Chris Fields <cjfields at illinois.edu> wrote:

> Scott,
>
> Go ahead and change the test if you haven't done it already.  The only
> blocker would be the Module::Build work I alluded to earlier, which I'm
> working on now.
>
> Lincoln, probably not necessary to worry about changing the version number
> for the GBrowse release, we can deal with bug fixes with minor version
> increments on 1.6.9.  Most end-users won't care about versioning anyway as
> long as the dependency install path works fine.
>
> chris
>
> On Apr 11, 2011, at 12:27 PM, Scott Cain wrote:
>
> > OK, I fixed the second bug I referred to below (and committed it on
> > master :-)  If nobody complains, I'll also change the test I referred to
> > first below, and then my issues with the next release should be
> > resolved.
> >
> > Thanks,
> > Scott
> >
> >
> > On Mon, Apr 11, 2011 at 11:53 AM, Scott Cain <scott at scottcain.net> wrote:
> >> The test I was complaining about last week is clearly flawed.  I would
> >> suggest changing the "doesnotexit" to "-1" in this test:
> >>
> >>  is( $db->fetch('doesnotexit'), undef);
> >>
> >> where the point of the test is to search for something by ID that
> >> doesn't exist.  Since a primary key of "-1" is unlikely to be used
> >> with an autogenerated primary key, that should safely fail to find
> >> something, thus passing the test.
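
A minimal sketch of the proposed change, using a hypothetical in-memory
stand-in (MockStore) rather than a real Bio::DB::SeqFeature::Store handle, so
every name below other than fetch() and is() is an assumption:

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for a feature store whose primary keys are
# autogenerated positive integers; only fetch() is modelled here.
{
    package MockStore;

    sub new {
        my ($class) = @_;
        return bless { rows => { 1 => 'feature one', 2 => 'feature two' } }, $class;
    }

    sub fetch {
        my ($self, $id) = @_;
        return $self->{rows}{$id};    # undef when no row has this key
    }
}

my $db = MockStore->new;

# -1 is valid input for an integer column, yet an autogenerated key
# should never take that value, so the lookup finds nothing.
is( $db->fetch(-1), undef, 'lookup of a nonexistent ID returns undef' );

# Sanity check that the stand-in does return stored rows.
is( $db->fetch(1), 'feature one', 'lookup of an existing ID still works' );

done_testing;
```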
> >>
> >> There is, however, a second failure with the Pg test that I'm working
> >> on today, where for some reason, the same method in the mysql adaptor
> >> (inherited by the Pg adaptor) is generating different queries when run
> >> against the different databases.  Once I sort out why that is
> >> happening, the Pg adaptor should be passing tests again.
> >>
> >> Scott
> >>
> >>
> >> On Mon, Apr 11, 2011 at 10:15 AM, Chris Fields <cjfields at illinois.edu> wrote:
> >>> Lincoln,
> >>>
> >>> It's Bio::Assembly-specific incorrect semantics that are triggering
> >>> this; I don't think anything in the SF::Store/GBrowse set triggers the
> >>> problem.  So, to me this isn't a blocker for anything unless someone
> >>> specifies that Bio::Assembly must use the Pg adaptor (I think it uses the
> >>> memory one by default, not sure if it's hard-coded that way).
> >>>
> >>> I do agree with Scott that MySQL and other adaptors are silently
> >>> dealing with this data w/o dying, so this should be filed for tracking, as
> >>> I'm sure it will pop up at some point again.  The relevant tests should be
> >>> TODO'd to catch this.
> >>>
> >>> Since this will likely be the last release for 1.6.x, I suppose we can
> >>> go ahead and leave the version number as 1.0069.
> >>>
> >>> chris
> >>>
> >>> On Apr 10, 2011, at 4:10 PM, Lincoln Stein wrote:
> >>>
> >>>> Hi Folks,
> >>>>
> >>>> Is this what's blocking the next bioperl release, or are there other
> >>>> blockers?
> >>>>
> >>>> I just released GBrowse 2.27, which requires bioperl 1.0069 or higher.
> >>>>
> >>>> Lincoln
> >>>>
> >>>> On Fri, Apr 8, 2011 at 8:53 PM, Chris Fields <cjfields at illinois.edu> wrote:
> >>>>
> >>>>> That was introduced here by Florent:
> >>>>>
> >>>>>
> >>>>> https://github.com/bioperl/bioperl-live/commit/6f65223ef5aabc3ceaa815d3cb71982f81ae6b30#t/LocalDB/SeqFeature.t
> >>>>>
> >>>>> So, essentially the MySQL adaptor is getting this wrong.  Any way we can
> >>>>> somehow enable strict mode?
> >>>>>
> >>>>> http://dev.mysql.com/doc/refman/5.0/en/server-sql-mode.html
> >>>>>
> >>>>> chris
> >>>>>
> >>>>> On Apr 8, 2011, at 4:32 PM, Scott Cain wrote:
> >>>>>
> >>>>>> Argh!  MySQL is not an RDBMS!  Anyone who tells you otherwise is lying!
> >>>>>>
> >>>>>> The first test for the Pg SFS adaptor is failing because it is
> >>>>>> trying to execute this query (which it inherited from the mysql
> >>>>>> adaptor, where it works just fine):
> >>>>>>
> >>>>>> select id,object FROM bioperl_seqfeature_t_test_schema_feature where
> >>>>>> id='doesnotexit';
> >>>>>>
> >>>>>> Of course, the id column is defined as an integer column.  MySQL must
> >>>>>> be silently casting this string to an integer value (? I guess
> >>>>>> anyway--who knows).  Anyway, PostgreSQL does the right thing and
> >>>>>> throws an error with this query.  I don't see how I can make the
> >>>>>> Postgres adaptor pass this test as written, as it is nonsensical.
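
One way to avoid sending such a query at all is to check the ID before it
reaches the database. This is only a sketch of that alternative, not what the
mysql or Pg adaptors do: is_integer_id is a hypothetical helper, and the
messages printed are purely illustrative.

```perl
use strict;
use warnings;

# Hypothetical guard: true only for strings that are a plain (optionally
# negative) integer, i.e. values an integer id column can be compared with.
sub is_integer_id {
    my ($id) = @_;
    return ( defined($id) && $id =~ /\A-?\d+\z/ ) ? 1 : 0;
}

# Show which lookups would be allowed through to the database.
for my $id ( 'doesnotexit', '-1', '42' ) {
    printf "%-12s %s\n", $id,
        is_integer_id($id)
        ? 'query the database'
        : 'return undef without querying';
}
```

With a guard like this, a string ID never reaches the integer column, so both
databases would see the same well-typed query.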
> >>>>>>
> >>>>>> Scott
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Apr 8, 2011 at 2:56 PM, Chris Fields <cjfields at illinois.edu> wrote:
> >>>>>>> Scott,
> >>>>>>>
> >>>>>>> I'll try documenting the Pg error for SF::Store in the next hour.  Had
> >>>>>>> my hands full with the GSoC onslaught of emails and local $job stuff.
> >>>>>>> Would like to get it fixed for the CPAN release.
> >>>>>>>
> >>>>>>> chris
> >>>>>>>
> >>>>>>> On Apr 8, 2011, at 12:58 PM, Scott Cain wrote:
> >>>>>>>
> >>>>>>>> OK, I'll take it out and move on to the next problem.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Scott
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Apr 8, 2011 at 1:51 PM, Lincoln Stein <lincoln.stein at gmail.com> wrote:
> >>>>>>>>> Oh right. The Bio::DB::GFF adaptor has that broken behavior and it is
> >>>>>>>>> too late to change it now. (Bio::DB::SeqFeature::Store had better
> >>>>>>>>> not!) Best to remove the test altogether.
> >>>>>>>>> Lincoln
> >>>>>>>>>
> >>>>>>>>> On Fri, Apr 8, 2011 at 1:18 PM, Scott Cain <scott at scottcain.net> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Lincoln,
> >>>>>>>>>>
> >>>>>>>>>> Yes, apparently, it does.  It does this for both the memory and the
> >>>>>>>>>> postgres adaptors.  I looked at how the data was stored in the feature
> >>>>>>>>>> object with Data::Dumper and that is how it is represented in the hash
> >>>>>>>>>> too.  Perhaps this test should be calling the "absolute" method first?
> >>>>>>>>>>
> >>>>>>>>>> Scott
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Apr 8, 2011 at 1:10 PM, Lincoln Stein <lincoln.stein at gmail.com> wrote:
> >>>>>>>>>>> Do start() and end() flip values for minus strand features? This isn't
> >>>>>>>>>>> supposed to happen.
> >>>>>>>>>>> Lincoln
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Apr 8, 2011 at 11:41 AM, Scott Cain <scott at scottcain.net> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi Lincoln,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I've been looking into some test failures with the postgres adaptor
> >>>>>>>>>>>> for Bio::DB::GFF and I wanted to check with you that I'm interpreting
> >>>>>>>>>>>> this correctly.  In t/LocalDB/BioDBGFF.t there are these lines:
> >>>>>>>>>>>>
> >>>>>>>>>>>> @features = sort {$a->start<=>$b->start} @features;
> >>>>>>>>>>>>
> >>>>>>>>>>>> is($features[0]->type,'Component:reference');
> >>>>>>>>>>>> is($features[-1]->type,'exon:confirmed');
> >>>>>>>>>>>>
> >>>>>>>>>>>> So that the features in the data set are sorted by their start values
> >>>>>>>>>>>> and the beginning and end of the list are checked.  The test refers to
> >>>>>>>>>>>> the test.gff data file, that contains among others these lines:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Contig1 confirmed   transcript      30001   31000   .   -   .   Transcript trans-2; Gene "xyz-2"; Note "Terribly interesting"
> >>>>>>>>>>>> Contig1 confirmed   exon    30001   30100   .   -   .   Transcript trans-2; Gene "abc-1"; Note "function unknown"
> >>>>>>>>>>>> Contig1 confirmed   exon    30701   30800   .   -   .   Transcript trans-2
> >>>>>>>>>>>> Contig1 confirmed   exon    30801   31000   .   -   .   Transcript trans-2
> >>>>>>>>>>>>
> >>>>>>>>>>>> Since this transcript and its exons are on the minus strand, the
> >>>>>>>>>>>> values that the start and stop method return will be reversed, so that
> >>>>>>>>>>>> start for the transcript will be 31000 and stop will be 30001.  The
> >>>>>>>>>>>> problem with this test is since the last exon and the transcript share
> >>>>>>>>>>>> a start value (31000), you can't really be sure which one will be at
> >>>>>>>>>>>> the bottom of the list after sorting, right?  In the case of the
> >>>>>>>>>>>> postgres adaptor, it fails this test on my machine because the
> >>>>>>>>>>>> transcript is at the bottom of the list.  The test for the beginning
> >>>>>>>>>>>> of the list similarly could fail though it didn't in my case, as other
> >>>>>>>>>>>> features that have 1 as a start are of type "Component:clone".
> >>>>>>>>>>>>
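
The ambiguity can be removed by breaking ties explicitly. A minimal sketch,
using plain hashes as hypothetical stand-ins for the feature objects and the
flipped start values described above; the secondary key (type, descending) is
an arbitrary choice made only so that the result is deterministic:

```perl
use strict;
use warnings;

# Hypothetical stand-ins for feature objects: only type and start are kept.
my @features = (
    { type => 'exon:confirmed',       start => 30100 },
    { type => 'transcript:confirmed', start => 31000 },
    { type => 'exon:confirmed',       start => 30800 },
    { type => 'exon:confirmed',       start => 31000 },
);

# Sort by start, then break ties on type (descending) so that features
# sharing a start value always come out in the same relative order.
sub sort_features {
    my @in = @_;
    return sort {
        $a->{start} <=> $b->{start}
            or
        $b->{type} cmp $a->{type}
    } @in;
}

# The last element is the same whatever order the rows arrive in.
for my $input ( [@features], [ reverse @features ] ) {
    my @sorted = sort_features(@$input);
    print "$sorted[-1]{type}\n";
}
```

Both passes print exon:confirmed, because the transcript and the exon that
share start 31000 are now ordered by the tie-breaker rather than by the order
in which the database happened to return them.
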
> >>>>>>>>>>>> So, my question is this: am I missing something, and the postgres
> >>>>>>>>>>>> adaptor is not behaving as expected, or are these tests ambiguous?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Scott
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>> ------------------------------------------------------------------------
> >>>>>>>>>>>> Scott Cain, Ph. D.                                   scott at scottcain dot net
> >>>>>>>>>>>> GMOD Coordinator (http://gmod.org/)                     216-392-3087
> >>>>>>>>>>>> Ontario Institute for Cancer Research
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Lincoln D. Stein
> >>>>>>>>>>> Director, Informatics and Biocomputing Platform
> >>>>>>>>>>> Ontario Institute for Cancer Research
> >>>>>>>>>>> 101 College St., Suite 800
> >>>>>>>>>>> Toronto, ON, Canada M5G0A3
> >>>>>>>>>>> 416 673-8514
> >>>>>>>>>>> Assistant: Renata Musa <Renata.Musa at oicr.on.ca>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> Bioperl-l mailing list
> >>>>>>>> Bioperl-l at lists.open-bio.org
> >>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >>
> >>
> >
> >
> >
>
>


-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>


