[Bioperl-l] [Gmod-schema] Trying to load my first database
Scott Cain
scott at scottcain.net
Fri May 21 18:15:20 UTC 2010
Hi Daniel,
I'm cc'ing the MAKER and BioPerl lists, since this bug is germane to both lists.
Of course, the file you sent me would be the same file you sent me
yesterday; sorry for my poor memory :-)
This file uncovered a bug in the BioPerl FeatureIO module. While
fixing the bug may be difficult, working around it should not be too
bad. I'm also not sure we should fix it right now, as there is an
effort underway to rework this section of BioPerl anyway. The
good news is that the workaround is fairly simple.
When MAKER parses Prodigal output, the GFF it creates contains lines
like this:
Contig125 pred_gff:prodigal_v2.00 match 104 1723 157.5 + . ID=Contig125:hit:75;Name=pred_gff_Prodigal_v2.00-Contig125-abinit-gene-0.0-mRNA-1;_AED=0.25;cscore=151.05;partial=00;rbs_motif=AGGA;rbs_spacer=5-10bp;rscore=3.57;score=157.5,157.53;sscore=6.48;tscore=3.50;type=ATG;uscore=-0.59;
The tricky part is this tag/value in the ninth column: type=ATG. The
tag "type" is (semi) reserved in Bio::FeatureIO::gff to mean what is
in the third column, so when it is parsing this line of GFF, it tries
to reassign the feature type to something that isn't valid. The
workaround is pretty easy: since "type" is a problematic tag, and it
appears that the type tag here is defining the start type, I would
suggest doing a global search and replace on the file to replace
"type=" with "start_type=". I did that and the file loaded fine. I
don't know whether it is MAKER or the BioPerl parser for Prodigal
that creates this tag, but changing it at the source might be nice
(of course, it might also break somebody else's code :-/). I'll enter
a bug for this in the BioPerl bug tracker.
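
For anyone who would rather not run a blind search and replace, here is
a minimal sketch of the same rename in Python (not Perl, and not part of
BioPerl, MAKER, or the GMOD loader). It only touches a tag named exactly
"type" in the ninth column, so tags such as "start_type" are left alone
if the script is run twice. The function name rename_attribute_tag is
made up for illustration, and the sketch assumes tab-separated,
nine-column GFF3 lines with attributes separated by semicolons.

```python
def rename_attribute_tag(line, old="type", new="start_type"):
    """Rename one attribute tag in column 9 of a single GFF3 line.

    Hypothetical helper for illustration only. Comment lines, blank
    lines, and anything without exactly nine tab-separated columns
    (for example sequence data after ##FASTA) are returned unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line
    body = line.rstrip("\r\n")
    line_ending = line[len(body):]      # keep the original line ending
    cols = body.split("\t")
    if len(cols) != 9:
        return line
    prefix = old + "="
    attrs = cols[8].split(";")
    # Only rename attributes whose tag is exactly `old`.
    cols[8] = ";".join(
        new + "=" + attr[len(prefix):] if attr.startswith(prefix) else attr
        for attr in attrs
    )
    return "\t".join(cols) + line_ending


# Shortened version of the example line above, with tabs restored.
sample = "\t".join([
    "Contig125", "pred_gff:prodigal_v2.00", "match", "104", "1723",
    "157.5", "+", ".",
    "ID=Contig125:hit:75;tscore=3.50;type=ATG;uscore=-0.59;",
]) + "\n"
print(rename_attribute_tag(sample), end="")
```

To convert a whole file, apply the function to every line and write the
results to a new file, then point the loader at the new file.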
Scott
On Fri, May 21, 2010 at 1:40 PM, Daniel Quest <daniel.quest at gmail.com> wrote:
> Hi Scott,
>
> I used Maker to generate the attached file.
>
> -Daniel
>
> On Fri, May 21, 2010 at 10:34 AM, Scott Cain <scott at scottcain.net> wrote:
>> Hi Daniel,
>>
>> Please keep the schema mailing list cc'ed in so the responses can be
>> archived and more eyes than just mine can try to solve the problem.
>>
>> Can you send a sample of the GFF that is causing the problem? Any
>> ontology term that is in Chado should be "legal." If there's
>> something causing a problem, we need to figure out what it is.
>>
>> Scott
>>
>>
>> On Fri, May 21, 2010 at 1:24 PM, Daniel Quest <daniel.quest at gmail.com> wrote:
>>> Hi Scott,
>>>
>>> I am using the same image as we used in class. I was able to load
>>> each of the examples in the GMOD course (Pythium) and on the Chado
>>> website (yeast).
>>>
>>> On another note, is there an easy way to navigate the ontology terms
>>> that are legal and standard in both GFF3 and Chado? I am having
>>> trouble understanding how to convert from an arbitrary analysis (e.g.
>>> Blasting KEGG) into a format that works.
>>>
>>> Thanks so much!
>>> -Daniel
>>>
>>> On Fri, May 21, 2010 at 9:41 AM, Scott Cain <scott at scottcain.net> wrote:
>>>> Hi Daniel,
>>>>
>>>> That error message looks like one that would come from an older
>>>> version of BioPerl. What version do you have?
>>>>
>>>> Scott
>>>>
>>>>
>>>> On Fri, May 21, 2010 at 11:51 AM, Daniel Quest <daniel.quest at gmail.com> wrote:
>>>>> Hi Scott,
>>>>>
>>>>> Thanks for the reply. Sorry, I should have been able to track down
>>>>> that error. Could you tell me what the following error means?
>>>>>
>>>>> gmod at ubuntu:~/Cthe/ProdigalONLYcthe.maker.output/cthe_datastore/Contig125$
>>>>> gmod_bulk_load_gff3.pl --organism Cthe -a -g
>>>>> /home/gmod/Cthe/ProdigalONLYcthe.maker.output/cthe_datastore/Contig125/Contig125.gff
>>>>> --noexon --recreate_cache
>>>>> (Re)creating the uniquename cache in the database...
>>>>> Creating table...
>>>>> Populating table...
>>>>> Creating indexes...
>>>>> Adjusting the primary key sequences (if necessary)...Done.
>>>>> Preparing data for inserting into the chado database
>>>>> (This may take a while ...)
>>>>>
>>>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>>>> MSG: Object Bio::Annotation::SimpleValue=HASH(0xa858ac8) was not valid
>>>>> with key type. If you were adding new keys in, perhaps you want to
>>>>> make use
>>>>> of the archetype method to allow registration to a more basic type
>>>>> STACK: Error::throw
>>>>> STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.0/Bio/Root/Root.pm:368
>>>>> STACK: Bio::Annotation::Collection::add_Annotation
>>>>> /usr/local/share/perl/5.10.0/Bio/Annotation/Collection.pm:361
>>>>> STACK: Bio::SeqFeature::Annotated::add_Annotation
>>>>> /usr/local/share/perl/5.10.0/Bio/SeqFeature/Annotated.pm:609
>>>>> STACK: Bio::FeatureIO::gff::_handle_non_reserved_tag
>>>>> /usr/local/share/perl/5.10.0/Bio/FeatureIO/gff.pm:797
>>>>> STACK: Bio::FeatureIO::gff::_handle_feature
>>>>> /usr/local/share/perl/5.10.0/Bio/FeatureIO/gff.pm:752
>>>>> STACK: Bio::FeatureIO::gff::next_feature
>>>>> /usr/local/share/perl/5.10.0/Bio/FeatureIO/gff.pm:172
>>>>> STACK: /usr/local/bin/gmod_bulk_load_gff3.pl:775
>>>>> -----------------------------------------------------------
>>>>>
>>>>> Abnormal termination, trying to clean up...
>>>>>
>>>>> Attempting to clean up the loader temp table (so that --recreate_cache
>>>>> won't be needed)...
>>>>> Trying to remove the run lock (so that --remove_lock won't be needed)...
>>>>> Exiting...
>>>>>
>>>>>
>>>>> Thanks so much!
>>>>> -Daniel
>>>>>
>>>>> On Thu, May 20, 2010 at 6:20 PM, Scott Cain <scott at scottcain.net> wrote:
>>>>>> Hi Daniel,
>>>>>>
>>>>>> The error message you got said that the GFF file that you are trying
>>>>>> to load couldn't be found; are you sure the path was correct? The
>>>>>> file itself looks OK.
>>>>>>
>>>>>> Scott
>>>>>>
>>>>>>
>>>>>> On Thu, May 20, 2010 at 5:06 PM, Daniel Quest <daniel.quest at gmail.com> wrote:
>>>>>>> Hello All,
>>>>>>>
>>>>>>> I am trying to load my first genome from maker. Not sure what the
>>>>>>> problem is... any help is awesome! I am attaching at least part of
>>>>>>> the dataset.
>>>>>>>
>>>>>>> -Daniel
>>>>>>>
>>>>>>>
>>>>>>> gmod at ubuntu:~/Cthe/cthe.maker.output/cthe_datastore/Contig125$
>>>>>>> gmod_bulk_load_gff3.pl --organism Cthe -a -g
>>>>>>> /home/gmod/Cthe/cthe.maker.output/cthe_datastore/Contig125.gff3
>>>>>>> --noexon
>>>>>>>
>>>>>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>>>>>> MSG: Could not open
>>>>>>> /home/gmod/Cthe/cthe.maker.output/cthe_datastore/Contig125.gff3: No
>>>>>>> such file or directory
>>>>>>> STACK: Error::throw
>>>>>>> STACK: Bio::Root::Root::throw /usr/local/share/perl/5.10.0/Bio/Root/Root.pm:368
>>>>>>> STACK: Bio::Root::IO::_initialize_io
>>>>>>> /usr/local/share/perl/5.10.0/Bio/Root/IO.pm:341
>>>>>>> STACK: Bio::FeatureIO::_initialize
>>>>>>> /usr/local/share/perl/5.10.0/Bio/FeatureIO.pm:353
>>>>>>> STACK: Bio::FeatureIO::gff::_initialize
>>>>>>> /usr/local/share/perl/5.10.0/Bio/FeatureIO/gff.pm:102
>>>>>>> STACK: Bio::FeatureIO::new /usr/local/share/perl/5.10.0/Bio/FeatureIO.pm:276
>>>>>>> STACK: Bio::FeatureIO::new /usr/local/share/perl/5.10.0/Bio/FeatureIO.pm:296
>>>>>>> STACK: /usr/local/bin/gmod_bulk_load_gff3.pl:720
>>>>>>> -----------------------------------------------------------
>>>>>>>
>>>>>>> Abnormal termination, trying to clean up...
>>>>>>>
>>>>>>> Trying to remove the run lock (so that --remove_lock won't be needed)...
>>>>>>> Exiting...
>>>>>>> gmod at ubuntu:~/Cthe/cthe.maker.output/cthe_datastore/Contig125$
>>>>>>>
>>>>>>> ------------------------------------------------------------------------------
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Gmod-schema mailing list
>>>>>>> Gmod-schema at lists.sourceforge.net
>>>>>>> https://lists.sourceforge.net/lists/listinfo/gmod-schema
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> ------------------------------------------------------------------------
>>>>>> Scott Cain, Ph. D. scott at scottcain dot net
>>>>>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>>>>>> Ontario Institute for Cancer Research
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> ------------------------------------------------------------------------
>>>> Scott Cain, Ph. D. scott at scottcain dot net
>>>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>>>> Ontario Institute for Cancer Research
>>>>
>>>
>>
>>
>>
>> --
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D. scott at scottcain dot net
>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>> Ontario Institute for Cancer Research
>>
>
--
------------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research