[Bioperl-l] `get_feature_by_name` not working after migrating to Bio::DB::SeqFeature::Store from a Bio::DB::GFF backend

Scott Cain scott at scottcain.net
Wed Feb 1 19:44:56 UTC 2012


/scott realizes he should refresh his mail client before sending
helpful emails :-)


On Wed, Feb 1, 2012 at 2:43 PM, Scott Cain <scott at scottcain.net> wrote:
> Hi Vivek,
>
> What I don't understand from the error stack that you copied in your
> email is that it doesn't mention the get_features_by_name method,
> rather it mentions the segment method--are you sure that the line
> number reported in your get_annotation_db_features method corresponds
> to the change you made in the get_features_by_name call?  I get the
> feeling that it is dying somewhere else now.  I would say that
> resorting to adding class to the feature names is unlikely to help.
>
> Scott
>
>
> On Wed, Feb 1, 2012 at 11:51 AM, Vivek Krishnakumar
> <vivekkrishnakumar at gmail.com> wrote:
>> Hi Scott,
>>
>> Thanks very much for your suggestions. Looks I did miss it somehow
>> (confusion was caused because I was using both bioperl-l at googlegroups and
>> bioperl-l at open-bio)
>>
>> Anyway, I had modified my function exactly like your suggestion:
>>
>> my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus, -type =>
>> 'gene');
>>
>> But doing so just returns the following error:
>>
>> -------------------- EXCEPTION --------------------
>> MSG: segment() called in a scalar context but multiple features match.
>> Either call in a list context or narrow your search using the -types or
>> -class arguments
>>
>> STACK Bio::DB::SeqFeature::Store::segment
>> /usr/local/packages/perl-5.10.1/lib/5.10.1/Bio/DB/SeqFeature/Store.pm:1322
>> STACK main::get_annotation_db_features
>> /opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:899
>> STACK main::structural_annotation
>> /opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:660
>> STACK toplevel /opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:119
>> -------------------------------------------
>>
>> which would suggest to oneself that there are several such features with the
>> same ID. But in fact, I was able to verify by querying the database that I
>> have only one such locus.
>>
>> As for your question regarding how $locus is populated, it is populated from
>> a CGI parameter passed to the script. I know that I am only passing it one
>> locus ID. And as I mentioned earlier in this thread, the warning statement I
>> inserted before making the function call shows me that there is only one ID
>> in the $locus variable.
>>
>> My last resort now is to try as you suggested and modify my GFF3 file and
>> embed the -class => 'Gene' into the "Name" attribute. While doing so, should
>> I also embed the 'mRNA' class into the "Name" attribute of the mRNA feature
>> like so:
>>
>> chr2    working_models  mRNA    30427563    30429139    .   -   .
>> ID=mrna_36255;Parent=gene_35804;Name=mRNA:Medtr2g097580.1;conf_class=F
>>
>> Subsequently, should I modify the function call to include the 'class':
>> my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus, -type =>
>> 'gene', -class => 'Gene');
>>
>> Thank you.
>> Vivek
>>
>>
>> On Wed, Feb 1, 2012 at 11:26 AM, Scott Cain <scott at scottcain.net> wrote:
>>>
>>> Hi Vivek,
>>>
>>> I responded to your original email and I suspect you may have missed
>>> it.  I'll copy it below.  Another few things: how does $locus get
>>> populated?  Are you sure what you expect to be there is?
>>>
>>> Also, to answer your question about the other bioperl modules you're
>>> using: no, I don't think that's interfering.
>>>
>>> Scott
>>>
>>> ------------------------------------------------
>>> Hello Vivek,
>>>
>>> In your GFF3, you don't have any features of class "Gene".  In GFF2,
>>> the class was the text string that started the ninth column, like
>>> this:
>>>
>>> chr2   .  gene    30427563    30429139    .   -   .  Gene 35804
>>>
>>> where the class would be Gene and the name (also called group) would
>>> be 35804.  Class is not a particularly well defined concept in GFF3,
>>> so the easiest way to restore functionality to your script is to
>>> change the call from this:
>>>
>>>  my ($locus_obj) = $gff_dbh->get_feature_by_name('Gene' => $locus);
>>>
>>> to this:
>>>
>>>  my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus,
>>> -type => 'gene');
>>>
>>> I believe (though haven't tested it myself in a very long time) that
>>> you can embed class in the name of the feature, like this:
>>>
>>> chr2  .  gene    30427563    30429139    .   -   .
>>>  Name=Gene:Medtr2g097580
>>>
>>> which may or may not be easier, depending on your data and your code base.
>>>
>>> Scott
>>>
>>>
>>> On Wed, Feb 1, 2012 at 10:50 AM, Vivek Krishnakumar
>>> <vivekkrishnakumar at gmail.com> wrote:
>>> > Hi Lincoln,
>>> >
>>> > Thanks very much for your suggestions. Not sure how the single quotation
>>> > marks appeared around the $locus variable. But looks like it was only in
>>> > the email. Fortunately did not have quotes around the variable in my
>>> > original code.
>>> >
>>> > Now, when I switch over to 'get_features_by_name()', my script does not
>>> > run
>>> > to completion.
>>> >
>>> > I want to mention that this snippet of code is part of a larger CGI
>>> > script
>>> > that interfaces with the SeqFeature backend DB. When I modify the
>>> > function
>>> > call to $gff_dbh->get_features_by_name($locus), the script just runs
>>> > indefinitely and returns absolutely nothing. I did put in a warn
>>> > statement
>>> > to see if the correct locus ID is being passed to the function (I am
>>> > able
>>> > to see the warning message in my apache error log), which seems to be
>>> > fine.
>>> > But the moment it reaches the function call step, the CGI script freezes
>>> > up
>>> > and I am unable to do anything. It just ends up as a rogue process owned
>>> > by
>>> > the 'daemon' user and continues to use up a lot of memory.
>>> >
>>> > I am using the following BioPerl modules in this CGI script:
>>> > use Bio::SeqIO;
>>> > use Bio::SearchIO;
>>> > use Bio::DB::SeqFeature::Store;
>>> > use Bio::SeqFeature::Generic;
>>> > use Bio::Graphics;
>>> > use Bio::Graphics::Feature;
>>> >
>>> > Could any of these be interfering with get_features_by_name()?
>>> >
>>> > Thank you.
>>> > Vivek
>>> > _______________________________________________
>>> > Bioperl-l mailing list
>>> > Bioperl-l at lists.open-bio.org
>>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>>
>>> --
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D.                                   scott at scottcain
>>> dot net
>>> GMOD Coordinator (http://gmod.org/)                     216-392-3087
>>> Ontario Institute for Cancer Research
>>
>>
>
>
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/)                     216-392-3087
> Ontario Institute for Cancer Research



-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research




More information about the Bioperl-l mailing list