[Bioperl-l] `get_feature_by_name` not working after migrating to Bio::DB::SeqFeature::Store from a Bio::DB::GFF backend

Vivek Krishnakumar vivekkrishnakumar at gmail.com
Wed Feb 1 16:51:46 UTC 2012


Hi Scott,

Thanks very much for your suggestions. Looks I did miss it somehow
(confusion was caused because I was using both bioperl-l at googlegroups and
bioperl-l at open-bio)

Anyway, I had modified my function exactly like your suggestion:
my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus, -type =>
'gene');

But doing so just returns the following error:

-------------------- EXCEPTION --------------------
MSG: segment() called in a scalar context but multiple features match.
Either call in a list context or narrow your search using the -types
or -class arguments

STACK Bio::DB::SeqFeature::Store::segment
/usr/local/packages/perl-5.10.1/lib/5.10.1/Bio/DB/SeqFeature/Store.pm:1322
STACK main::get_annotation_db_features
/opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:899
STACK main::structural_annotation
/opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:660
STACK toplevel /opt/www/medicago/cgi-bin/medicago/eucap/eucap.pl:119
-------------------------------------------

which would suggest to oneself that there are several such features with
the same ID. But in fact, I was able to verify by querying the database
that I have only one such locus.

As for your question regarding how $locus is populated, it is populated
from a CGI parameter passed to the script. I know that I am only passing it
one locus ID. And as I mentioned earlier in this thread, the warning
statement I inserted before making the function call shows me that there is
only one ID in the $locus variable.

My last resort now is to try as you suggested and modify my GFF3 file and
embed the -class => 'Gene' into the "Name" attribute. While doing so,
should I also embed the 'mRNA' class into the "Name" attribute of the mRNA
feature like so:

chr2    working_models  mRNA    30427563    30429139    .   -   .
ID=mrna_36255;Parent=gene_35804;Name=mRNA:Medtr2g097580.1;conf_class=F

Subsequently, should I modify the function call to include the 'class':
my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus, -type =>
'gene', -class => 'Gene');

Thank you.
Vivek

On Wed, Feb 1, 2012 at 11:26 AM, Scott Cain <scott at scottcain.net> wrote:

> Hi Vivek,
>
> I responded to your original email and I suspect you may have missed
> it.  I'll copy it below.  Another few things: how does $locus get
> populated?  Are you sure what you expect to be there is?
>
> Also, to answer your question about the other bioperl modules you're
> using: no, I don't think that's interfering.
>
> Scott
>
> ------------------------------------------------
> Hello Vivek,
>
> In your GFF3, you don't have any features of class "Gene".  In GFF2,
> the class was the text string that started the ninth column, like
> this:
>
> chr2   .  gene    30427563    30429139    .   -   .  Gene 35804
>
> where the class would be Gene and the name (also called group) would
> be 35804.  Class is not a particularly well defined concept in GFF3,
> so the easiest way to restore functionality to your script is to
> change the call from this:
>
>  my ($locus_obj) = $gff_dbh->get_feature_by_name('Gene' => $locus);
>
> to this:
>
>  my ($locus_obj) = $gff_dbh->get_features_by_name(-name => $locus,
> -type => 'gene');
>
> I believe (though haven't tested it myself in a very long time) that
> you can embed class in the name of the feature, like this:
>
> chr2  .  gene    30427563    30429139    .   -   .  Name=Gene:Medtr2g097580
>
> which may or may not be easier, depending on your data and your code base.
>
> Scott
>
>
> On Wed, Feb 1, 2012 at 10:50 AM, Vivek Krishnakumar
> <vivekkrishnakumar at gmail.com> wrote:
> > Hi Lincoln,
> >
> > Thanks very much for your suggestions. Not sure how the single quotation
> > marks appeared around the $locus variable. But looks like it was only in
> > the email. Fortunately did not have quotes around the variable in my
> > original code.
> >
> > Now, when I switch over to 'get_features_by_name()', my script does not
> run
> > to completion.
> >
> > I want to mention that this snippet of code is part of a larger CGI
> script
> > that interfaces with the SeqFeature backend DB. When I modify the
> function
> > call to $gff_dbh->get_features_by_name($locus), the script just runs
> > indefinitely and returns absolutely nothing. I did put in a warn
> statement
> > to see if the correct locus ID is being passed to the function (I am able
> > to see the warning message in my apache error log), which seems to be
> fine.
> > But the moment it reaches the function call step, the CGI script freezes
> up
> > and I am unable to do anything. It just ends up as a rogue process owned
> by
> > the 'daemon' user and continues to use up a lot of memory.
> >
> > I am using the following BioPerl modules in this CGI script:
> > use Bio::SeqIO;
> > use Bio::SearchIO;
> > use Bio::DB::SeqFeature::Store;
> > use Bio::SeqFeature::Generic;
> > use Bio::Graphics;
> > use Bio::Graphics::Feature;
> >
> > Could any of these be interfering with get_features_by_name()?
> >
> > Thank you.
> > Vivek
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   scott at scottcain
> dot net
> GMOD Coordinator (http://gmod.org/)                     216-392-3087
> Ontario Institute for Cancer Research
>



More information about the Bioperl-l mailing list