[Bioperl-l] `get_feature_by_name` not working after migrating to Bio::DB::SeqFeature::Store from a Bio::DB::GFF backend

Lincoln Stein lincoln.stein at gmail.com
Wed Feb 1 15:20:12 UTC 2012


Try not putting single quotation marks around $locus, as you are searching
for a gene named (literally) "$locus". If this doesn't work,
use $gff_dbh->get_features_by_name($locus) instead, as I'm not sure whether
the Class=>Identifier syntax is accepted by Bio::DB::SeqFeature::Store.

Lincoln

On Wed, Feb 1, 2012 at 10:14 AM, Vivek Krishnakumar <
vivekkrishnakumar at gmail.com> wrote:

> Hello,
>
> I would like to state my problem: Previously, i was working with a
> Bio::DB:GFF backend database and was able to retrieve features from the
> database by feature name using the function 'get_feature_by_name()'. But
> recently, when I made the switch to Bio::DB::SeqFeature::Store to store the
> exact same data points as before in the new db, invoking the
> 'get_feature_by_name()' function returns nothing.
>
> Please refer to the following code snippet which is supposed to retrieve a
> certain gene "locus" feature from the backend database and use that to get
> the 'start' and 'end' coordinate of the feature based on the 'strand' on
> which it is present.
>
> my ($locus_obj, $gene_models) = get_annotation_db_features($locus,
> $gff_dbh);
>
> sub get_annotation_db_features {
>    my ($locus, $gff_dbh) = @_;
>
>    my ($locus_obj) = $gff_dbh->get_feature_by_name('Gene' => '$locus');
>
>    my ($end5, $end3) = $locus_obj->strand == 1
>      ? ($locus_obj->start, $locus_obj->end)
>      : ($locus_obj->end, $locus_obj->start);
>
>    my $segment = $gff_dbh->segment($locus_obj->refseq, $end5, $end3);
>    my @gene_models =
>      $segment->features('processed_transcript:working_models', -attributes
> => { 'Gene' => $locus });
>
>    #will have to sort the gene models
>    return ($locus_obj, \@gene_models);
> }
>
> Also, here is a snippet of the GFF3 file that is used to populate the
> backend database:
> ##gff-version 3
> chr2    working_models  gene    30427563    30429139    .   -   .
> ID=gene_35804;Note=Zinc transporter;Name=Medtr2g097580
> chr2    working_models  mRNA    30427563    30429139    .   -   .
> ID=mrna_36255;Parent=gene_35804;Name=Medtr2g097580.1;conf_class=F
> chr2    working_models  exon    30428491    30429139    .   -   .
> ID=exon_120028;Parent=mrna_36255
> chr2    working_models  exon    30427563    30428147    .   -   .
> ID=exon_120029;Parent=mrna_36255
> chr2    working_models  CDS 30428491    30429109    .   -   0
> ID=cds_120028;Parent=mrna_36255
> chr2    working_models  CDS 30427756    30428147    .   -   2
> ID=cds_120029;Parent=mrna_36255
>
> Considering that this gene locus is unique in the entire GFF file, If this
> above GFF is loaded into the SeqFeature::Store database, you would expect
> that running the following query should yield a count of "1":
>
> SELECT count(f.id) FROM feature as f, name as n WHERE
> (n.id=f.id<
> https://owa.jcvi.org/OWA/redir.aspx?C=6752206bd7374e2483702b789422d7b3&URL=http%3a%2f%2ff.id%2f
> >
> AND
> n.name = 'Medtr2g097580' AND n.display_name > 0);
>
> And in my case, it does yield "1".
>
> But if I use the function *get_feature_by_name*, retrieve the locus object
> and try to get the strand of this object, I get the following error:
>
> [error] Can't call method "strand" on an undefined value at
> get_db_features.pl line 910.
>
> As specified earlier, I do not have any problems if the backend is
> Bio::DB::GFF.
>
> >From the SeqFeature::Store documentation, we know that
> 'get_feature_by_name()' is in place for backward compatibility, I even
> tried modifying the code snippet to call 'get_features_by_name()' instead,
> but no matter what, I do not get any locus object back from this
> subroutine!
>
> Could someone please guide me in the right direction and let me know if I
> am making any mistakes here when migrating from GFF2 to GFF3?
> Thanks in advance.
> ~ Vivek
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>



-- 
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa at oicr.on.ca>



More information about the Bioperl-l mailing list