[Bioperl-l] `get_feature_by_name` not working after migrating to Bio::DB::SeqFeature::Store from a Bio::DB::GFF backend
Vivek Krishnakumar
vivekkrishnakumar at gmail.com
Wed Feb 1 15:14:05 UTC 2012
Hello,
I would like to state my problem: Previously, i was working with a
Bio::DB:GFF backend database and was able to retrieve features from the
database by feature name using the function 'get_feature_by_name()'. But
recently, when I made the switch to Bio::DB::SeqFeature::Store to store the
exact same data points as before in the new db, invoking the
'get_feature_by_name()' function returns nothing.
Please refer to the following code snippet which is supposed to retrieve a
certain gene "locus" feature from the backend database and use that to get
the 'start' and 'end' coordinate of the feature based on the 'strand' on
which it is present.
my ($locus_obj, $gene_models) = get_annotation_db_features($locus,
$gff_dbh);
sub get_annotation_db_features {
my ($locus, $gff_dbh) = @_;
my ($locus_obj) = $gff_dbh->get_feature_by_name('Gene' => '$locus');
my ($end5, $end3) = $locus_obj->strand == 1
? ($locus_obj->start, $locus_obj->end)
: ($locus_obj->end, $locus_obj->start);
my $segment = $gff_dbh->segment($locus_obj->refseq, $end5, $end3);
my @gene_models =
$segment->features('processed_transcript:working_models', -attributes
=> { 'Gene' => $locus });
#will have to sort the gene models
return ($locus_obj, \@gene_models);
}
Also, here is a snippet of the GFF3 file that is used to populate the
backend database:
##gff-version 3
chr2 working_models gene 30427563 30429139 . - .
ID=gene_35804;Note=Zinc transporter;Name=Medtr2g097580
chr2 working_models mRNA 30427563 30429139 . - .
ID=mrna_36255;Parent=gene_35804;Name=Medtr2g097580.1;conf_class=F
chr2 working_models exon 30428491 30429139 . - .
ID=exon_120028;Parent=mrna_36255
chr2 working_models exon 30427563 30428147 . - .
ID=exon_120029;Parent=mrna_36255
chr2 working_models CDS 30428491 30429109 . - 0
ID=cds_120028;Parent=mrna_36255
chr2 working_models CDS 30427756 30428147 . - 2
ID=cds_120029;Parent=mrna_36255
Considering that this gene locus is unique in the entire GFF file, If this
above GFF is loaded into the SeqFeature::Store database, you would expect
that running the following query should yield a count of "1":
SELECT count(f.id) FROM feature as f, name as n WHERE
(n.id=f.id<https://owa.jcvi.org/OWA/redir.aspx?C=6752206bd7374e2483702b789422d7b3&URL=http%3a%2f%2ff.id%2f>
AND
n.name = 'Medtr2g097580' AND n.display_name > 0);
And in my case, it does yield "1".
But if I use the function *get_feature_by_name*, retrieve the locus object
and try to get the strand of this object, I get the following error:
[error] Can't call method "strand" on an undefined value at
get_db_features.pl line 910.
As specified earlier, I do not have any problems if the backend is
Bio::DB::GFF.
>From the SeqFeature::Store documentation, we know that
'get_feature_by_name()' is in place for backward compatibility, I even
tried modifying the code snippet to call 'get_features_by_name()' instead,
but no matter what, I do not get any locus object back from this subroutine!
Could someone please guide me in the right direction and let me know if I
am making any mistakes here when migrating from GFF2 to GFF3?
Thanks in advance.
~ Vivek
More information about the Bioperl-l
mailing list