[BioSQL-l] CDS feature for genome bioentry

albert vilella vilella at bio.ub.es
Mon Jul 14 22:27:45 EDT 2003


> The CDS should have a /gene tag too, shouldn't it? Or some other tag that 
> identifies which gene it is the CDS for?

Yes, usually the same /gene tag is present in the gene entry and in the
subsequent CDS entry. But I'm not sure if this is mandatory.
> 
> 
> > So the query would be: 'Give me the CDS whose gene field is
> > /gene="rpL31"'.
> >
> > How would I do that with BioQuery?
> >
> 
> That's a good use case. Not sure it's going to work, but let's see. 
> Note that features in biosql do not have sequences, so if you want the 
> dna sequence you need the bioentry too, or for the translation you 
> could go for the /translation tag directly. Since you want protein 
> sequence at the end, let's try to go for the feature:
> 
> 	$query = Bio::DB::Query::BioQuery->new(
>                 -datacollections => ["Bio::SeqFeatureI f", # let's define an alias
>                                      "Bio::Ontology::Term=>Bio::SeqFeatureI tt::primary_tag",
I replaced this line with the one below, because Bio::Ontology::Term was
missing from $mapper:

------------- EXCEPTION  -------------
MSG: failed to map Bio::Ontology::Term to a table
STACK Bio::DB::Query::BioQuery::translate_query
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:247
STACK Bio::DB::BioSQL::BaseDriver::translate_query
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BaseDriver.pm:1097
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_query
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1154
STACK toplevel albert_query.pl:24
--------------------------------------

"Bio::Ontology::TermI=>Bio::SeqFeatureI tt::primary_tag", 

# I changed this simply because it is the only thing in $mapper that seemed
to refer to Bio::Ontology::Term. The code runs without problems with this
change, until...
> 	                                "Bio::Annotation::SimpleValue sv", # let's define alias
> 	                                "Bio::Annotation::SimpleValue<=>Bio::SeqFeatureI"],
>                 -where => ["tt.name = 'CDS'",
> 	                      "sv.tagname = 'gene'", # or whatever the identifying tag is
> 	                      "sv.value = 'rpL31'"]); # or whatever the value for that tag is
> 
> 	$adp = $db->get_object_adaptor("Bio::SeqFeatureI");
> 	$result = $adp->find_by_query($query);

... now the problem is with Bio::Annotation::SimpleValue::primary_tag:

------------- EXCEPTION  -------------
MSG: failed to map Bio::Annotation::SimpleValue::primary_tag to a FK
STACK Bio::DB::Query::BioQuery::translate_query
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:220
STACK Bio::DB::BioSQL::BaseDriver::translate_query
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BaseDriver.pm:1097
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_query
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1154
STACK toplevel albert_query.pl:24
--------------------------------------

And here I have no idea what to change.
Any guesses?

> 	
> The translation would then be in the resulting feature(s)'s 
> 'translation' tag.
> 
> I don't know whether the above query will work - but I'd be very 
> interested in the result. I'm inclined to postulate that such queries 
> must work or be made to work

Music to my ears :-) I'm happy to know that what I'm trying to query is
logically feasible.
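
In the meantime, since fetching the complete bioentry does work, one possible
fallback is to filter its features in memory. This is only a minimal sketch:
cds_for_gene is a hypothetical helper name, and the toy features (with made-up
coordinates and translations) stand in for what $seq->get_SeqFeatures would
return on a real bioentry. It assumes the CDS carries both a /gene and a
/translation tag.

```perl
use strict;
use warnings;
use Bio::SeqFeature::Generic;

# Hypothetical helper: keep the features whose primary tag is 'CDS' and
# whose /gene tag contains the requested gene name.
sub cds_for_gene {
    my ($gene, @features) = @_;
    return grep {
        my $f = $_;
        $f->primary_tag eq 'CDS'
            && $f->has_tag('gene')
            && grep { $_ eq $gene } $f->get_tag_values('gene');
    } @features;
}

# Toy features standing in for $seq->get_SeqFeatures (values are made up).
my @features = (
    Bio::SeqFeature::Generic->new(
        -primary_tag => 'gene', -start => 10, -end => 90,
        -tag => { gene => 'rpL31' }),
    Bio::SeqFeature::Generic->new(
        -primary_tag => 'CDS', -start => 10, -end => 90,
        -tag => { gene => 'rpL31', translation => 'MKT' }),
    Bio::SeqFeature::Generic->new(
        -primary_tag => 'CDS', -start => 200, -end => 290,
        -tag => { gene => 'other', translation => 'MAA' }),
);

my @hits = cds_for_gene('rpL31', @features);
for my $cds (@hits) {
    # The protein sequence, if present, lives in the /translation tag.
    my ($protein) = $cds->has_tag('translation')
        ? $cds->get_tag_values('translation')
        : ();
    printf "%d..%d %s\n", $cds->start, $cds->end,
        defined $protein ? $protein : '(no /translation tag)';
}
```

This obviously loads every feature of the genome first, so it is a workaround
rather than a substitute for a working query.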

Thanks in advance,

Albert

-----
complete script:
---------------------------------------------------------------------------
use strict;
use warnings;
use Bio::DB::Query::BioQuery;
use Bio::DB::BioDB;
 
# Connect to the BioSQL database.
my $dbadap = Bio::DB::BioDB->new(
                                -database => 'biosql',
                                -dbname   => 'biosql',
                                -user     => 'root',
                                -pass     => 'mypassword',
                                -driver   => 'mysql');
 
# CDS features carrying the tag /gene="rpL31".
my $query = Bio::DB::Query::BioQuery->new(
    -datacollections => ["Bio::SeqFeatureI f",
                         "Bio::Ontology::TermI=>Bio::SeqFeatureI tt::primary_tag",
                         "Bio::Annotation::SimpleValue sv",
                         "Bio::Annotation::SimpleValue<=>Bio::SeqFeatureI"],
    -where => ["tt.name = 'CDS'",
               "sv.tagname = 'gene'",
               "sv.value = 'rpL31'"]);
my $objadap = $dbadap->get_object_adaptor("Bio::SeqFeatureI");
my $result = $objadap->find_by_query($query); # <= fails here
 
print $result;
 
my $seq = $result->next_object(); # <= I don't really know if this still applies.
 
print $seq;
---------------------------------------------------------------------------
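
For comparison, this is roughly the SQL I would expect such a query to boil
down to. It is only a sketch: the table and column names (seqfeature, term,
seqfeature_qualifier_value, type_term_id) are my assumption about the BioSQL
schema layout and may not match the installed schema version, and
feature_by_tag_sql is a hypothetical helper name, not part of bioperl-db.

```perl
use strict;
use warnings;

# Hypothetical helper: build the SQL and bind values for "features of a
# given type that carry a given tag/value pair". Table and column names
# are ASSUMED and may differ in your BioSQL schema version.
sub feature_by_tag_sql {
    my ($type, $tag, $value) = @_;
    my $sql = <<'SQL';
SELECT f.seqfeature_id, f.bioentry_id
FROM   seqfeature f
JOIN   term tt  ON tt.term_id = f.type_term_id
JOIN   seqfeature_qualifier_value sv ON sv.seqfeature_id = f.seqfeature_id
JOIN   term tag ON tag.term_id = sv.term_id
WHERE  tt.name = ? AND tag.name = ? AND sv.value = ?
SQL
    return ($sql, $type, $tag, $value);
}

my ($sql, @bind) = feature_by_tag_sql('CDS', 'gene', 'rpL31');
print $sql, "bind: @bind\n";
```

With a plain DBI handle this could be run via $dbh->prepare($sql) followed by
$sth->execute(@bind), but that only returns row ids, not a feature object.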

>  or otherwise the value of that query 
> system is very limited and not worth advocating other than for 
> bioperl-db internal use ...
> 
> 	-hilmar
> 
> > If it's not possible to construct such a query with BioQuery, but only
> > with SqlQuery, how could I obtain a $cds object of the same kind?
> >
> > The example from the last email (again, if I'm right), only finds
> > complete bioentries, not a specific CDS contained in a bioentry, right?
> > Because I have tried it and the $result is always empty, except if I
> > change the query to obtain the complete bioentry (by its
> > accession_number or whatever).
> >
> > So, summing up, the thing here is that for one big bioentry (a complete
> > genome, or a chromosome, etc), I want to query for its CDS's...
> >
> > Thanks in advance and sorry for the noise,
> >
> > Albert
