[BioSQL-l] CDS feature for genome bioentry
Hilmar Lapp
hlapp at gnf.org
Mon Jul 14 11:37:08 EDT 2003
On Monday, July 14, 2003, at 01:20 AM, albert vilella wrote:
>
> What I'm trying to obtain is one CDS of a bioentry, the one that has,
> for example, the field /gene="rpL31":
>
> If I have entered the complete genome of, say, Mycoplasma pneumoniae,
> this corresponds to one bioentry in the database (U00089).
>
> Now I want to obtain the aa sequence (or the complete CDS and then
> extract the sequence) that is identified by, for example,
> /gene="rpL31".
>
> This corresponds (if I'm right) to one feature in the bioentry (well,
> two seqfeatures: the gene and the corresponding CDS).
>
The CDS should have a /gene tag too, shouldn't it? Or some other tag that
identifies which gene it is the CDS for?
> So the query would be: 'Give me the CDS whose gene field is
> /gene="rpL31"'.
>
> How would I do that with BioQuery?
>
That's a good use case. Not sure it's going to work, but let's see.
Note that features in BioSQL do not have sequences, so if you want the
DNA sequence you need the bioentry too; for the translation you
could go for the /translation tag directly. Since you want the protein
sequence in the end, let's try to go for the feature:
$query = Bio::DB::Query::BioQuery->new(
    -datacollections => [
        "Bio::SeqFeatureI f",                 # let's define an alias
        "Bio::Ontology::Term=>Bio::SeqFeatureI tt::primary_tag",
        "Bio::Annotation::SimpleValue sv",    # let's define an alias
        "Bio::Annotation::SimpleValue<=>Bio::SeqFeatureI"],
    -where => [
        "tt.name = 'CDS'",
        "sv.tagname = 'gene'",     # or whatever the identifying tag is
        "sv.value = 'rpL31'"]);    # or whatever the value for that tag is
$adp = $db->get_object_adaptor("Bio::SeqFeatureI");
$result = $adp->find_by_query($query);
The translation would then be in the resulting feature(s)'s
'translation' tag.
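As a minimal sketch of reading that tag, assuming BioPerl is installed: the helper name translations_of is made up for illustration, and the feature is built by hand as a stand-in for whatever the query returns (the coordinates and the 'MKV' value are placeholders, not real rpL31 data). Only has_tag() and get_tag_values() from Bio::SeqFeatureI are relied on.

```perl
use strict;
use warnings;
use Bio::SeqFeature::Generic;

# Hypothetical helper: collect the /translation value(s) of every
# feature passed in, skipping features that carry no such tag.
sub translations_of {
    my @features = @_;
    my @aa;
    for my $feat (@features) {
        next unless $feat->has_tag('translation');
        push @aa, $feat->get_tag_values('translation');
    }
    return @aa;
}

# Stand-in for a CDS feature retrieved from the database; the values
# below are placeholders chosen only to make the sketch runnable.
my $cds = Bio::SeqFeature::Generic->new(
    -start       => 1,
    -end         => 9,
    -strand      => 1,
    -primary_tag => 'CDS',
    -tag         => { gene => 'rpL31', translation => 'MKV' },
);

print "$_\n" for translations_of($cds);
```

With a real query result one would presumably pull the features out first, for instance with a loop over $result->next_object (assuming the result object provides that method, as bioperl-db query results normally do), and pass them to the helper.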
I don't know whether the above query will work, but I'd be very
interested in the result. I'm inclined to postulate that such queries
must work, or be made to work; otherwise the value of that query
system is very limited and not worth advocating other than for
bioperl-db internal use ...
-hilmar
> If it's not possible to construct such query with BioQuery, but either
> using SqlQuery, how could I obtain a $cds object of the same kind?
>
> The example from the last email (again, if I'm right), only finds
> complete bioentries, not a specific CDS contained in a bioentry, right?
> Because I have tried it and the $result is always empty, except if I
> change the query to obtain the complete bioentry (by its
> accession_number or whatever).
>
> So, summing up, the thing here is that for one big bioentry (a complete
> genome, or a chromosome, etc), I want to query for its CDS's...
>
> Thanks in advance and sorry for the noise,
>
> Albert
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at open-bio.org
> http://open-bio.org/mailman/listinfo/biosql-l
>
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------