[BioSQL-l] CDS feature for genome bioentry

Hilmar Lapp hlapp at gnf.org
Mon Jul 14 11:37:08 EDT 2003


On Monday, July 14, 2003, at 01:20  AM, albert vilella wrote:

>
> What I'm trying to obtain is one CDS of a bioentry, the one that has,
> for example, the field /gene="rpL31":
>
> If I have entered the complete genome of, say, Mycoplasma pneumoniae,
> this corresponds to one bioentry in the database (U00089).
>
> Now I want to obtain the aa sequence (or the complete CDS and then
> extract the sequence) that is identified by, for example, 
> /gene="rpL31".
>
> This corresponds (if I'm right) to one feature in the bioentry (well,
> two seqfeatures: the gene and the corresponding CDS).
>

The CDS should have a /gene tag too, shouldn't it? Or some other tag that 
identifies which gene it is the CDS for?


> So the query would be: 'Give me the CDS whose gene field is
> /gene="rpL31"'.
>
> How would I do that with BioQuery?
>

That's a good use case. I'm not sure it's going to work, but let's see. 
Note that features in BioSQL do not have sequences of their own, so if 
you want the DNA sequence you need the bioentry too; for the translation 
you can go to the /translation tag directly. Since you want the protein 
sequence in the end, let's query for the feature:

	$query = Bio::DB::Query::BioQuery->new(
		-datacollections => [
			"Bio::SeqFeatureI f",              # let's define an alias
			"Bio::Ontology::Term=>Bio::SeqFeatureI tt::primary_tag",
			"Bio::Annotation::SimpleValue sv", # let's define an alias
			"Bio::Annotation::SimpleValue<=>Bio::SeqFeatureI"],
		-where => [
			"tt.name = 'CDS'",
			"sv.tagname = 'gene'",  # or whatever the identifying tag is
			"sv.value = 'rpL31'"]); # or whatever the value for that tag is

	$adp = $db->get_object_adaptor("Bio::SeqFeatureI");
	$result = $adp->find_by_query($query);
	
The translation would then be in the resulting feature(s)'s 
'translation' tag.
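
Pulling the translation out of whatever the query returns could look 
roughly like the sketch below. It is untested against a live database 
and rests on assumptions: that the result object offers next_object() 
(as Bio::DB::Query::QueryResultI does) and that the features answer to 
the Bio::SeqFeatureI methods has_tag() and get_tag_values() (older 
BioPerl releases spell the latter each_tag_value()). The helper names 
translations_of and collect_translations are made up for illustration.

```perl
use strict;
use warnings;

# Return the /translation values of one feature, or an empty list if the
# feature carries no such tag. Works on any object that provides the
# Bio::SeqFeatureI methods has_tag() and get_tag_values().
sub translations_of {
    my ($feature) = @_;
    return () unless $feature->has_tag('translation');
    return $feature->get_tag_values('translation');
}

# Drain a query result and collect every translation found.
# Assumption: the result object offers next_object(), which returns
# the next feature or undef once the result is exhausted.
sub collect_translations {
    my ($result) = @_;
    my @proteins;
    while (my $feature = $result->next_object) {
        push @proteins, translations_of($feature);
    }
    return @proteins;
}
```

With the query above this would be called as 
collect_translations($adp->find_by_query($query)). An empty list means 
either that no feature matched or that the matching CDS has no 
/translation tag.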

I don't know whether the above query will work, but I'd be very 
interested in the result. I'm inclined to postulate that such queries 
must work, or be made to work. Otherwise the value of the query system 
is very limited, and it is not worth advocating for anything other than 
bioperl-db internal use ...

	-hilmar

> If it's not possible to construct such a query with BioQuery, but only
> with SqlQuery, how could I obtain a $cds object of the same kind?
>
> The example from the last email (again, if I'm right), only finds
> complete bioentries, not a specific CDS contained in a bioentry, right?
> Because I have tried it and the $result is always empty, except if I
> change the query to obtain the complete bioentry (by its
> accession_number or whatever).
>
> So, summing up, the thing here is that for one big bioentry (a complete
> genome, or a chromosome, etc), I want to query for its CDS's...
>
> Thanks in advance and sorry for the noise,
>
> Albert
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


