[BioSQL-l] Re: BioSQL-l Digest, Vol 7, Issue 5

albert vilella vilella at bio.ub.es
Mon Jul 14 09:19:21 EDT 2003


> > One example is if I'm trying to get the amino acid sequence of a CDS
> > (located as a seqfeature_qualifier_value of type 15) by its gene
> > qualifier (like /gene="rpL31"). How would that query be constructed?
> 
> In this case you want to *obtain* the amino acid sequence, not 
> *constrain* by it, right? Assuming that rpL31 is the display_id of the 
> sequence object, you'd do
> 
> 	$query = Bio::DB::Query::BioQuery->new(
> 	                 -datacollections => ["Bio::SeqI seq"],
> 	                 -where           => ["seq.display_id = 'rpL31'"]);
> 	$result = $objadap->find_by_query($query);
> 	while(my $seq = $result->next_object()) {
> 		print $seq->accession_number,"\t",$seq->description,"\n";
> 		foreach my $cds (grep { $_->primary_tag eq 'CDS'; } $seq->get_SeqFeatures()) {
> 			my $aaseq = $cds->spliced_seq->translate();
> 			# or write to a fasta SeqIO output stream
> 			print $aaseq->seq(),"\n";
> 		}
> 	}

What I'm trying to obtain is one CDS of a bioentry, the one that has, 
for example, the qualifier /gene="rpL31".

If I have entered the complete genome of, say, Mycoplasma pneumoniae,
this corresponds to one bioentry in the database (U00089).

Now I want to obtain the aa sequence (or the complete CDS and then
extract the sequence) that is identified by, for example, /gene="rpL31".

This corresponds (if I'm right) to one feature in the bioentry (well,
two seqfeatures: the gene and the corresponding CDS).

So the query would be: 'Give me the CDS whose gene field is
/gene="rpL31"'.

How would I do that with BioQuery?

If it's not possible to construct such a query with BioQuery, but it is
with SqlQuery, how could I obtain a $cds object of the same kind?

The example from the last email (again, if I'm right) only finds
complete bioentries, not a specific CDS contained in a bioentry, right?
I have tried it and the $result is always empty, unless I change the
query to obtain the complete bioentry (by its accession_number or
whatever).
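
To make the request concrete, here is a minimal sketch of the
client-side filtering that is possible once a complete bioentry has been
loaded: walk its features and keep the CDS whose /gene qualifier
matches. It is only an illustration under assumptions. The toy sequence,
the feature coordinates and the helper name cds_by_gene are made up, and
the sketch builds the entry in memory instead of fetching it from the
database. It assumes that features retrieved through bioperl-db answer
the usual Bio::SeqFeatureI methods (has_tag, get_tag_values) in the same
way.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;

# Toy stand-in for a bioentry loaded from the database (made-up data).
my $seq = Bio::Seq->new(
    -display_id => 'toy_entry',
    -alphabet   => 'dna',
    -seq        => 'ATGGCTAAATAAATGCCCGGGTAA',
);

# Two CDS features, each carrying a /gene qualifier.
$seq->add_SeqFeature(
    Bio::SeqFeature::Generic->new(
        -primary_tag => 'CDS', -start => 1, -end => 12, -strand => 1,
        -tag         => { gene => 'rpL31' },
    ),
    Bio::SeqFeature::Generic->new(
        -primary_tag => 'CDS', -start => 13, -end => 24, -strand => 1,
        -tag         => { gene => 'other' },
    ),
);

# Hypothetical helper: return the CDS features of $entry whose /gene
# qualifier equals $gene.
sub cds_by_gene {
    my ($entry, $gene) = @_;
    my @hits;
    for my $feat ($entry->get_SeqFeatures()) {
        next unless $feat->primary_tag eq 'CDS' && $feat->has_tag('gene');
        push @hits, $feat
            if grep { $_ eq $gene } $feat->get_tag_values('gene');
    }
    return @hits;
}

# Translate each matching CDS and print the protein sequence.
for my $cds (cds_by_gene($seq, 'rpL31')) {
    my $aaseq = $cds->spliced_seq->translate();
    print $aaseq->seq(), "\n";
}
```

This still loads every feature of the bioentry before filtering, which
is exactly the cost a query constrained on the qualifier would avoid.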

So, summing up: for one big bioentry (a complete genome, a chromosome,
etc.), I want to query for individual CDSs.

Thanks in advance and sorry for the noise,

Albert


