[Bioperl-l] Finding all bioactive substances through EUtils or PUG_SOAP

bar tomas bartomas at gmail.com
Sat Jul 18 18:38:11 UTC 2009

Dear Chris,
Thank you again for your helpful reply and your code.
I've been trying to find a way to extend your BioPerl code to be able to
retrieve the NCBI Taxonomy db IDs of the species in which the bioactive
compounds are found.
(The query I'm interested in is to find bioactive compounds that occur in
natural organisms. I'd like to identify the species in which the natural
compounds are found.)
I've looked in the web page you mention in your mail (
and have found a linking filter, pcassay_taxonomy, for the bioassay
database, but I think(?) that this does not refer to the taxonomy of the
species in which the active screened compound is found.
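
To make the question concrete, here is a minimal sketch of how the
pcassay_taxonomy link could be followed with Bio::DB::EUtilities, in the same
style as the code quoted below. The helper names taxonomy_link_params and main
are made up for illustration. It is an unverified assumption that this link
says anything about the organism a compound occurs in; it may only describe
the assay itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the elink parameters for a list of pcassay IDs (AIDs).
# 'pcassay_taxonomy' is the linking filter mentioned above; what it actually
# links to (assay organism vs. compound source) is not established here.
sub taxonomy_link_params {
    my @aids = @_;
    return (
        -eutil          => 'elink',
        -dbfrom         => 'pcassay',
        -db             => 'taxonomy',
        -linkname       => 'pcassay_taxonomy',
        -correspondence => 1,          # one link set per submitted AID
        -id             => [@aids],
    );
}

# Query NCBI and print the taxonomy IDs linked to each assay.
sub main {
    my @aids = @_;
    require Bio::DB::EUtilities;       # loaded lazily; needs BioPerl and network access
    my $factory = Bio::DB::EUtilities->new(taxonomy_link_params(@aids));
    while (my $linkset = $factory->next_LinkSet) {
        printf "AID %s -> taxonomy IDs: %s\n",
            join(',', $linkset->get_submitted_ids),
            join(',', $linkset->get_ids);
    }
}

# Run only when assay IDs are given on the command line.
main(@ARGV) if @ARGV;
```

The script would be called with one or more assay IDs as arguments, for
example the IDs returned by the esearch step in the code quoted below.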

Do you know if it is possible to retrieve the link between a natural
compound and the species in which the compound can be found?

Thanks very much for any help or hints.

(Sorry if this email is a bit misplaced on this discussion list, as it is not
really specific to BioPerl, although I'm trying to implement it using
BioPerl tools. I have not been able to find a general discussion list about
querying Entrez databases that is not specific to any particular programming
language.)

Thanks again

Tomas Bar

On 7/15/09, Chris Fields <cjfields at illinois.edu> wrote:
> Tomas B.,
> Just so you know, this isn't really a bioperl-specific question, though you
> may be able to use bioperl tools to do what you want.  I'll run with the
> latter assumption.
> I'm not too familiar with pubchem and related, but using einfo you can get
> relevant information on the databases.  The available databases are:
> pcassay
> pccompound
> pcsubstance
> Lots of filters available, summarized here:
> http://pubchem.ncbi.nlm.nih.gov/help.html#PubChem_index
> My guess is you would have to query the database pcassay with esearch and
> the appropriate filter to find the IDs active for a particular assay, then
> use elink from pcassay to either pccompound or pcsubstance to get what you
> want.
> Using Bio::DB::EUtilities (below), this worked to get the compound IDs. You
> could probably get more information using esummary (not sure if you can
> retrieve all info on them).
> chris
> ==========================================
> #!/usr/bin/perl
> use strict;
> use warnings;
> use Bio::DB::EUtilities;
> # step 1: search pcassay for assays matching the term
> my $term = '"Luciferase Profiling Assay"';
> my $factory = Bio::DB::EUtilities->new(-eutil   => 'esearch',
>                                       -db      => 'pcassay',
>                                       -term    => $term,
>                                       -verbose => 1,
>                                       -retmax  => 100);
> my @ids = $factory->get_ids;
> die "No assays found for $term\n" unless @ids;
> # step 2: follow the links from those assays to their active compounds
> # note the linkname, can use same for pcsubstance
> $factory->reset_parameters(-eutil       => 'elink',
>                           -db          => 'pccompound',
>                           -dbfrom      => 'pcassay',
>                           -linkname    => 'pcassay_pccompound_active',
>                           -id          => \@ids);
> # dump everything elink returned (linked compound IDs)
> $factory->print_all;
> ==========================================
> chris
> On Jul 15, 2009, at 8:40 AM, bar tomas wrote:
>> Hi,
>> Could you give me a hint on how to query Entrez databases to find all
>> substances that have been found to be bioactive through a bioassay
>> screening.
>> I've looked at the WSDL file for querying PubChem
>> (http://pubchem.ncbi.nlm.nih.gov/pug_soap/pug_soap.cgi?wsdl) but have found
>> no service for retrieving substance IDs.
>> Is there a way to do this with EUtils or a http query with parameters ?
>> Thanks a lot.
>> Tomas B.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
