MeSH terms in bioperl RE: [Bioperl-l] (no subject)
Hilmar Lapp
hlapp at gnf.org
Mon Jul 28 20:48:07 EDT 2003
If I'm not mistaken, MeSH is an ontology. I can imagine it would be pretty
useful to pull MeSH down into Bio::Ontology objects. For simple queries
you usually wouldn't pull down the entire graph, though, so the results
would just be Bio::Ontology::Term objects. Does that make sense, Heikki?
Is there also a complete download for which one could write a parser
under Bio::OntologyIO?
-hilmar
BTW just for consistent style, we said a while ago that we'd deprecate
the each_XXX style in favor of get_XXXXs()/add_XXXX()/remove_XXXXs().
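
To make that naming convention concrete, here is a minimal sketch. It is not
the actual Bio::Phenotype::MeSH code: the package name My::MeSH::Term, its
internal 'parents' array and the placeholder term strings are all hypothetical,
and only the get_XXXXs()/add_XXXX()/remove_XXXXs() pattern is taken from the
note above.

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only; not part of Bioperl.
package My::MeSH::Term;

sub new {
    my ($class) = @_;
    return bless { parents => [] }, $class;
}

# add_XXXX(): append one or more items, return the new count
sub add_parent {
    my ($self, @terms) = @_;
    push @{ $self->{parents} }, @terms;
    return scalar @{ $self->{parents} };
}

# get_XXXXs(): return all items as a list (the replacement for each_XXX)
sub get_parents {
    my ($self) = @_;
    return @{ $self->{parents} };
}

# remove_XXXXs(): empty the collection, return what was removed
sub remove_parents {
    my ($self) = @_;
    my @old = @{ $self->{parents} };
    $self->{parents} = [];
    return @old;
}

package main;

my $term = My::MeSH::Term->new;
$term->add_parent('Parent Term A', 'Parent Term B');   # placeholder strings
print join(', ', $term->get_parents), "\n";
my @removed = $term->remove_parents;
print scalar(@removed), " removed, ", scalar($term->get_parents), " left\n";
```

The getter returns a plain list, so callers that used each_parent() in list
context would only need to change the method name.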
On Monday, July 28, 2003, at 04:10 AM, Heikki Lehvaslaiho wrote:
> Christophe,
>
> OK, the preliminary modules are now in the main trunk.
>
> I decided to use the NLM server as it gives a richer selection of
> options. The implementation lets you retrieve entries only by UID or
> by exact term.
>
> If there is interest, it would be easy enough to extend it to find terms
> based on substrings, which might be useful. Another useful extension
> would be to parse synonyms ('Entry Term') from the output.
>
> Since a term can appear in more than one position in the MeSH tree
> hierarchy, a retrieved term has one or more 'twigs' listing parent,
> sister and child terms that define its role. Any of these other term
> strings can then be used to navigate the tree.
>
>
> use strict;
> use warnings;
> use Bio::DB::MeSH;
>
> # create the database handle
> my $meshdb = Bio::DB::MeSH->new();
> # fetch a Bio::Phenotype::MeSH object
> my $term = $meshdb->get_exact_term('Down Syndrome');
> print $term->description, "\n";
>
> my @parents = $term->each_parent(); # array of term strings
> my @roles = $term->each_twig(); # array of Bio::Phenotype::MeSH::Twig objects
>
>
> Enjoy,
>
> -Heikki
>
> On Fri, 2003-07-25 at 12:11, Christophe Bouvard wrote:
>>> Christophe,
>>>
>>> There is nothing in Bioperl that could do it.
>>>
>>> However, adding it is not too difficult. Emulating the MeSH browser page
>>> at http://www.nlm.nih.gov/mesh/MBrowser.html is easy.
>>>
>>> Is this the only public server?
>>
>> You can access MeSH either through NCBI's Entrez or through the MeSH
>> browser.
>>
>> For butter:
>> http://www.nlm.nih.gov/cgi/mesh/2003/MB_cgi?field=uid&term=D002079
>> http://www.ncbi.nih.gov/entrez/query.fcgi?cmd=Text&db=mesh&uid=68002079
>>
>>
>> So I can write a Perl program that reads and parses the page located at
>> the URL http://www.nlm.nih.gov/cgi/mesh/2003/MB_cgi?field=uid&term=XXXXXXX
>> (where XXXXXXX is the unique MeSH ID).
>> For instance, let's have a look at the source of the page
>> http://www.nlm.nih.gov/cgi/mesh/2003/MB_cgi?field=uid&term=D002079 :
>>
>> 8<-------------------------------------------------------------------------
>> [...]
>> <TR><TH align=left>MeSH Heading</TH><TD>Butter</TD></TR>
>> <TR><TH align=left>Tree Number</TH><TD><A HREF="#TreeD10.516.212.302.199">D10.516.212.302.199</A></TD></TR>
>> <TR><TH align=left>Tree Number</TH><TD><A HREF="#TreeJ02.500.350.100">J02.500.350.100</A></TD></TR>
>> <TR><TH align=left>Tree Number</TH><TD><A HREF="#TreeJ02.500.375.200">J02.500.375.200</A></TD></TR>
>> <TR><TH align=left>Annotation</TH><TD>a dairy product & dietary fat; <A href="/cgi/mesh/2003/MB_cgi?term=MARGARINE">
>> MARGARINE</A> is also available</TD></TR>
>> <TR><TH align=left>Scope Note</TH><TD>The fatty portion of milk, separated
>> as a soft yellowish solid when milk or cream is churned. It is processed for
>> cooking and table use. (Random House Unabridged Dictionary, 2d ed)</TD></TR>
>> [...]
>> <TR><TH align=left>CAS Type 1 Name</TH><TD>Butter</TD></TR>
>> <TR><TH align=left>Registry Number</TH><TD>8029-34-3</TD></TR>
>> <TR><TH align=left>Unique ID</TH><TD>D002079</TD></TR>
>> [...]
>> ------------------------------------------------------------------------->8
>>
>> It is easy to parse this kind of simple HTML page.
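
A rough sketch of the kind of parsing described here, using a few of the rows
quoted above as sample input. The helper name parse_mesh_rows is hypothetical,
and the regular expression assumes the 2003 MB_cgi layout shown in this thread
(one <TR><TH>label</TH><TD>value</TD></TR> per field); it is not a general
HTML parser and is not necessarily how Bio::DB::MeSH does it.

```perl
use strict;
use warnings;

# Sample rows copied from the page source quoted in this thread (abbreviated).
my $html = <<'HTML';
<TR><TH align=left>MeSH Heading</TH><TD>Butter</TD></TR>
<TR><TH align=left>Tree Number</TH><TD><A
HREF="#TreeD10.516.212.302.199">D10.516.212.302.199</A></TD></TR>
<TR><TH align=left>Tree Number</TH><TD><A
HREF="#TreeJ02.500.350.100">J02.500.350.100</A></TD></TR>
<TR><TH align=left>Unique ID</TH><TD>D002079</TD></TR>
HTML

# parse_mesh_rows is a hypothetical helper name.
# Returns a hash ref: field label => array ref of values,
# because a label such as 'Tree Number' can occur several times.
sub parse_mesh_rows {
    my ($page) = @_;
    my %field;
    while ($page =~ m{<TR><TH[^>]*>(.*?)</TH><TD>(.*?)</TD></TR>}gis) {
        my ($label, $value) = ($1, $2);
        $value =~ s/<[^>]+>//g;      # drop inline tags such as <A ...>
        $value =~ s/\s+/ /g;         # collapse wrapped whitespace
        $value =~ s/^\s+|\s+$//g;    # trim both ends
        push @{ $field{$label} }, $value;
    }
    return \%field;
}

my $rec = parse_mesh_rows($html);
print "Heading: $rec->{'MeSH Heading'}[0]\n";
print "ID:      $rec->{'Unique ID'}[0]\n";
print "Tree:    $_\n" for @{ $rec->{'Tree Number'} };
```

Keeping every value in an array avoids losing the second and third tree
numbers when the same label repeats.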
>>
>> Or I can write a program that reads and parses NCBI's Entrez page
>> located at the address
>> http://www.ncbi.nih.gov/entrez/query.fcgi?cmd=Text&db=mesh&uid=XXXXXXX
>> (where XXXXXXX is the unique NCBI ID and NOT the unique MeSH ID).
>> For example, this is the source of the page
>> http://www.ncbi.nih.gov/entrez/query.fcgi?cmd=Text&db=mesh&uid=68002079
>> (Butter):
>>
>> 8<-------------------------------------------------------------------------
>> <pre>
>>
>> 1: Butter
>> The fatty portion of milk, separated as a soft yellowish solid when milk or
>> cream is churned. It is processed for cooking and table use. (Random House
>> Unabridged Dictionary, 2d ed)</pre>
>> ------------------------------------------------------------------------->8
>>
>>>
>>> The main problem is what to do with the result. We need to write a class
>>> that captures the returned information in a sensible way. Storing the id
>>> (D002079), heading ('Butter') and annotation ('a dairy product & dietary
>>> fat; MARGARINE is also available') is straightforward.
>>
>>> What is definition in this context?
>>
>> The scope note.
>>
>>
>>> All terms can belong to many trees. How should that information be
>>> captured?
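
One possible way to capture several tree positions for a single term is
sketched below, using the Butter tree numbers quoted earlier in this thread.
It relies on MeSH tree numbers being dot-separated paths, so dropping the last
segment gives the tree number of the parent node. The helper name tree_parent
and the hash layout are hypothetical, not an existing Bioperl interface.

```perl
use strict;
use warnings;

# tree_parent is a hypothetical helper name.
# Returns the parent tree number, or undef when there is no dot left to strip.
sub tree_parent {
    my ($tree_number) = @_;
    return unless $tree_number =~ /\./;
    (my $parent = $tree_number) =~ s/\.[^.]+$//;   # drop the last segment
    return $parent;
}

# One term, several positions: keep every tree number with the term.
my %term = (
    id           => 'D002079',
    heading      => 'Butter',
    tree_numbers => [qw(D10.516.212.302.199 J02.500.350.100 J02.500.375.200)],
);

for my $tn (@{ $term{tree_numbers} }) {
    my $parent = tree_parent($tn);
    printf "%-22s parent: %s\n", $tn, defined $parent ? $parent : '(top level)';
}
```

Storing the raw tree numbers keeps the record small; parents, sisters and
children can then be looked up by tree number when needed.
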
>>>
>>> How would you like to query MeSH data in addition to ID? Term? Tree
>>> number?
>>
>> Actually, I just want to query MeSH by ID to get the scope note. But I
>> think Bioperl should support searching by MeSH ID or by keywords in the
>> annotation and scope note.
>>
>> Kind regards,
>>
>> Christophe
>>
>> PS: I'm not very keen on butter, but it's the first example that came to
>> mind...
>>
>>>
>>>
>>> -Heikki
>>>
>>> On Fri, 2003-07-25 at 10:01, Christophe Bouvard wrote:
>>>> Hello,
>>>>
>>>> I am looking for a Perl module that retrieves information from the MeSH
>>>> database.
>>>> I had a look at the Bioperl documentation but did not find an
>>>> appropriate module.
>>>> So, does Bioperl provide this feature?
>>>> For instance, given a unique MeSH identifier (such as D002079), how can I
>>>> get the MeSH heading, the definition and the annotations?
>>>> Thank you!
>>>>
>>>> Regards,
>>>>
>>>> Christophe
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at portal.open-bio.org
>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>> --
>>> Heikki Lehvaslaiho heikki_at_ebi ac uk
>>> http://www.ebi.ac.uk/mutations/
>>> EMBL Outstation, European Bioinformatics Institute
>>> Wellcome Trust Genome Campus, Hinxton
>>> Cambs. CB10 1SD, United Kingdom
>>> Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
>
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------