MeSH terms in bioperl RE: [Bioperl-l] (no subject)

Heikki Lehvaslaiho heikki at ebi.ac.uk
Mon Jul 28 07:11:00 EDT 2003


Christophe,

OK, the preliminary modules are now in the main trunk.

I decided to use the NLM server as it gives a richer selection of
options. For now, the implementation lets you retrieve entries only by
UID or by exact term.

If there is interest, it would be easy enough to extend it to find terms
based on substrings, which might be useful. Another useful extension
would be to parse synonyms ('Entry Term') from the output.

Since a term can appear in more than one position in the MeSH tree
hierarchy, a retrieved term has one or more 'twigs' listing the parent,
sister and child terms that define its role. Any of these other term
strings can then be used to navigate the tree.


 use Bio::DB::MeSH;

 my $meshdb = Bio::DB::MeSH->new();

 # fetch a Bio::Phenotype::MeSH object by its exact heading
 my $term = $meshdb->get_exact_term('Down Syndrome');
 print $term->description, "\n";

 # array of term strings
 my @parents = $term->each_parent();
 # array of Bio::Phenotype::MeSH::Twig objects
 my @roles = $term->each_twig();


Enjoy,

	-Heikki






On Fri, 2003-07-25 at 12:11, Christophe Bouvard wrote:
> > Christophe,
> >
> > There is nothing in Bioperl that could do it.
> >
> > However, adding it is not too difficult. Emulating the MeSH browser page
> > at http://www.nlm.nih.gov/mesh/MBrowser.html is easy.
> >
> > Is this the only public server?
> 
> You can access MeSH either through NCBI's Entrez or through the MeSH browser.
> 
> For butter:
> http://www.nlm.nih.gov/cgi/mesh/2003/MB_cgi?field=uid&term=D002079
> http://www.ncbi.nih.gov/entrez/query.fcgi?cmd=Text&db=mesh&uid=68002079
> 
> 
> So I can write a Perl program that reads and parses the page located at the
> URL
> http://www.nlm.nih.gov/cgi/mesh/2003/MB_cgi?field=uid&term=XXXXXXX (where
> XXXXXXX is the unique MeSH ID).
> For instance, let's have a look at the source of the page
> http://www.nlm.nih.gov/cgi/mesh/2003/MB_cgi?field=uid&term=D002079 :
> 
> 8<-------------------------------------------------------------------------
> [...]
> <TR><TH align=left>MeSH Heading</TH><TD>Butter</TD></TR>
> <TR><TH align=left>Tree Number</TH><TD><A
> HREF="#TreeD10.516.212.302.199">D10.516.212.302.199</A></TD></TR>
> <TR><TH align=left>Tree Number</TH><TD><A
> HREF="#TreeJ02.500.350.100">J02.500.350.100</A></TD></TR>
> <TR><TH align=left>Tree Number</TH><TD><A
> HREF="#TreeJ02.500.375.200">J02.500.375.200</A></TD></TR>
> <TR><TH align=left>Annotation</TH><TD>a dairy product & dietary fat; <A
> href="/cgi/mesh/2003/MB_cgi?term=MARGARINE">
> MARGARINE</A> is also available</TD></TR>
> <TR><TH align=left>Scope Note</TH><TD>The fatty portion of milk, separated
> as a soft yellowish solid when milk or cream is churned. It is processed for
> cooking and table use. (Random House Unabridged Dictionary, 2d ed)</TD></TR>
> [...]
> <TR><TH align=left>CAS Type 1 Name</TH><TD>Butter</TD></TR>
> <TR><TH align=left>Registry Number</TH><TD>8029-34-3</TD></TR>
> <TR><TH align=left>Unique ID</TH><TD>D002079</TD></TR>
> [...]
> ------------------------------------------------------------------------->8
> 
> It is easy to parse this kind of simple HTML page.
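
A minimal sketch of that kind of parsing, using only rows copied from the
page source quoted above. It is an illustration, not the code in
Bio::DB::MeSH: parse_mesh_rows is a made-up name, and the regular
expression assumes every field sits in a single
<TR><TH>...</TH><TD>...</TD></TR> row, as in the sample. A real page may
need a proper HTML parser.

```perl
use strict;
use warnings;

# Sample rows copied from the page source quoted above.
my $html = <<'HTML';
<TR><TH align=left>MeSH Heading</TH><TD>Butter</TD></TR>
<TR><TH align=left>Tree Number</TH><TD><A
HREF="#TreeD10.516.212.302.199">D10.516.212.302.199</A></TD></TR>
<TR><TH align=left>Tree Number</TH><TD><A
HREF="#TreeJ02.500.350.100">J02.500.350.100</A></TD></TR>
<TR><TH align=left>Unique ID</TH><TD>D002079</TD></TR>
HTML

# parse_mesh_rows() is a hypothetical helper, not part of Bioperl.
# It returns a hash ref mapping each field name to an array ref of values,
# because a field such as 'Tree Number' can occur several times.
sub parse_mesh_rows {
    my ($page) = @_;
    my %field;
    while ($page =~ m{<TR><TH[^>]*>(.*?)</TH><TD>(.*?)</TD></TR>}gis) {
        my ($name, $value) = ($1, $2);
        $value =~ s/<[^>]+>//g;      # drop nested tags such as <A HREF=...>
        $value =~ s/\s+/ /g;         # collapse line breaks and runs of spaces
        $value =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
        push @{ $field{$name} }, $value;
    }
    return \%field;
}

my $entry = parse_mesh_rows($html);
print "Heading: ", $entry->{'MeSH Heading'}[0], "\n";
print "Tree:    $_\n" for @{ $entry->{'Tree Number'} };
print "UID:     ", $entry->{'Unique ID'}[0], "\n";
```

Keeping every value in an array means the repeated 'Tree Number' rows are
all preserved instead of the last one overwriting the others.
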
> 
> Or I can write a program that reads and parses NCBI's Entrez page
> located at the address
> http://www.ncbi.nih.gov/entrez/query.fcgi?cmd=Text&db=mesh&uid=XXXXXXX
> (where XXXXXXX is the unique NCBI ID and NOT the unique MeSH ID).
> For example, this is the source of the page
> http://www.ncbi.nih.gov/entrez/query.fcgi?cmd=Text&db=mesh&uid=68002079
> (Butter):
> 
> 8<-------------------------------------------------------------------------
> <pre>
> 
> 1: Butter
> The fatty portion of milk, separated as a soft yellowish solid when milk or
> cream is churned. It is processed for cooking and table use. (Random House
> Unabridged Dictionary, 2d ed)</pre>
> ------------------------------------------------------------------------->8
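
The Entrez text output can be handled in a similar way. The sketch below
works on the text exactly as quoted above and makes two assumptions: the
heading always follows a leading "<number>: " and the scope note runs
from the next line up to the closing </pre>. parse_entrez_text is a
hypothetical name, not a Bioperl function.

```perl
use strict;
use warnings;

# Text copied from the Entrez page source quoted above.
my $text = <<'TEXT';
<pre>

1: Butter
The fatty portion of milk, separated as a soft yellowish solid when milk or
cream is churned. It is processed for cooking and table use. (Random House
Unabridged Dictionary, 2d ed)</pre>
TEXT

# Hypothetical helper: returns a hash ref with the heading and the scope
# note, or nothing if the text does not match the assumed layout.
sub parse_entrez_text {
    my ($page) = @_;
    my ($heading, $note) = $page =~ m{^\d+:\s*(.+?)\s*\n(.*?)</pre>}ms
        or return;
    $note =~ s/\s+/ /g;          # rejoin the hard-wrapped lines
    $note =~ s/^\s+|\s+$//g;     # trim leading and trailing whitespace
    return { heading => $heading, scope_note => $note };
}

my $record = parse_entrez_text($text);
print $record->{heading}, "\n", $record->{scope_note}, "\n";
```

Note that this output carries only the heading and scope note, so tree
numbers and annotations would still have to come from the MeSH browser
page.
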
> 
> >
> > The main problem is what to do with the result. We need to write a class
> > that captures the returned information in a sensible way. Storing the id
> > (D002079), heading ('Butter') and annotation ('a dairy product & dietary
> > fat; MARGARINE is also available') is straightforward.
> 
> > What is definition in this context?
> 
> The scope note.
> 
> 
> > All terms can belong to many trees. How should that information be
> > captured?
> >
> > How would you like to query MeSH data in addition to ID? Term? Tree
> > number?
> 
> Actually, I only need to query MeSH by ID to get the scope note. But I
> think Bioperl should provide the ability to search by MeSH ID or by
> keywords in the annotation and scope note.
> 
> Kind regards,
> 
> Christophe
> 
> PS: I'm not very keen on butter, but it was the first example that came
> to mind...
> 
> >
> >
> > 	-Heikki
> >
> > On Fri, 2003-07-25 at 10:01, Christophe Bouvard wrote:
> > > Hello,
> > >
> > > I am looking for a Perl module that retrieves information from the
> > > MeSH database.
> > > I had a look in the Bioperl documentation but I did not find an
> > > appropriate module.
> > > So, does Bioperl provide this feature?
> > > For instance, with a unique MeSH identifier (such as D002079), how
> > > can I get the MeSH heading, the definition and the annotations?
> > > Thank you!
> > >
> > > Regards,
> > >
> > > Christophe
> > >
> > --
> > ______ _/      _/_____________________________________________________
> >       _/      _/                      http://www.ebi.ac.uk/mutations/
> >      _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
> >     _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
> >    _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
> >   _/  _/  _/  Cambs. CB10 1SD, United Kingdom
> >      _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
> > ___ _/_/_/_/_/________________________________________________________


