[Bioperl-l] PubMed records (was: MeSH terms)

Sat Oct 24 18:45:01 UTC 2009

 <alsaplayer-devel at lists.tartarus.org>
I'm not sure whether this is related to the MeSH question, but
I've googled the documentation several times and never managed to find
"robust" examples of how to manipulate PubMed records.

It would seem that there ought to be code lying around which does:
  Given a GenBank ID,
     Fetch all PubMed records linked from that ID
         Fetch all related records (via NCBI's "related" record IDs)
     Purge the list of duplicates, then do things like fetch all of the
     abstracts or all of the MeSH headings, etc. for all of those records.
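
The steps above can be sketched against NCBI's E-utilities. This is a minimal
sketch, not a BioPerl recipe: it assumes the GenBank ID refers to a nucleotide
record (database "nuccore"), that the ELink link names "nuccore_pubmed" and
"pubmed_pubmed" are the ones wanted, and that responses are requested as XML.
The function names are hypothetical, and the actual HTTP requests are left to
the caller.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base URL of NCBI's E-utilities (assumed current endpoint).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def elink_url(ids, dbfrom, db, linkname):
    """Build an ELink URL for records in `db` linked from `ids` in `dbfrom`."""
    params = {
        "dbfrom": dbfrom,
        "db": db,
        "linkname": linkname,
        "id": ",".join(str(i) for i in ids),
    }
    return f"{EUTILS}/elink.fcgi?{urlencode(params, safe=',')}"


def efetch_url(pmids):
    """Build an EFetch URL returning full PubMed records as XML."""
    params = {"db": "pubmed", "retmode": "xml",
              "id": ",".join(str(p) for p in pmids)}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params, safe=',')}"


def linked_ids(elink_xml):
    """Return the linked IDs in an ELink XML response, duplicates removed."""
    seen, ordered = set(), []
    for link in ET.fromstring(elink_xml).iter("Link"):
        linked = link.findtext("Id")
        if linked and linked not in seen:
            seen.add(linked)
            ordered.append(linked)
    return ordered
```

A caller would fetch elink_url([gi], "nuccore", "pubmed", "nuccore_pubmed"),
pass the response to linked_ids, repeat with "pubmed_pubmed" for the related
records, and hand the merged ID list to efetch_url.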

Another example would be fetching all related records out to some limit
(i.e. a PubMed tree of depth N, or a cloud of at most N records).
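
One way to read "tree of depth N" is a breadth-first expansion over the
"related" links that stops after N hops or once a size cap is reached. The
sketch below assumes a caller-supplied function get_related(pmid) that returns
the related PubMed IDs for one record (for example, by querying NCBI); that
function, and the names related_cloud, max_depth and max_size, are hypothetical.

```python
from collections import deque


def related_cloud(seeds, get_related, max_depth, max_size=None):
    """Breadth-first walk over related records.

    Returns a dict mapping each PubMed ID reached to its depth from the
    nearest seed. Each ID is visited once, so duplicates never accumulate.
    """
    depth_of = {}
    queue = deque()
    for seed in seeds:
        if seed not in depth_of:
            depth_of[seed] = 0
            queue.append(seed)

    while queue:
        pmid = queue.popleft()
        depth = depth_of[pmid]
        if depth >= max_depth:
            continue  # do not expand records already at the depth limit
        for neighbour in get_related(pmid):
            if neighbour in depth_of:
                continue  # already seen at the same or a shallower depth
            if max_size is not None and len(depth_of) >= max_size:
                return depth_of  # cloud is full
            depth_of[neighbour] = depth + 1
            queue.append(neighbour)
    return depth_of
```

Because get_related is passed in, the traversal can be tried on a toy
dictionary of links before it is pointed at a real service.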

I think that one can use NCBI's fetch interface to do this (one could do it
by having NCBI email you all of the PubMed results and having an email
harvester collect those results, parse them, and set up a new set of
queries).  Of course, this seems like an overhead-intensive way to do it.
Given that increasing amounts of information are becoming open to
the public, one could even consider parsing the published papers and
supplemental files (e.g. XLS tables) for genes of interest (as it seems the
authors of most work, as well as the PubMed record processors, fail to provide
or research the gene name information that is supposed to be in the PubMed
records).
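
If the records are fetched as PubMed XML, pulling out the abstracts and MeSH
headings is mostly a matter of walking the XML tree. A minimal sketch, assuming
the usual PubmedArticle / MedlineCitation layout of that XML; the function name
parse_pubmed_xml is hypothetical, and MeSH qualifiers and structured abstract
labels are ignored.

```python
import xml.etree.ElementTree as ET


def parse_pubmed_xml(xml_text):
    """Map each PMID to its abstract text and MeSH descriptor names."""
    records = {}
    for article in ET.fromstring(xml_text).iter("PubmedArticle"):
        citation = article.find("MedlineCitation")
        if citation is None:
            continue
        pmid = citation.findtext("PMID")
        # An abstract may be split into several AbstractText sections, and
        # each section may contain inline markup, so gather all nested text.
        sections = [
            "".join(node.itertext()).strip()
            for node in citation.iterfind("Article/Abstract/AbstractText")
        ]
        mesh = [
            desc.text
            for desc in citation.iterfind(
                "MeshHeadingList/MeshHeading/DescriptorName")
            if desc.text
        ]
        records[pmid] = {"abstract": " ".join(sections), "mesh": mesh}
    return records
```

Records without an abstract or without MeSH headings come back with an empty
string or an empty list, so downstream code can treat every record alike.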

Now it may simply be that I lack sufficient experience with the
BioPerl documentation and am unaware of the functions/tools which do this
type of thing.  So if anyone has any hints/pointers, they would be
appreciated.

Thanks,
Robert Bradbury