[BioRuby] count parameter in Bio::PubMed.esearch

Toshiaki Katayama ktym at hgc.jp
Sat Nov 10 08:40:09 UTC 2007


Hi Kaustubh,

Thank you for your suggestion. I have applied your changes to CVS.

During this process, I found that the previous fix applied by Jan was wrong.
Developers, please run the tests before you commit your changes. :)

The change should be made to the Bio::PubMed.query method;
however, the search method also needs to be rewritten
because the HTML structure returned by NCBI has been reformatted.

Anyway, in the Bio::PubMed module, the esearch/efetch method pair is
strongly recommended over the search/query method pair.


bioruby> Bio::PubMed.search("(genome AND analysis) OR bioinformatics)")
  ==> ["17989981", "17989975", "17989954", "17989953", "17989781", "17989717", "17989252", "17989247", "17989233", "17989226", "17989095", "17989092", "17989061", "17989054", "17988782", "17988704", "17988577", "17988401", "17988398", "17988368"]

bioruby> Bio::PubMed.esearch("(genome AND analysis) OR bioinformatics)")
  ==> ["17989981", "17989975", "17989954", "17989953", "17989781", "17989717", "17989252", "17989247", "17989233", "17989226", "17989095", "17989092", "17989061", "17989054", "17988782", "17988704", "17988577", "17988401", "17988398", "17988368", "17988176", "17988086", "17987666", "17987374", "17987257", "17987048", "17986781", "17986522", "17986471", "17986460", "17986440", "17986356", "17986355", "17986329", "17986320", "17986282", "17986185", "17986079", "17985162", "17984568", "17984549", "17984548", "17984520", "17984228", "17984226", "17984208", "17984205", "17984085", "17984084", "17984080", "17983847", "17983807", "17983802", "17983573", "17983493", "17983269", "17983268", "17983157", "17982457", "17982456", "17982442", "17982427", "17982176", "17982123", "17981990", "17981981", "17981974", "17981891", "17981844", "17981816", "17981801", "17981746", "17981579", "17981546", "17981477", "17981060", "17981052", "17980519", "17980517", "17980477", "17980146", "17980047", "17980028", "17980019", "17979886", "17979725", "17979297", "17979181", "17978887", "17978880", "17978572", "17978498", "17978310", "17978184", "17978179", "17977886", "17977881", "17977850", "17977831", "17977670"]

bioruby> Bio::PubMed.esearch("(genome AND analysis) OR bioinformatics)", {'rettype' => 'count'})
  ==> 286139
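
To see what the patched parsing step does without querying NCBI, here is a
minimal sketch. The helper name parse_esearch and both sample XML strings are
hypothetical, made up for illustration; the actual code in pubmed.rb may differ.

```ruby
# Hypothetical helper that mirrors the patched parsing step.
# This is NOT the actual pubmed.rb code, only an offline illustration.
def parse_esearch(xml, options = {})
  if options['rettype'] == 'count'
    # With rettype=count, take the first <Count> element and convert it.
    count = xml[/<Count>(.*?)<\/Count>/m, 1]
    count && count.to_i
  else
    # Otherwise collect every <Id> element as a string.
    xml.scan(/<Id>(.*?)<\/Id>/m).flatten
  end
end

# Made-up sample responses (assumed shape of the esearch XML).
COUNT_XML = "<eSearchResult><Count>286139</Count></eSearchResult>"
IDS_XML = <<XML
<eSearchResult>
  <Count>286139</Count>
  <IdList>
    <Id>17989981</Id>
    <Id>17989975</Id>
  </IdList>
</eSearchResult>
XML

p parse_esearch(COUNT_XML, { 'rettype' => 'count' })  # => 286139
p parse_esearch(IDS_XML)                              # => ["17989981", "17989975"]
```

Converting the count with to_i inside the helper is a choice made for this
sketch, so that the caller gets an integer directly.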

Regards,
Toshiaki Katayama


On 2007/11/06, at 19:00, Kaustubh Patil wrote:

> Hi,
>
> Here is a suggestion/feature for Bio::PubMed.esearch.
>
> Currently it is not possible to use rettype=count (through options hash) in Bio::PubMed.esearch.
>
> To get this feature, replace the following line in pubmed.rb (approx. line 97)
>
> result = result.scan(/<Id>(.*?)<\/Id>/m).flatten
>
> with
>
> if hash['rettype'] == "count"
>   # rettype=count: keep only the first <Count> value
>   result = result.scan(/<Count>(.*?)<\/Count>/m).flatten
>   result = result[0]
> else
>   # default: collect all PubMed IDs
>   result = result.scan(/<Id>(.*?)<\/Id>/m).flatten
> end
>
>
> With this change, esearch returns the count as a string, which can easily be converted to an integer with "result.to_i".
>
> I hope it is useful.
>
> Cheers,
> Kaustubh Patil
>
> PS: For more details on Entrez esearch parameters, please refer to:
>
> http://www.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html
