[BioRuby] Batch Entrez search

Marc Hoeppner marc.hoeppner at molbio.su.se
Mon Aug 24 05:36:48 UTC 2009


Hi,

I suppose for FlyBase genes you could also use the ruby Ensembl API.

Something like:

require 'ensembl'


# Connect to the public Ensembl core database for fly, release 55
Ensembl::Core::DBConnection.connect('drosophila_melanogaster', 55)

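IO.foreach('my_infile') do |line|

    flybase_id = line.chomp   # IO.foreach keeps the trailing newline
    gene = Ensembl::Core::Gene.find_by_stable_id(flybase_id)
    next if gene.nil?         # skip IDs that were not found

    gene.all_xrefs.each do |xref|

        puts xref

    end

end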

Well, you get the idea. The methods are well documented in the 
Ensembl API docs, and if in doubt I can offer some help, too.

P.S.: To make it really easy, you could also use BioMart on 
www.ensembl.org, unless you need this to be a script.

Cheers,

Marc
> Hi,
>
>> Is there a way to batch search with BioRuby for a whole bunch of FlyBase
>> IDs?
>
>
> According to
> http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helpgene&part=genefaq
> the data for entrez gene are available in
> ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz
>
> If you need a lot of them, perhaps it's better to download that file 
> (about 93 Mbytes).
> (It also contains data for other organisms, which you do not need, but 
> it does not take forever.)
> The format seems simple enough that you can easily get the gene ID for 
> a FlyBase ID.
>
> #Format: tax_id GeneID Symbol LocusTag Synonyms dbXrefs chromosome 
> map_location description type_of_gene 
> Symbol_from_nomenclature_authority Full_name_from_nomenclature_authority 
> Nomenclature_status Other_designations Modification_date 
> (tab is used as a separator, pound sign - start of a comment)
>
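
For the gene_info route quoted above, a rough sketch in plain Ruby could
look like the following. It only relies on the column order given in the
"#Format:" line. The method name flybase_to_gene_ids is made up for this
example, and it assumes that entries in the dbXrefs column are separated
by "|" and written as "FLYBASE:FBgn...", which you should check against
the file you actually download.

```ruby
require 'set'

# Column positions, counted from 0, as listed in the "#Format:" header line.
GENE_ID_COL = 1
DBXREFS_COL = 5

# Hypothetical helper: returns a Hash mapping each wanted FlyBase ID to its
# Entrez GeneID. `lines` is anything that yields one line per #each call.
# Assumption: dbXrefs looks like "FLYBASE:FBgn0000001|OTHER:123".
def flybase_to_gene_ids(lines, wanted_ids, prefix = 'FLYBASE:')
  wanted = wanted_ids.to_set
  result = {}
  lines.each do |line|
    next if line.start_with?('#')            # pound sign starts a comment
    fields = line.chomp.split("\t")          # tab is the separator
    next if fields.size <= DBXREFS_COL       # ignore short or malformed rows
    fields[DBXREFS_COL].split('|').each do |xref|
      next unless xref.start_with?(prefix)
      id = xref[prefix.length..-1]
      result[id] = fields[GENE_ID_COL] if wanted.include?(id)
    end
  end
  result
end
```

To run it on the compressed file without unpacking it first, something
like Zlib::GzipReader.open('gene_info.gz') { |gz| flybase_to_gene_ids(gz, ids) }
should work after require 'zlib', since GzipReader yields lines from #each.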


-- 

Marc P. Hoeppner
PhD student
Department of Molecular Biology and Functional Genomics
Stockholm University, 10691 Stockholm, Sweden

marc.hoeppner at molbio.su.se
Tel: +46 (0)8 - 164195



