[Bioperl-l] to convert cDNA id of nucleotide database to gene acc.id of gene database of ncbi

Sean Davis sdavis2 at mail.nih.gov
Mon Nov 20 13:31:48 UTC 2006


On Monday 20 November 2006 07:45, bikash lohia wrote:
> Hello group, I am new to this group and would like some help. I have a list
> of accession IDs for rice (Oryza sativa), such as AK070197, AK105331, etc.
> At present I have to search the NCBI Gene database manually to convert each
> cDNA accession number (e.g. AK070197) to the corresponding Oryza sativa gene
> ID (Os*******). I want to do this with a Perl program that reads the list of
> IDs (such as AK105331, AK070197) directly from a Notepad text file, searches
> the NCBI Gene database, and returns only the Os******* accession ID that
> corresponds to each AK****** ID. For example, AK070197 in the nucleotide
> database = Os02g0669100 in the Gene database. I want to convert all of these
> AK****** IDs to Os******* IDs programmatically in Perl/BioPerl, since doing
> it manually is not feasible for a long list. Please help; I have no idea
> what the code should look like. With thanks in advance, Bikash

There is some useful data at:

ftp://ftp.ncbi.nlm.nih.gov/gene/DATA

The README file contains details of each file.  The gene2accession.gz file 
contains GenBank accession numbers and maps them back to Entrez Gene IDs.  The 
gene_info.gz file contains the Entrez Gene summary information for all Entrez 
Gene records.  With these two files (loaded into appropriate Perl hashes), your 
task can be completed relatively easily: look up each AK accession in the first 
file to get its Gene ID, then look up that Gene ID in the second.  
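As a rough illustration of that two-file lookup, here is a minimal sketch. It is written in Python rather than Perl purely for illustration; the two dictionaries play the role of the Perl hashes. The column positions, the idea that the Os******* identifier sits in the LocusTag column of gene_info, the one-ID-per-line input format, and the function names are all assumptions on my part, so check them against the README before trusting the output.

```python
import gzip

# Assumed 0-based column positions; confirm them against the README in
# ftp://ftp.ncbi.nlm.nih.gov/gene/DATA before relying on this.
G2A_GENE_ID = 1     # gene2accession: GeneID
G2A_RNA_ACC = 3     # gene2accession: RNA nucleotide accession.version
INFO_GENE_ID = 1    # gene_info: GeneID
INFO_LOCUS_TAG = 3  # gene_info: LocusTag (assumed to hold the Os... ID)


def rows(lines):
    """Yield the tab-separated fields of every non-comment, non-blank line."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line and not line.startswith("#"):
            yield line.split("\t")


def normalise(accession):
    """Trim whitespace, upper-case, and drop any '.version' suffix."""
    return accession.strip().upper().split(".")[0]


def convert(accessions, g2a_lines, info_lines):
    """Return {accession: locus tag or None} for each input accession."""
    wanted = [normalise(a) for a in accessions if a.strip()]
    wanted_set = set(wanted)

    # Pass 1: accession -> GeneID (the first matching row wins).
    acc_to_gene = {}
    for fields in rows(g2a_lines):
        if len(fields) > G2A_RNA_ACC:
            acc = normalise(fields[G2A_RNA_ACC])
            if acc in wanted_set:
                acc_to_gene.setdefault(acc, fields[G2A_GENE_ID])

    # Pass 2: GeneID -> locus tag, only for the genes found in pass 1.
    needed = set(acc_to_gene.values())
    gene_to_tag = {}
    for fields in rows(info_lines):
        if len(fields) > INFO_LOCUS_TAG and fields[INFO_GENE_ID] in needed:
            tag = fields[INFO_LOCUS_TAG]
            if tag != "-":  # "-" marks a missing value
                gene_to_tag[fields[INFO_GENE_ID]] = tag

    # Accessions with no match in either file map to None.
    return {a: gene_to_tag.get(acc_to_gene.get(a)) for a in wanted}


def convert_files(id_path, g2a_path="gene2accession.gz",
                  info_path="gene_info.gz"):
    """Read IDs (one per line) and the two gzipped NCBI files from disk."""
    with open(id_path) as ids, \
         gzip.open(g2a_path, "rt") as g2a, \
         gzip.open(info_path, "rt") as info:
        return convert(ids, g2a, info)
```

Calling convert_files("ids.txt"), where ids.txt is a hypothetical file with one accession per line, returns a dictionary keyed by accession.  Both NCBI files are streamed line by line and only the rows for the requested accessions are kept, so the full files never need to fit in memory.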

Alternatively, you could use the eUtils modules, which I think are available 
only via CVS.
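
For the eUtils route, the sketch below does not use the BioPerl modules at all; it calls the NCBI E-utilities ESearch service directly over HTTP, again in Python for illustration only.  It assumes that a cDNA accession such as AK070197 works as a search term in the Gene database, which I have not verified for every record.  The function names are my own.  Note that it returns numeric Entrez Gene IDs, so a further lookup (for example in gene_info.gz) is still needed to get the Os******* identifier.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(accession, db="gene"):
    """Build an ESearch URL that looks for the accession in the given database."""
    query = urllib.parse.urlencode(
        {"db": db, "term": accession.strip().upper()})
    return ESEARCH + "?" + query


def parse_ids(xml_data):
    """Extract the <Id> values from the <IdList> of an ESearch XML reply."""
    root = ET.fromstring(xml_data)
    return [elem.text for elem in root.findall("./IdList/Id")]


def gene_ids_for(accession):
    """Query NCBI (network access required) and return matching Gene IDs."""
    with urllib.request.urlopen(esearch_url(accession)) as response:
        return parse_ids(response.read())
```

For a long list it would be polite to pause between requests, since NCBI asks clients to limit their request rate; for thousands of accessions the downloaded files are likely the better option.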

Sean


