[Bioperl-l] Sequence IDs

Robson Francisco de Souza rfsouza at citri.iq.usp.br
Thu Mar 25 20:48:51 EST 2004


This is not quite a bioperl question, but maybe there is some useful
code in bioperl that can ease the task below.

I'm trying to automatically track GI changes for a sequence in Genbank,
so that I can list all GI numbers associated with a gene in a complete
genome from Genbank, EMBL or Kegg. I need this to compare results from
the Kegg Orthologs database (GenomeNet), the COG database (NCBI) and the
String server from EMBL.
Although I was able to download Kegg release 29, the GI numbers for genes
in this release are outdated and do not agree with the COG GIs. Does
anybody know of a way to retrieve all GIs (PID) for every gene in the
complete genomes?

Also, is there a unique ID or accession key for protein/DNA sequences
that is used by all the main sequence databases (Genbank, SP/TrEMBL,
DDBJ)? I couldn't find an easy way to cross-reference results from
analyses that employ identifiers from different databases.

Thanks for any help.
