[Bioperl-l] from protein to nucleotide

Jack Chen chenn at cshl.edu
Fri Jun 13 10:52:57 EDT 2003


Here is another question about GenBank sequence objects. Is there a
convenient way to retrieve the nucleotide sequence for a protein with a
known GI number? For example, for protein gi|497063|gb|AAB60473.1|, how
should I get its corresponding nucleotide sequence? I can do it manually by
visiting the NCBI page and following the links, but is there a way to do
this automatically?

Also, does anyone know how to parse a GenPept sequence object to get the
'DBSOURCE' field? For example, how can I get the accession number
'U05729.1' from the following record? Thanks!

LOCUS       AAB60473                  39 aa            linear   ROD
DEFINITION  preproinsulin I.
VERSION     AAB60473.1  GI:497063
DBSOURCE    locus MSU05729 accession U05729.1
SOURCE      Mus spretus (western wild mouse)
  ORGANISM  Mus spretus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
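
To make the second question concrete, here is a minimal sketch of the
result I am after, written as plain-text parsing in Python rather than
with BioPerl. It assumes the accession always follows the word
'accession' on the first DBSOURCE line, which may not hold for records
whose DBSOURCE entry is formatted differently or wraps onto continuation
lines. The function name dbsource_accession is made up for illustration
and is not part of any library.

```python
import re

# The relevant header lines of the GenPept record quoted above.
RECORD = """\
LOCUS       AAB60473                  39 aa            linear   ROD
DEFINITION  preproinsulin I.
VERSION     AAB60473.1  GI:497063
DBSOURCE    locus MSU05729 accession U05729.1
SOURCE      Mus spretus (western wild mouse)
"""


def dbsource_accession(record_text):
    """Return the accession named on the DBSOURCE line, or None.

    Assumption: the accession is the token right after the word
    'accession' on the first line that starts with 'DBSOURCE'.
    """
    for line in record_text.splitlines():
        if line.startswith("DBSOURCE"):
            match = re.search(r"\baccession\s+(\S+)", line)
            return match.group(1) if match else None
    return None


accession = dbsource_accession(RECORD)
print(accession)  # prints U05729.1
```

What I would like is the equivalent of this through the sequence object
itself, so that the accession can then be used to fetch the nucleotide
record.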


|    o-o     Jack Chen, Stein Laboratory       |
|    o---o   Cold Spring Harbor Laboratory     |
|  o----o    1 Bungtown Road                   |
| O----O     Cold Spring Harbor, NY, 11724     |
| 0--o       Tel: 1 516 367 8394               |
|   O        Website: http://www.wormbase.org  |
|  o-o       e-mail: chenn at cshl.org            |

