[Biojava-dev] Getting Protein 3D structure PDB file from Entrez RefSeq protein ID

Ambikesh Jayal ambi1999 at gmail.com
Mon Nov 22 17:24:16 UTC 2010


Hi,

Sub: Getting Protein 3D structure PDB file from Entrez RefSeq protein ID

1. I have the Entrez Gene ID and RefSeq protein ID of some proteins and I
would like to get their 3D tertiary structure in the form of a PDB file. Any
suggestions as to how I can get it?

2. As an example, can you please send me the PDB structure for the protein
with RefSeq protein ID NP_004788 (gene ID 9370)?

3. Also, should the PDB structure contain the exact sequence of amino acids
of the protein that I am searching for? My understanding is that it should.
I understand that a single PDB structure can contain more than one protein,
so my guess is that there should be at least one chain in the PDB structure
that matches the protein I am searching for exactly. Am I right?

4. My approach to getting the PDB structure from a RefSeq protein ID is as
follows.
  a) Search with the RefSeq protein ID in Entrez and get the linear amino
acid sequence of the protein.
  b) Search with the RefSeq protein ID in UniProt. Generally, the UniProt
entry has a 3D structure for the protein which links to the SWISS-MODEL
Repository. The SWISS-MODEL Repository allows the structure to be downloaded
in PDB format and also gives the RCSB PDB ID. But it reports a sequence
identity value which is less than 100%. For example, for RefSeq protein ID
"NP_004788" (UniProt ID Q15848), the SWISS-MODEL Repository 3D structure is
based on template "1c3h" and the sequence identity is 91%. How is this value
derived?
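
To make points 3 and 4 concrete, here is a minimal sketch in plain Java (no
BioJava calls) of the two checks involved: finding chains whose sequence is
exactly the query sequence, and computing a percent identity from two
sequences that have already been aligned. The class and method names are
hypothetical, and the identity convention used here (identical residues
divided by the aligned columns where neither sequence has a gap) is only an
assumption; it may not be the convention SWISS-MODEL uses for its reported
value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of BioJava.
final class SequenceCheck {

    private SequenceCheck() {}

    /**
     * Returns the IDs of all chains whose sequence equals the query
     * sequence exactly (ignoring letter case).
     */
    static List<String> exactChainMatches(String query,
                                          Map<String, String> chainSequences) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : chainSequences.entrySet()) {
            if (entry.getValue().equalsIgnoreCase(query)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    /**
     * Percent identity of two pre-aligned sequences of equal length, where
     * '-' marks a gap. Assumed convention: columns with a gap in either
     * sequence are skipped, and the denominator is the number of remaining
     * columns.
     */
    static double percentIdentity(String alignedA, String alignedB) {
        if (alignedA.length() != alignedB.length()) {
            throw new IllegalArgumentException(
                    "Aligned sequences must have the same length");
        }
        int compared = 0;
        int identical = 0;
        for (int i = 0; i < alignedA.length(); i++) {
            char a = Character.toUpperCase(alignedA.charAt(i));
            char b = Character.toUpperCase(alignedB.charAt(i));
            if (a == '-' || b == '-') {
                continue; // gap column: not counted under this convention
            }
            compared++;
            if (a == b) {
                identical++;
            }
        }
        return compared == 0 ? 0.0 : 100.0 * identical / compared;
    }
}
```

Under this convention, a value below 100% simply means that some aligned
residues of the template chain differ from the query protein, in which case
exactChainMatches would return no chain for that structure.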

Many thanks.
Ambi.


