[Biopython] Problem with pdb-file parsing
Christian Schäfer
schafer at rostlab.org
Tue Sep 8 17:45:53 UTC 2009
Hi,
I don't know whether this is a bug or whether I did something wrong. I am
parsing the PDB structure 1a2d with the following code to get the
one-letter polypeptide sequence for chain A:
------------------CODE----------------
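from Bio.PDB.PDBParser import PDBParser
from Bio.PDB.Polypeptide import PPBuilder

parser = PDBParser()
ppb = PPBuilder()
# Parse the file and take chain A of the first model
structure = parser.get_structure('tmp', '1a2d.pdb')
# build_peptides returns a list of polypeptides
polypeptide = ppb.build_peptides(structure[0]['A'])
# One-letter sequence of the first polypeptide only
sequence = str(polypeptide[0].get_sequence())
print(sequence)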
------------------CODE----------------
This, however, gives me a sequence that is one amino acid shorter than
expected. The structure contains one HETATM block within the ATOM block
of chain A (position 117), which gets translated into an 'X' in the sequence.
The following amino acid at position 118 (VAL) seems to be missing.
So the resulting sequence around the X is:
...VEXMK...
To my understanding this should be:
...VEXVMK...
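To check whether the residue is actually present in the file, something like
the following sketch could list the residues of a chain straight from the
ATOM/HETATM records, without going through Biopython. The names
chain_residues and make_line, and the three-residue fragment, are made up
for illustration (the residue name MOD is a placeholder, not the real hetero
residue of 1a2d); only the column positions are taken from the PDB format.

```python
# Sketch: list the residues of one chain from raw PDB records (stdlib only).
# Column positions follow the fixed-width ATOM/HETATM layout of the PDB format.

def chain_residues(pdb_lines, chain_id):
    """Return one (record, resname, resseq, icode) tuple per residue, in file order."""
    residues = []
    seen = set()
    for line in pdb_lines:
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        if line[21:22] != chain_id:          # column 22: chain identifier
            continue
        resname = line[17:20].strip()        # columns 18-20: residue name
        resseq = int(line[22:26])            # columns 23-26: residue number
        icode = line[26:27].strip()          # column 27: insertion code
        key = (resname, resseq, icode)
        if key not in seen:                  # report each residue only once
            seen.add(key)
            residues.append((record, resname, resseq, icode))
    return residues


def make_line(record, serial, atom, resname, chain, resseq):
    """Hypothetical helper: build the first 27 columns of an ATOM/HETATM line."""
    return "%-6s%5d %-4s %3s %1s%4d " % (record, serial, atom, resname, chain, resseq)


# Made-up fragment mimicking the situation described (NOT real 1a2d data):
# a HETATM residue at 117 sitting between two ATOM residues.
example = [
    make_line("ATOM", 1, " N  ", "GLU", "A", 116),
    make_line("ATOM", 2, " CA ", "GLU", "A", 116),
    make_line("HETATM", 3, " N  ", "MOD", "A", 117),
    make_line("HETATM", 4, " CA ", "MOD", "A", 117),
    make_line("ATOM", 5, " N  ", "VAL", "A", 118),
    make_line("ATOM", 6, " CA ", "VAL", "A", 118),
]

for residue in chain_residues(example, "A"):
    print(residue)
```

Calling chain_residues(open('1a2d.pdb'), 'A') on the real file would show
whether residue 118 is there as an ATOM record, which would help to tell a
parsing problem from a peptide-building one.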
Is this behaviour intended, or is it a bug? The Biopython version is 1.49
(Ubuntu Jaunty).
Chris