[Biojava-l] Question about StructureTools and PDBFileReader

Andreas Prlic ap3 at sanger.ac.uk
Sat Jan 12 00:32:30 UTC 2008


Hi Martin,

I am not sure what you mean by "not counted". When I test
the PDB file you posted below, it parses all three amino acids
into Groups, which is the intended behaviour.

// inStream is an InputStream opened on the PDB file
PDBFileParser pdbpars = new PDBFileParser();
Structure structure = pdbpars.parsePDBFile(inStream);
System.out.println(structure);
Chain c = structure.getChainByPDB("A");
// the groups built from the ATOM records of chain A
List<Group> groups = c.getAtomGroups();
for (Group g : groups) {
    System.out.println(g);
}
System.out.println("sequence: " + c.getAtomSequence());

gives an output of:

structure  null DepDate: Thu Jan 01 01:00:00 GMT 1970 Resolution: 0.0 
ModDate: Thu Jan 01 01:00:00 GMT 1970  chains:
chain: >A<
  length SEQRES: 0 length ATOM: 3 aminos: 3 hetatms: 0 nucleotides: 0
DBRefs: 0
Molecules:

AminoAcid ATOM:GLN Q 27 true ATOMatoms: 7
AminoAcid ATOM:SER S 1027 true ATOMatoms: 6
AminoAcid ATOM:LEU L 2027 true ATOMatoms: 8
sequence: QSL
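
The residue numbers 27, 1027 and 2027 in this output follow from the
fixed-column layout of ATOM records: the chain ID sits in column 22 and
the residue number in columns 23-26, so "A1027" is chain A, residue 1027.
A minimal sketch of that column slicing in plain Java is shown here. It
is not BioJava's parser, and the class and method names are hypothetical:

```java
// Illustration only: fixed-column slicing of a PDB ATOM record.
// This is not BioJava's parser; class and method names are hypothetical.
final class PdbColumns {

    // Residue name: columns 18-20 (1-based), e.g. "SER".
    static String residueName(String atomLine) {
        return atomLine.substring(17, 20).trim();
    }

    // Chain identifier: column 22 (1-based), a single character.
    static String chainId(String atomLine) {
        return atomLine.substring(21, 22);
    }

    // Residue sequence number: columns 23-26 (1-based), right-justified.
    static int residueNumber(String atomLine) {
        return Integer.parseInt(atomLine.substring(22, 26).trim());
    }

    public static void main(String[] args) {
        // First ATOM record of each residue from the file quoted in this thread.
        String[] lines = {
            "ATOM   3505  N   GLN A  27      32.144  27.054   0.696  1.00 47.70           N",
            "ATOM   3514  N   SER A1027      31.762  25.988  -2.703  1.00 33.42           N",
            "ATOM   3520  N   LEU A2027      28.619  24.838  -3.718  1.00 25.68           N"
        };
        for (String line : lines) {
            System.out.println(residueName(line) + " chain " + chainId(line)
                    + " residue " + residueNumber(line));
        }
    }
}
```

Run on these three records, it prints chain A for each line, with residue
numbers 27, 1027 and 2027.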

If you want to access the groups as SEQRES groups, then
these residues need to be listed in the corresponding SEQRES header lines of the
file. See also http://biojava.org/wiki/BioJava:CookBook:PDB:seqres
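
For orientation, a SEQRES record lists the residue names of a chain in
fixed columns, independently of the ATOM records. The sketch below reads
one such record in plain Java. It is not BioJava code; the class and
method names are hypothetical, and the record is a made-up three-residue
example, not a line from the file discussed here:

```java
// Illustration only: reading the residue names from one SEQRES record.
// Not BioJava code; the class, the method names and the record are hypothetical.
final class SeqresLine {

    // Number of residues declared for the chain: columns 14-17 (1-based).
    static int declaredLength(String seqresLine) {
        return Integer.parseInt(seqresLine.substring(13, 17).trim());
    }

    // Residue names start at column 20 (1-based) and are whitespace-separated.
    static String[] residueNames(String seqresLine) {
        return seqresLine.substring(19).trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Hypothetical record for a chain A with three residues.
        String record = "SEQRES   1 A    3  GLN SER LEU";
        System.out.println(declaredLength(record) + " residues: "
                + String.join(" ", residueNames(record)));
    }
}
```

A file without such records has nothing to build SEQRES groups from,
which matches the "length SEQRES: 0" shown in the output above.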

Does that help?

Andreas


--------------------------------------------------

Andreas Prlic      Wellcome Trust Sanger Institute
                    Hinxton, Cambridge CB10 1SA, UK



On Fri, 11 Jan 2008, Martin Heusel wrote:

> Hi,
>
> i read a PDB file with PDBFileReader.getStructure and want to extract
> the backbone of a chain with StructureTools. Now i have seen that for
> entries e.g.
>
> ATOM   3505  N   GLN A  27      32.144  27.054   0.696  1.00 47.70           N
> ATOM   3506  CA  GLN A  27      32.507  26.162  -0.401  1.00 42.73           C
> ATOM   3507  C   GLN A  27      31.388  26.137  -1.437  1.00 40.44           C
> ATOM   3508  O   GLN A  27      30.205  26.248  -1.096  1.00 41.47           O
> ATOM   3509  CB  GLN A  27      32.729  24.738   0.121  1.00 42.51           C
> ATOM   3510  CG  GLN A  27      34.124  24.449   0.611  1.00 39.02           C
> ATOM   3511  CD  GLN A  27      34.158  23.301   1.593  1.00 41.90           C
> ATOM   3512  OE1 GLN A  27      33.982  22.143   1.214  1.00 39.58           O
> ATOM   3513  NE2 GLN A  27      34.386  23.615   2.869  1.00 43.85           N
> ATOM   3514  N   SER A1027      31.762  25.988  -2.703  1.00 33.42           N
> ATOM   3515  CA  SER A1027      30.776  25.929  -3.769  1.00 31.11           C
> ATOM   3516  C   SER A1027      29.915  24.723  -3.462  1.00 27.99           C
> ATOM   3517  O   SER A1027      30.418  23.706  -2.991  1.00 29.25           O
> ATOM   3518  CB  SER A1027      31.449  25.746  -5.130  1.00 22.71           C
> ATOM   3519  OG  SER A1027      30.542  25.185  -6.056  1.00 28.95           O
> ATOM   3520  N   LEU A2027      28.619  24.838  -3.718  1.00 25.68           N
> ATOM   3521  CA  LEU A2027      27.714  23.743  -3.444  1.00 23.42           C
> ATOM   3522  C   LEU A2027      27.489  22.933  -4.694  1.00 24.27           C
> ATOM   3523  O   LEU A2027      26.750  21.950  -4.675  1.00 28.95           O
> ATOM   3524  CB  LEU A2027      26.391  24.278  -2.906  1.00 23.54           C
> ATOM   3525  CG  LEU A2027      26.547  25.090  -1.619  1.00 22.93           C
> ATOM   3526  CD1 LEU A2027      25.179  25.430  -1.056  1.00 26.74           C
> ATOM   3527  CD2 LEU A2027      27.361  24.285  -0.603  1.00 19.54           C
>
> the two residues SER and LEU are not counted. However, the fasta file
> from pdb.org website shows both residues for that chain. I wonder how
> the two entries A1027 and A2027 are interpreted by StructureTools.
>
> Thanks for any hints
>
> Martin

