[EMBOSS] pdbparse format
Dr J.C. Ison
jison at hgmp.mrc.ac.uk
Tue Feb 1 13:02:13 UTC 2005
Hi Peter
The output of pdbparse is documented (the pages should be moving to
SourceForge very soon):
http://www.rfcgr.mrc.ac.uk/Software/EMBOSS/Apps/domainatrix/pdbparse.html
The format of the CO (coordinate line) *is* likely to change soon. At the
moment it includes both ATOM and RESIDUE-specific data. In a future release
these data will be separated into different line types (this is to keep a referee
of the pdbparse ms. happy).
Any changes will be documented, and the code for reading and writing the new
format will be updated in AJAX. (If there's a Java equivalent of "sscanf", I'd
use that rather than rely on column positions.)
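
For illustration only, here is a minimal Java sketch of the "sscanf"-style
approach using java.util.Scanner. It assumes the input is just the fragment of
the line covered by the format string quoted below (1-letter code, 3-letter
code, atom name, x, y, z); the rest of the real line is snipped in this thread
and is not handled. The class and field names (CoFragmentScan, AtomFields) are
hypothetical, not part of AJAX or pdbparse.

```java
import java.util.Locale;
import java.util.Scanner;

class CoFragmentScan {
    /** Holder for the six fields named in the quoted ajFmtPrintF call. */
    static final class AtomFields {
        final char id1;
        final String id3;
        final String atm;
        final double x, y, z;

        AtomFields(char id1, String id3, String atm, double x, double y, double z) {
            this.id1 = id1;
            this.id3 = id3;
            this.atm = atm;
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    /** Reads whitespace-separated tokens, much as sscanf("%s %s %s %lf %lf %lf") would. */
    static AtomFields parse(String fragment) {
        // Locale.ROOT so that '.' is always treated as the decimal separator.
        try (Scanner sc = new Scanner(fragment).useLocale(Locale.ROOT)) {
            char id1 = sc.next().charAt(0);
            String id3 = sc.next();
            String atm = sc.next();
            double x = sc.nextDouble();
            double y = sc.nextDouble();
            double z = sc.nextDouble();
            return new AtomFields(id1, id3, atm, x, y, z);
        }
    }

    public static void main(String[] args) {
        // Hypothetical fragment laid out like the quoted format string.
        AtomFields f = parse("A    ALA CA    12.345   -3.210    0.500");
        System.out.println(f.id1 + " " + f.id3 + " " + f.atm + " " + f.x + " " + f.y + " " + f.z);
    }
}
```

One caveat with token-based reading: it relies on whitespace between fields. A
4-character atom name followed by an 8-character coordinate would leave no gap
under %-4S%8.3f, and the Scanner would then fail on the fused token.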
You understand the format correctly. The 8.3 does look a bit odd but it's
fine (it would have been neater to use %-3S%9.3f instead of %-4S%8.3f, but there
you go ... :)
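
To make the width arithmetic concrete: both pairs of specifiers occupy 12
columns (4 + 8 and 3 + 9). The small Java sketch below uses String.format as a
stand-in for ajFmtPrintF, on the assumption that AJAX's %S pads strings the way
Java's %s does; the class and method names are hypothetical.

```java
import java.util.Locale;

class FieldWidthDemo {
    /** Layout as in the quoted format string: atom name in 4 columns, x in 8. */
    static String current(String atom, double x) {
        return String.format(Locale.ROOT, "%-4s%8.3f", atom, x);
    }

    /** The alternative mentioned above: atom name in 3 columns, x in 9. */
    static String alternative(String atom, double x) {
        return String.format(Locale.ROOT, "%-3s%9.3f", atom, x);
    }

    public static void main(String[] args) {
        // Brackets make the padding visible.
        System.out.println("[" + current("CA", 12.345) + "]");
        System.out.println("[" + alternative("CA", 12.345) + "]");
        // With a 4-character atom name the two layouts no longer coincide,
        // because a width is a minimum, not a truncation.
        System.out.println("[" + current("HD21", 12.345) + "]");
        System.out.println("[" + alternative("HD21", 12.345) + "]");
    }
}
```

In this Java analogue the two layouts give identical text only while the atom
name has at most three characters.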
Plus - if you have any suggestions / requirements for pdbparse, please let me
know.
Thanks for the interest.
Cheers
Jon
Peter Robinson wrote:
>
> Hi list,
>
> I have been looking into the new pdbparse program and am writing a
> routine to parse the output of that program into a Java program I am
> writing to do some datamining on PDB files. For the moment at least I am
> writing a parse routine based on column positions rather than on regular
> expressions or some other approach.
> I would appreciate some help in understanding the output format. I have
> been looking at the function ajPdbWriteAll.
>
> ajFmtPrintF(outf, "%-2c%6S %-4S%8.3f%9.3f%9.3f<snip>,
> tmp->Id1,
> tmp->Id3,
> tmp->Atm,
> tmp->X,
> tmp->Y,
> tmp->Z,
> <snip>
>
>
> this should print out the
> "%-2c": 1char AA (left-justified, 2 spaces)
> "%6S ": 3charAA (right-justified, 6 spaces, followed by 4 spaces)
> "%-4S": AtomType (e.g., CA) (left-justified, 4 spaces)
> "%8.3f" x coord, 8 spaces with 3 precision
> "%9.3f" y coord 9 spaces with 3 precision
> "%9.3f" z coord 9 spaces with 3 precision
>
> ==> If I am reading this correctly, why does the x coord have one less
> space than y and z?
> ==> Can one consider this output format fixed, or will it be subject to
> changes?
> ==> Excuse what may seem like a trivial question, but I don't want my
> program to break down "at an embarrassing moment".
>
> Thanks,
> Peter
>
> --
> Peter N. Robinson
> peter.robinson at t-online.de
> peter.robinson at charite.de
> http://www.charite.de/ch/medgen/robinson/
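
The column-position approach described in the quoted message could be sketched
in Java roughly as follows. The widths come from the quoted format string; the
gap after the 3-letter code is left as a parameter because the format string as
archived shows one space while the quoted note mentions four. The sketch only
covers the fragment of the line shown in the thread, and the names
(CoFragmentColumns, slice) are hypothetical.

```java
import java.util.Locale;

class CoFragmentColumns {
    // Field widths read off "%-2c%6S %-4S%8.3f%9.3f%9.3f".
    static final int W_ID1 = 2, W_ID3 = 6, W_ATM = 4, W_X = 8, W_Y = 9, W_Z = 9;

    /**
     * Cuts the fragment into {id1, id3, atm, x, y, z} by column position.
     * gap is the number of literal spaces after the 3-letter code (an assumption).
     */
    static String[] slice(String fragment, int gap) {
        int[] widths = {W_ID1, W_ID3 + gap, W_ATM, W_X, W_Y, W_Z};
        String[] out = new String[widths.length];
        int pos = 0;
        for (int i = 0; i < widths.length; i++) {
            int end = pos + widths[i];
            if (end > fragment.length()) {
                throw new IllegalArgumentException("fragment too short: " + fragment.length());
            }
            out[i] = fragment.substring(pos, end).trim();
            pos = end;
        }
        return out;
    }

    public static void main(String[] args) {
        // Build a hypothetical fragment with Java's formatter as a stand-in for ajFmtPrintF.
        String line = String.format(Locale.ROOT, "%-2c%6s %-4s%8.3f%9.3f%9.3f",
                'A', "ALA", "HD21", -123.456, -3.21, 0.5);
        String[] f = slice(line, 1);
        double x = Double.parseDouble(f[3]);
        System.out.println(String.join("|", f) + " x=" + x);
    }
}
```

Unlike whitespace splitting, slicing by column still separates an atom name
and a coordinate that touch, but it breaks as soon as any field width changes.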
--
Jon C. Ison, PhD
Proteomics Applications Group
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
Tel: +44 1223 494500 Fax: +44 1223 494512
E-mail: jison at rfcgr.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk