[EMBOSS] pdbparse format

Dr J.C. Ison jison at hgmp.mrc.ac.uk
Tue Feb 1 13:02:13 UTC 2005


Hi Peter

The output of pdbparse is documented (the pages should be moving to 
SourceForge very soon):
http://www.rfcgr.mrc.ac.uk/Software/EMBOSS/Apps/domainatrix/pdbparse.html

The format of the CO (coordinate) line *is* likely to change soon.  At the 
moment it includes both ATOM- and RESIDUE-specific data.  In a future release 
these data will be split across separate line types (this is to keep a referee
of the pdbparse ms. happy).

Any changes will be documented, and the code for reading and writing the new 
format will be updated in AJAX (if there's a Java equivalent of "sscanf" I'd 
use that rather than rely on column positions).
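
Java has no direct sscanf, but splitting on whitespace comes close. Here is
a minimal sketch, assuming the line fragment holds only the six fields from
the format string quoted below and that every field is non-blank. The class
and method names are made up for illustration and are not part of AJAX. One
caveat: a 4-character atom name followed by an x value that needs all 8
columns leaves no space between the two, and this approach would then fail.

```java
import java.util.Locale;

public class CoFragmentTokens {
    /** The six fields of the quoted format fragment (hypothetical holder). */
    public static final class Fields {
        public final char id1;
        public final String id3;
        public final String atm;
        public final double x, y, z;

        Fields(char id1, String id3, String atm, double x, double y, double z) {
            this.id1 = id1; this.id3 = id3; this.atm = atm;
            this.x = x; this.y = y; this.z = z;
        }
    }

    /** Splits on runs of whitespace instead of relying on column positions. */
    public static Fields parse(String fragment) {
        String[] t = fragment.trim().split("\\s+");
        if (t.length != 6 || t[0].length() != 1) {
            throw new IllegalArgumentException("unexpected fragment: " + fragment);
        }
        return new Fields(t[0].charAt(0), t[1], t[2],
                Double.parseDouble(t[3]),
                Double.parseDouble(t[4]),
                Double.parseDouble(t[5]));
    }

    public static void main(String[] args) {
        // Java's %s stands in for the AJAX %S used in ajPdbWriteAll.
        String sample = String.format(Locale.ROOT,
                "%-2c%6s    %-4s%8.3f%9.3f%9.3f",
                'A', "ALA", "CA", 12.345, -6.789, 101.112);
        Fields f = parse(sample);
        System.out.println(f.id3 + " " + f.atm + " " + f.x + " " + f.y + " " + f.z);
    }
}
```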

You understand the format correctly.  The 8.3 does look a bit odd but it's 
fine (it would have been neater to use %-3S%9.3f instead of %-4S%8.3f but 
there you go ... :)
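
To see why the two variants are interchangeable, here is a small sketch that
uses Java's Formatter as a stand-in for ajFmtPrintF (an assumption: the AJAX
width and precision rules are taken to match the usual printf ones). Both
forms span 12 columns and give the same text as long as the atom name has
at most 3 characters and x fits in 8 columns. The class and method names are
hypothetical.

```java
import java.util.Locale;

public class WidthCheck {
    /** Current layout: atom name in 4 columns, x in 8. */
    public static String current(String atm, double x) {
        return String.format(Locale.ROOT, "%-4s%8.3f", atm, x);
    }

    /** Alternative layout: atom name in 3 columns, x in 9. */
    public static String alternative(String atm, double x) {
        return String.format(Locale.ROOT, "%-3s%9.3f", atm, x);
    }

    public static void main(String[] args) {
        for (String atm : new String[] {"N", "CA", "OXT"}) {
            // Brackets make the padding visible.
            System.out.println("[" + current(atm, 12.345) + "] ["
                    + alternative(atm, 12.345) + "]");
        }
    }
}
```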

Plus - if you have any suggestions / requirements for pdbparse, please let me 
know.

Cheers

Jon




Thanks for the interest.

Jon

Peter Robinson wrote:
> 
> Hi list,
> 
> I have been looking into the new pdbparse program and am writing a
> routine to parse the output of that program into a Java program I am
> writing to do some datamining on PDB files. For the moment at least I am
> writing a parse routine based on column positions rather than on regular
> expressions or some other approach.
> I would appreciate some help in understanding the output format. I have
> been looking at the function ajPdbWriteAll.
> 
> ajFmtPrintF(outf, "%-2c%6S    %-4S%8.3f%9.3f%9.3f<snip>,
>                                     tmp->Id1,
>                                     tmp->Id3,
>                                     tmp->Atm,
>                                     tmp->X,
>                                     tmp->Y,
>                                     tmp->Z,
>                                     <snip>
> 
> 
> this should print out the
> "%-2c":  1-char AA (left-justified, width 2)
> "%6S    ": 3-char AA (right-justified, width 6, followed by 4 spaces)
> "%-4S": AtomType (e.g., CA) (left-justified, width 4)
> "%8.3f" x coord, width 8 with 3 decimal places
> "%9.3f" y coord, width 9 with 3 decimal places
> "%9.3f" z coord, width 9 with 3 decimal places
> 
> ==> If I am reading this correctly, why does the x coord have one less
> space than y and z?
> ==> Can one consider this output format fixed, or will it be subject to
> changes?
> ==> Excuse what may seem like a trivial question, but I don't want my
> program to break down "at an embarrassing moment".
> 
> Thanks,
> Peter
> 
> --
> Peter N. Robinson
> peter.robinson at t-online.de
> peter.robinson at charite.de
> http://www.charite.de/ch/medgen/robinson/
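
The field widths listed in the quoted message above can also be sliced by
position, which still works when the atom name and x coordinate touch. This
is only a sketch: the offsets are relative to the start of the quoted format
fragment (the real CO line has other fields around it), they assume no field
overflows its width, and the class and method names are hypothetical.

```java
import java.util.Locale;

public class CoFragmentColumns {
    /**
     * Returns {id1, id3, atm, x, y, z} as trimmed strings.
     * Columns: id1 [0,2), id3 [2,8), 4 spaces [8,12), atm [12,16),
     * x [16,24), y [24,33), z [33,42).
     */
    public static String[] slice(String fragment) {
        if (fragment.length() < 42) {
            throw new IllegalArgumentException("fragment shorter than 42 columns");
        }
        return new String[] {
            fragment.substring(0, 2).trim(),
            fragment.substring(2, 8).trim(),
            fragment.substring(12, 16).trim(),
            fragment.substring(16, 24).trim(),
            fragment.substring(24, 33).trim(),
            fragment.substring(33, 42).trim()
        };
    }

    public static void main(String[] args) {
        // A 4-character atom name and an 8-column x value leave no gap.
        String sample = String.format(Locale.ROOT,
                "%-2c%6s    %-4s%8.3f%9.3f%9.3f",
                'A', "ALA", "HD11", -123.456, -6.789, 101.112);
        System.out.println(String.join("|", slice(sample)));
    }
}
```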


-- 
Jon C. Ison, PhD
Proteomics Applications Group
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
Tel: +44 1223 494500  Fax: +44 1223 494512
E-mail: jison at rfcgr.mrc.ac.uk  Web: http://www.rfcgr.mrc.ac.uk


