Bioperl: UPDB version 0.3
Andrew Dalke
dalke@bioreason.com
Fri, 04 Sep 1998 06:28:57 -0700
Hello again,
If anyone is interested, I have released UPDB version 0.3 (the
PDB parser generator I was talking about last week). It is
temporarily located at:
ftp://ftp.ks.uiuc.edu/pub/group/dalke/UPDB-0.3.tar.gz
It includes generated Python parsers for PDB format versions
"1" and "2.1" (which is essentially the same as "2.2").
The Python distribution also includes a "master" parser that
auto-detects which version to use.
Excepting a known problem with some MASTER records (the problem
appears to be in the PDB files), this parser has successfully
translated every record in the "aa" directory of the PDB from a
line of text to a data structure and back to a line of text.
I've spot-checked the parsed data and have not found any problems.
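To make the round-trip test concrete, here is a minimal sketch for a single ATOM/HETATM record. It is not UPDB's generated code: the function names unpack_atom and pack_atom and the dictionary keys are made up for illustration, the column positions are the ones in the published PDB format, and the sketch assumes columns 12, 21 and 28-30 are blank. It is written for a current Python 3 interpreter.

```python
# Hypothetical round-trip sketch for one ATOM/HETATM record.
# Column positions follow the published PDB format; names are illustrative.

def unpack_atom(line):
    """Turn one ATOM/HETATM line into a dict of typed fields."""
    line = line.rstrip("\n").ljust(80)   # pad short lines to 80 columns
    return {
        "type": line[0:6],               # record name, six columns wide
        "serial": int(line[6:11]),
        "name": line[12:16],
        "altLoc": line[16],
        "resName": line[17:20],
        "chainID": line[21],
        "resSeq": int(line[22:26]),
        "iCode": line[26],
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "tempFactor": float(line[60:66]),
        "rest": line[66:80],             # columns 67-80, kept verbatim
    }

def pack_atom(rec):
    """Turn the dict back into an 80-column line of text."""
    return "%s%5d %s%s%s %s%4d%s   %8.3f%8.3f%8.3f%6.2f%6.2f%s" % (
        rec["type"], rec["serial"], rec["name"], rec["altLoc"],
        rec["resName"], rec["chainID"], rec["resSeq"], rec["iCode"],
        rec["x"], rec["y"], rec["z"],
        rec["occupancy"], rec["tempFactor"], rec["rest"],
    )

# A sample line built piece by piece so the column widths are visible.
SAMPLE = (
    "ATOM  " "    1" " " " N  " " " "MET" " " "A" "   1" " " "   "
    "  27.340" "  24.430" "   2.614" "  1.00" "  9.67" "           N  "
)

def round_trips(line):
    """True if text -> data structure -> text reproduces the line."""
    expected = line.rstrip("\n").ljust(80)
    return pack_atom(unpack_atom(line)) == expected
```

Calling round_trips(SAMPLE) checks one record; running the same check over every line of every file in a directory is the kind of test described above.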
UPDB includes generated Perl parsers for the two versions, but
they aren't packaged as a module and there is no "master" parser.
The Perl parsers also do not convert some fields to an appropriate
data type (e.g., REMARK 2 "resolution" is a "Real(5.2)" that
conserves the number of significant digits, so 3.00 != 3.0).
Someone has offered to help modernize the Perl code.
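As an illustration of why the digit-preserving type matters, here is a minimal sketch using Python's standard decimal module. It is not how UPDB implements "Real(5.2)". It assumes such a field is a five-column real, and the helper names unpack_real and pack_real are hypothetical. A plain float forgets how many digits were written, while a Decimal remembers them.

```python
from decimal import Decimal

def unpack_real(text):
    """Parse a fixed-width real field, keeping the digits as written."""
    return Decimal(text.strip())

def pack_real(value, width):
    """Right-justify the value in the field using its original digits."""
    return str(value).rjust(width)

a = unpack_real(" 3.00")
b = unpack_real("  3.0")

# Numerically the two values are equal ...
same_number = (a == b)
# ... but they carry different numbers of digits, so they pack differently.
same_text = (pack_real(a, 5) == pack_real(b, 5))

# A float round trip cannot tell them apart and rewrites "  3.0" as " 3.00".
float_text = "%5.2f" % float("  3.0")
```

Here same_number is true and same_text is false, which is the sense in which 3.00 and 3.0 are different field values even though they are the same number.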
UPDB includes format descriptions for the UCSF "USER"
extensions and the Raster3D "COLOUR" extension but I have
not tested them.
EXAMPLES:
======= Python: =============
from UPDB.Parser import Parser
import fileinput

p = Parser()
x = 0.0; count = 0
for line in fileinput.input(["test.pdb"]):
    rec = p.unpack(line)
    # Record names are six columns wide: "ATOM  " and "HETATM"
    if rec['type'] == "ATOM  " or rec['type'] == "HETATM":
        x = x + rec['x']
        count = count + 1
print "Average 'x' value =", x/count
==========================
which prints:
> Average 'x' value = 26.4187435897
======= Perl: =============
require "UPDB/Version2_1.pl";

open(INFILE, "<test.pdb");
$x = 0.0;
$count = 0;
while (<INFILE>) {
    $rec = &pdb_unpack($_);
    # Record names are six columns wide: "ATOM  " and "HETATM"
    if ($rec->{'type'} eq "ATOM  " || $rec->{'type'} eq "HETATM") {
        $x += $rec->{'x'};
        $count++;
    }
}
print "Average 'x' value = ", $x/$count, "\n";
==========================
which prints:
> Average 'x' value = 26.4187435897436
UPDB is written in Python and is distributed under the GNU
General Public License, but the output of the program is not
under the GPL and can be included in other software without
affecting other types of licenses.
Please try it out. I would be interested in hearing
feedback, but I am on vacation from essentially now until
the 14th so I won't be able to respond for a while.
Andrew Dalke
dalke@bioreason.com