[Bioperl-l] dssp script
Robinson, Peter
peter.robinson at charite.de
Fri Jan 21 08:30:14 EST 2005
Dear BioPerlers,
I am writing a script to use the BioPerl DSSP module to print out a list of phi and psi angles for all applicable residues of all chains. Although the results are correct, I get the following error message at the end of each chain:
Argument "" isn't numeric in numeric eq (==) at /usr/local/share/perl/5.8.4/Bio/Structure/SecStr/DSSP/Res.pm line 1168.
and I am not quite sure where it is coming from. Perhaps I am using the wrong part of the API, but I am trying to get a list of all residues for each chain as follows:
foreach my $ch (@chains) {
my $ss_elements_pts = $dssp->secBounds($ch);
print "Chain $ch:\n";
my $pos = 0;
my $max = 0;
foreach my $stretch (@{$ss_elements_pts}) {
my $start = $stretch->[0];
my $end = $stretch->[1];
if ($end =~ m/(\d+)/) { $end = $1; }
if ($end > $max) { $max = $end; }
}
## END is now the last residue in this chain
for my $res (1..$max) {
my $residueID = $res . ":" . $ch;
my ($phi,$psi,$SS,$SSsum,$AA);
eval { $phi = $dssp->resPhi($residueID);};
etc.
The full script is appended to the bottom of this mail.
I also noticed what might be a minor bug in the module DSSP/Res.pm; when I use dsspcmbi to analyze a PDB file, it produces a results file with an empty last line. This causes a crash:
Use of uninitialized value in chomp at /usr/local/share/perl/5.8.4/Bio/Structure/SecStr/DSSP/Res.pm line 1284, <GEN1> line 955.
If I manually remove this last empty line, there was no error. By adding the following line at Res.pm l.1284, you can fix the problem:
while ( chomp( $cur = <$file> ) ) {
next if ($cur =~ m/^\s*$/); *********************************************
$res_num = substr( $cur, 0, 5 );
$res_num =~ s/\s//g;
$self->{ 'Res' }->[ $res_num ] = &_parseResLine( $cur );
}
}
Thanks in adavance for any tips! Peter
Peter N. Robinson, M.D.
Institute of Medical Genetics
Charité University Hospital
Augustenburger Platz 1
13353 Berlin
Germany
++49-30-450 569124
peter.robinson at charite.de
http://www.charite.de/ch/medgen/robinson
Beware of bugs in the above code; I have only proved it correct, not tried it. -Donald Knuth, computer scientist (1938- )
########################
#!/usr/bin/perl -w
use IO::File;
use Bio::Structure::SecStr::DSSP::Res;
use Data::Dumper;
=pod
parseDSSP.pl
Script to parse the output of DSSP using the BioPerl module
Bio::Structure::SecStr::DSSP::Res. To use it, process a PDB
file with dssp or dsspcmbi, and pass the resulting file to
this script. For more information on dssp and BioPerl see the
module documentation at http://bioperl.org
@email peter.robinson at charite.de
21 January, 2005
=cut
my $file = "pdb43ca.dssp";
my $dssp = new Bio::Structure::SecStr::DSSP::Res('-file'=> "$file");
my $pdbID = $dssp->pdbID();
my $auth = $dssp->pdbAuthor();
my $cmpd = $dssp->pdbCompound();
my $pdb_date = $dssp->pdbDate();
my $header = $dssp->pdbHeader();
my $pdbSource = $dssp->pdbSource();
print "PDB entry $pdbID \n\tauthor:\t$auth",
"\n\tCompound:\t$cmpd",
"\n\tDate:\t$pdb_date",
"\n\tHeader:\t$header",
"\n\tsource:\t$pdbSource\n\n";
my $totalRes = $dssp->numResidues();
print "Total residue count (all chains):$totalRes\n";
my $surArea= $dssp->totSurfArea();
print "Total accessible surface area:\t$surArea (square Ang)\n";
my $chainRef = $dssp->chains();
my @chains = sort @{$chainRef};
print "Chain[s]:\n";
foreach my $ch (@chains) {
print "\t$ch";
}
print "\n";
my $hb = $dssp->hBonds();
print "H BONDS.\n";
print "TYPE O(I)-->H-N(J): $hb->[0]\n",
"IN PARALLEL BRIDGES: $hb->[1]\n",
"IN ANTIPARALLEL BRIDGES $hb->[2]\n",
"TYPE O(I)-->H-N(I-5) $hb->[3]\n",
"TYPE O(I)-->H-N(I-4) $hb->[4]\n",
"TYPE O(I)-->H-N(I-3) $hb->[5]\n",
"TYPE O(I)-->H-N(I-2) $hb->[6]\n",
"TYPE O(I)-->H-N(I-1) $hb->[7]\n",
"TYPE O(I)-->H-N(I+0) $hb->[8]\n",
"TYPE O(I)-->H-N(I+1) $hb->[9]\n",
"TYPE O(I)-->H-N(I+2) $hb->[10]\n",
"TYPE O(I)-->H-N(I+3) $hb->[11]\n",
"TYPE O(I)-->H-N(I+4) $hb->[12]\n",
"TYPE O(I)-->H-N(I+5) $hb->[13]\n",
"\n";
foreach my $ch (@chains) {
my $ss_elements_pts = $dssp->secBounds($ch);
print "Chain $ch:\n";
my $pos = 0;
my $max = 0;
foreach my $stretch (@{$ss_elements_pts}) {
my $start = $stretch->[0];
my $end = $stretch->[1];
if ($end =~ m/(\d+)/) { $end = $1; }
if ($end > $max) { $max = $end; }
}
## END is now the last residue in this chain
for my $res (1..$max) {
my $residueID = $res . ":" . $ch;
my ($phi,$psi,$SS,$SSsum,$AA);
eval { $phi = $dssp->resPhi($residueID);};
eval { $psi = $dssp->resPsi($residueID);};
eval { $SS = $dssp->resSecStr($residueID);};
eval { $SSsum = $dssp->resSecStrSum($residueID);};
$AA = $dssp->resAA($residueID);
$phi = $phi || "n/a";
$psi = $psi || "n/a";
$SS = $SS || "-";
my $SSclass;
if ($SSsum eq "H") { $SSclass = "helix"; }
elsif ($SSsum eq "T") { $SSclass = "turn"; }
elsif ($SSsum eq "B") { $SSclass = "beta"; }
else { $SSclass = $SSsum; }
print "$residueID) [$AA] phi:$phi psi:$psi SecStruct: $SS ($SSclass) \n";
}
}
More information about the Bioperl-l
mailing list