WG: [Bioperl-l] dssp script
Robinson, Peter
peter.robinson at charite.de
Mon Feb 14 05:30:16 EST 2005
Thanks Ed,
the iterator seems to work perfectly. The module is quite useful for me. It may be that I am doing something wrong, but it seems as if there may be one remaining bug. If you look at line 1206 $key_num = $arg_str;
For some files it seems as if $key_num gets assigned a string such as "110:" rather than the integer 110. This messes us line 1215
if ($key_num == $self->{'Res'}->[$i]-{'pdb_resnum'})
I added
$key_num =~ s/://;
after line 1206, which fixed the problem.
Thanks again, peter
Peter N. Robinson, M.D.
Institute of Medical Genetics
Charité University Hospital
Augustenburger Platz 1
13353 Berlin
Germany
++49-30-450 569124
peter.robinson at charite.de
http://www.charite.de/ch/medgen/robinson
The best editor for MS Word documents: /dev/null
-----Ursprüngliche Nachricht-----
Von: Ed Green [mailto:ed at compbio.berkeley.edu]
Gesendet: Dienstag, 1. Februar 2005 00:08
An: ed at compbio.berkeley.edu
Cc: Robinson, Peter; bioperl-l at portal.open-bio.org
Betreff: Re: [Bioperl-l] dssp script
Peter-
Just checked in these fixes to Bio::Structure::SecStr::DSSP:Res.pm
You may find the new residues() interator method useful.
Regards,
Ed Green
Ed Green wrote:
> Dear Peter,
>
> These two are in fact bugs that I will fix. The first results because of
> the presence of 'termination residues' that don't have residue numbers.
> Their residue numbers, then, can't be compared numerically. Fortunately,
> this bug won't result in wrong results as we want this comparison to
> always be false anyway. The solution to this is to first check if either
> of the termination residue signals are set and if so, don't do this
> numerical comparison.
>
> The second, blank line(s) at end of file will also be fixed.
>
> Beware that there is, I think, a bug in your script. It appears that you
> are attempting to iterate over all residues. However, iterating A:1 ..
> A:max doesn't get it done because of the crazy way residues can be
> numbered in PDB files: you'll miss all the residues with altloc codes
> (A:27A, A:27B, A:27C, e.g.).
>
> To make this easy an iterator is called for. It will just return all
> 'real' residues for the pdb file or for a specified chain - I'll try to
> get that done this weekend.
>
> Regards,
> Ed Green
>
> Robinson, Peter wrote:
>
>> Dear BioPerlers,
>>
>> I am writing a script to use the BioPerl DSSP module to print out a
>> list of phi and psi angles for all applicable residues of all chains.
>> Although the results are correct, I get the following error message at
>> the end of each chain:
>>
>> Argument "" isn't numeric in numeric eq (==) at
>> /usr/local/share/perl/5.8.4/Bio/Structure/SecStr/DSSP/Res.pm line 1168.
>>
>> and I am not quite sure where it is coming from. Perhaps I am using
>> the wrong part of the API, but I am trying to get a list of all
>> residues for each chain as follows:
>>
>> foreach my $ch (@chains) {
>> my $ss_elements_pts = $dssp->secBounds($ch);
>> print "Chain $ch:\n";
>> my $pos = 0;
>> my $max = 0;
>> foreach my $stretch (@{$ss_elements_pts}) {
>> my $start = $stretch->[0];
>> my $end = $stretch->[1]; if ($end =~ m/(\d+)/) { $end = $1; }
>> if ($end > $max) { $max = $end; }
>> }
>> ## END is now the last residue in this chain
>> for my $res (1..$max) {
>> my $residueID = $res . ":" . $ch;
>> my ($phi,$psi,$SS,$SSsum,$AA);
>> eval { $phi = $dssp->resPhi($residueID);};
>> etc.
>>
>> The full script is appended to the bottom of this mail.
>>
>>
>> I also noticed what might be a minor bug in the module DSSP/Res.pm;
>> when I use dsspcmbi to analyze a PDB file, it produces a results file
>> with an empty last line. This causes a crash:
>>
>> Use of uninitialized value in chomp at
>> /usr/local/share/perl/5.8.4/Bio/Structure/SecStr/DSSP/Res.pm line
>> 1284, <GEN1> line 955.
>>
>>
>> If I manually remove this last empty line, there was no error. By
>> adding the following line at Res.pm l.1284, you can fix the problem:
>>
>>
>> while ( chomp( $cur = <$file> ) ) {
>> next if ($cur =~ m/^\s*$/);
>> *********************************************
>> $res_num = substr( $cur, 0, 5 );
>> $res_num =~ s/\s//g;
>> $self->{ 'Res' }->[ $res_num ] = &_parseResLine( $cur );
>> }
>> }
>>
>>
>>
>>
>> Thanks in adavance for any tips! Peter
>> Peter N. Robinson, M.D.
>> Institute of Medical Genetics
>> Charité University Hospital
>> Augustenburger Platz 1
>> 13353 Berlin
>> Germany
>> ++49-30-450 569124
>> peter.robinson at charite.de
>> http://www.charite.de/ch/medgen/robinson
>> Beware of bugs in the above code; I have only proved it correct, not
>> tried it. -Donald Knuth, computer scientist (1938- )
>>
>> ########################
>>
>> #!/usr/bin/perl -w
>> use IO::File;
>> use Bio::Structure::SecStr::DSSP::Res;
>> use Data::Dumper;
>>
>>
>> =pod
>> parseDSSP.pl
>> Script to parse the output of DSSP using the BioPerl module
>> Bio::Structure::SecStr::DSSP::Res. To use it, process a PDB
>> file with dssp or dsspcmbi, and pass the resulting file to this
>> script. For more information on dssp and BioPerl see the
>> module documentation at http://bioperl.org
>>
>> @email peter.robinson at charite.de
>> 21 January, 2005
>>
>> =cut
>>
>>
>>
>> my $file = "pdb43ca.dssp";
>> my $dssp = new Bio::Structure::SecStr::DSSP::Res('-file'=> "$file");
>>
>> my $pdbID = $dssp->pdbID();
>> my $auth = $dssp->pdbAuthor();
>> my $cmpd = $dssp->pdbCompound();
>> my $pdb_date = $dssp->pdbDate();
>> my $header = $dssp->pdbHeader();
>> my $pdbSource = $dssp->pdbSource();
>>
>> print "PDB entry $pdbID \n\tauthor:\t$auth",
>> "\n\tCompound:\t$cmpd",
>> "\n\tDate:\t$pdb_date",
>> "\n\tHeader:\t$header",
>> "\n\tsource:\t$pdbSource\n\n";
>>
>> my $totalRes = $dssp->numResidues();
>> print "Total residue count (all chains):$totalRes\n";
>>
>>
>> my $surArea= $dssp->totSurfArea();
>> print "Total accessible surface area:\t$surArea (square Ang)\n";
>>
>>
>> my $chainRef = $dssp->chains();
>> my @chains = sort @{$chainRef};
>> print "Chain[s]:\n";
>> foreach my $ch (@chains) {
>> print "\t$ch";
>> }
>> print "\n";
>>
>> my $hb = $dssp->hBonds();
>> print "H BONDS.\n";
>> print "TYPE O(I)-->H-N(J): $hb->[0]\n",
>> "IN PARALLEL BRIDGES: $hb->[1]\n",
>> "IN ANTIPARALLEL BRIDGES $hb->[2]\n",
>> "TYPE O(I)-->H-N(I-5) $hb->[3]\n",
>> "TYPE O(I)-->H-N(I-4) $hb->[4]\n",
>> "TYPE O(I)-->H-N(I-3) $hb->[5]\n",
>> "TYPE O(I)-->H-N(I-2) $hb->[6]\n",
>> "TYPE O(I)-->H-N(I-1) $hb->[7]\n",
>> "TYPE O(I)-->H-N(I+0) $hb->[8]\n",
>> "TYPE O(I)-->H-N(I+1) $hb->[9]\n",
>> "TYPE O(I)-->H-N(I+2) $hb->[10]\n",
>> "TYPE O(I)-->H-N(I+3) $hb->[11]\n",
>> "TYPE O(I)-->H-N(I+4) $hb->[12]\n",
>> "TYPE O(I)-->H-N(I+5) $hb->[13]\n",
>> "\n";
>>
>>
>> foreach my $ch (@chains) {
>> my $ss_elements_pts = $dssp->secBounds($ch);
>> print "Chain $ch:\n";
>> my $pos = 0;
>> my $max = 0;
>> foreach my $stretch (@{$ss_elements_pts}) {
>> my $start = $stretch->[0];
>> my $end = $stretch->[1]; if ($end =~ m/(\d+)/) { $end = $1; }
>> if ($end > $max) { $max = $end; }
>> }
>> ## END is now the last residue in this chain
>> for my $res (1..$max) {
>> my $residueID = $res . ":" . $ch;
>> my ($phi,$psi,$SS,$SSsum,$AA);
>> eval { $phi = $dssp->resPhi($residueID);};
>> eval { $psi = $dssp->resPsi($residueID);};
>> eval { $SS = $dssp->resSecStr($residueID);};
>> eval { $SSsum = $dssp->resSecStrSum($residueID);};
>> $AA = $dssp->resAA($residueID);
>> $phi = $phi || "n/a";
>> $psi = $psi || "n/a";
>> $SS = $SS || "-";
>> my $SSclass;
>> if ($SSsum eq "H") { $SSclass = "helix"; }
>> elsif ($SSsum eq "T") { $SSclass = "turn"; }
>> elsif ($SSsum eq "B") { $SSclass = "beta"; }
>> else { $SSclass = $SSsum; }
>> print "$residueID) [$AA] phi:$phi psi:$psi SecStruct: $SS
>> ($SSclass) \n";
>> }
>> }
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
More information about the Bioperl-l
mailing list