[BioRuby] bio.pdb doubt

Naohisa GOTO ngoto at gen-info.osaka-u.ac.jp
Wed Feb 20 13:53:53 UTC 2008


Dear Shameer,

Chain information for each macromolecule is described in the
'COMPND' record. In BioRuby, the Bio::PDB#record method can be used
to read it. Because the data returned by the method is sometimes
too raw to use directly, some further processing is needed.

Below is a sample program:

  require 'bio'

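  # Parse the 'COMPND' record into an array of hashes, one per molecule.
  # Each hash maps the COMPND tokens (e.g. 'MOLECULE') to their values,
  # plus :chains, an array of chain IDs.
  def parse_COMPND(pdb)
    molecules = []
    current_molecule = nil
    pdb.record('COMPND')[0].compound.each do |a|
      # a is a [token, value] pair.
      # Start a new molecule at each MOL_ID, and also when a token
      # appears before any MOL_ID (avoids calling []= on nil).
      if a[0] == 'MOL_ID' or current_molecule.nil?
        current_molecule = {}
        molecules.push current_molecule
      end
      if a[0] == 'CHAIN'
        current_molecule[:chains] = a[1].to_s.strip.split(/\s*,\s*/)
      end
      current_molecule[a[0]] = a[1]
    end
    molecules
  end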

  pdb1 = Bio::FlatFile.open('pdb1fjg.ent') { |f| f.next_entry }
  pdb2 = Bio::FlatFile.open('pdb1a0d.ent') { |f| f.next_entry }

  [ pdb1, pdb2 ].each do |pdb|
    compounds = parse_COMPND(pdb)
    compounds.each do |c|
      p c['MOLECULE']
      p c[:chains]
    end
  end
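
To answer the multichain/single chain question directly, the result
of parse_COMPND could be reduced to a yes/no check. The following is
only a rough sketch: the helper names chain_ids and multichain? are
made up for illustration and are not part of BioRuby, and the sample
hashes are hand-written in the shape parse_COMPND returns, not taken
from a real PDB entry.

```ruby
# Hypothetical helpers (not BioRuby API). They work on the array of
# hashes returned by parse_COMPND, where each hash may have a :chains
# array of chain IDs.

# Collect the distinct chain IDs across all molecules.
def chain_ids(molecules)
  molecules.flat_map { |m| m[:chains] || [] }.uniq
end

# True when the entry lists more than one chain in total.
def multichain?(molecules)
  chain_ids(molecules).size > 1
end

# Hand-written sample data in the shape parse_COMPND returns
# (illustrative only, not real PDB content).
single = [ { 'MOLECULE' => 'EXAMPLE PROTEIN', :chains => ['A'] } ]
multi  = [ { 'MOLECULE' => 'EXAMPLE RNA',     :chains => ['A'] },
           { 'MOLECULE' => 'EXAMPLE PROTEIN', :chains => ['B', 'C'] } ]

p multichain?(single)   # prints false
p multichain?(multi)    # prints true
```

With a real entry, multichain?(parse_COMPND(pdb)) should give the
same kind of answer, provided the 'COMPND' record of that entry
contains CHAIN tokens.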

The meaning of the 'COMPND' record is described in the
PDB file format document:
http://www.wwpdb.org/documentation/format23/sect2.html#COMPND

-- 
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp /ng at bioruby.org


On Mon, 18 Feb 2008 10:46:03 +0530 (IST)
"K. Shameer" <shameer at ncbs.res.in> wrote:

> Dear Naohisa and Alex,
> 
> Thanks for the links and the sample code.
> I have one more doubt :) .
> Is there any method to check whether a protein is multichain/single chain
> using BioRuby? I checked the BioRuby in Anger document and the wiki, but I
> couldn't find it (maybe I am missing something important).
> 
> Thanks,
> K. Shameer
> NCBS - TIFR
> 
> 
> > You can use Bio::PDB#find_atom or Bio::PDB#find_residue methods.
> >
> >   require 'bio'
> >
> >   # reading PDB data
> >   pdb = Bio::FlatFile.open("pdb1a0d.ent") { |f| f.next_entry }
> >
> >   # using Bio::PDB#find_atom
> >   atoms = pdb.find_atom do |atom|
> >     (atom.chainID == "A" and atom.resSeq >= 22) or
> >     (atom.chainID == "B" and atom.resSeq <= 50)
> >   end
> >   print atoms.to_s
> >
> >   print "\n"
> >
> >   # the same thing can be done by using Bio::PDB#find_residue
> >   residues = pdb.find_residue do |residue|
> >     (residue.chain.id == "A" and residue.resSeq >= 22) or
> >     (residue.chain.id == "B" and residue.resSeq <= 50)
> >   end
> >   print residues.to_s
> >
> >
> 


