[Bioperl-l] Parsing PDB entries in BioPerl
Kris Boulez
Kris.Boulez@algonomics.com
Wed, 14 Nov 2001 12:14:39 +0100
Quoting Ewan Birney (birney@ebi.ac.uk):
>
>
The following basic object layout was proposed by Ewan. It looks
reasonable to me (see small remarks).
>
> Bio::Structure::Chain.pm (a particular PDB chain)
you also need
::Model.pm (one Entry can consist of multiple models;
NMR data is like this)
> ::Entry.pm (a structure entry, containing one or more
> chains and an annotation object)
- An Entry then consists of one or more models. Default would be the first/only
model.
- I think the Entry also should contain a Bio::PrimarySeq which would give
people access to the protein sequence (to do similarity searches, ...)
- All the non-coordinate stuff from PDB can be hidden in this annotation
object.
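
To make the containment concrete, here is a rough sketch of the layout
discussed above (Entry -> Model -> Chain -> Residue -> Atom). It is written in
Python purely as a language-neutral illustration, not as proposed BioPerl
code. All class and field names are assumptions made for the example; the
plain string `sequence` stands in for a Bio::PrimarySeq and the dict
`annotation` for the annotation object.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Atom:
    name: str
    x: float
    y: float
    z: float


@dataclass
class Residue:
    name: str
    number: int
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Chain:
    chain_id: str
    residues: List[Residue] = field(default_factory=list)


@dataclass
class Model:
    model_id: int
    chains: List[Chain] = field(default_factory=list)


@dataclass
class Entry:
    entry_id: str
    models: List[Model] = field(default_factory=list)
    # Hypothetical stand-ins: a plain string instead of Bio::PrimarySeq,
    # and a dict for all the non-coordinate annotation.
    sequence: Optional[str] = None
    annotation: Dict[str, str] = field(default_factory=dict)

    def default_model(self) -> Model:
        """Return the first (or only) model, as suggested above."""
        if not self.models:
            raise ValueError("entry has no models")
        return self.models[0]
```

An X-ray entry would carry a single Model, an NMR entry several; callers who
do not care about the difference would just ask for default_model().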
> ::Residue.pm (perhaps - chain made of residues?)
- Most definitely a Residue object is needed. People in this field tend to
think in terms of residues.
- Do we store the 'non-standard' residues (e.g. haem group; 'HET'
records in PDB) in here as well? Or is the name 'residue' in people's
heads linked to "amino acid residue"? I could envision a ::HeteroGroup
object.
> ::Atom.pm (Residue made from Atoms)
- one small question here: what do people expect to happen with atoms
that have alternate locations (i.e. the position of the atom could not
be uniquely defined and n possibilities are given)? Should the first
alternate be chosen, or the one with the highest occupancy, or something else?
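
The two policies mentioned in the question can be compared with a small
sketch. This is Python for illustration only; AtomRecord and the function
names are hypothetical, and the records are assumed to belong to a single
residue, so that atoms sharing a name are alternates of each other.

```python
from typing import Dict, List, NamedTuple


class AtomRecord(NamedTuple):
    name: str         # atom name, e.g. "CB"
    altloc: str       # alternate location indicator, "" when there is none
    occupancy: float


def group_by_name(records: List[AtomRecord]) -> Dict[str, List[AtomRecord]]:
    """Group the atoms of one residue by atom name, keeping file order."""
    groups: Dict[str, List[AtomRecord]] = {}
    for rec in records:
        groups.setdefault(rec.name, []).append(rec)
    return groups


def pick_first(records: List[AtomRecord]) -> List[AtomRecord]:
    """Policy 1: keep the first alternate listed for each atom."""
    return [alts[0] for alts in group_by_name(records).values()]


def pick_highest_occupancy(records: List[AtomRecord]) -> List[AtomRecord]:
    """Policy 2: keep the alternate with the highest occupancy.

    On a tie, max() returns the first one encountered.
    """
    return [max(alts, key=lambda r: r.occupancy)
            for alts in group_by_name(records).values()]
```

Either policy throws information away, so a third option would be to keep
every alternate on the Atom and let the caller decide.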
We need a place to store the (non standard) links between
residues/hetero groups. Also things like disulfide bridges, cis peptide
bonds (CONECT, SSBOND, LINK, ... from PDB).
::Connect.pm (two (or more) Atoms are connected)
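
As an illustration of what a Connect object would have to hold, here is a
minimal sketch that reads one CONECT record. It is Python for illustration
only, and parse_conect is a hypothetical name. It assumes the fixed-column
layout of the PDB format: the atom serial number in columns 7-11, followed
by up to four bonded atom serials in columns 12-31, five columns each.

```python
from typing import List, Tuple


def parse_conect(line: str) -> Tuple[int, List[int]]:
    """Return (atom_serial, [bonded_serials]) for one CONECT record."""
    if not line.startswith("CONECT"):
        raise ValueError("not a CONECT record")
    # Five fixed-width fields of five characters, starting at column 7
    # (0-based offset 6); short lines simply yield empty trailing fields.
    fields = [line[start:start + 5].strip() for start in range(6, 31, 5)]
    serials = [int(f) for f in fields if f]
    if not serials:
        raise ValueError("CONECT record without atom serial numbers")
    return serials[0], serials[1:]
```

SSBOND and LINK records identify residues and atoms by name rather than by
serial number, so they would need their own parsing, but could end up in the
same kind of object.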
>
> ::IO.pm (Bio::Structure::IO top level file, ala SeqIO)
> ::IO::pdb (pdb IO module)
> ::IO::cif (cif IO module)
> ::IO::mmdb (mmdb IO module)
>
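
The "ala SeqIO" idea, one top-level entry point that hands the work to a
format-specific module, can be sketched as a small registry. This is Python
for illustration only; register, get_reader and read_pdb are hypothetical
names, and the pdb reader is a toy that only returns record names.

```python
from typing import Callable, Dict, Iterable, List

Reader = Callable[[Iterable[str]], List[str]]

# Maps a lower-cased format name ("pdb", "cif", ...) to its reader.
_READERS: Dict[str, Reader] = {}


def register(fmt: str) -> Callable[[Reader], Reader]:
    """Decorator that registers a reader function under a format name."""
    def wrap(func: Reader) -> Reader:
        _READERS[fmt.lower()] = func
        return func
    return wrap


@register("pdb")
def read_pdb(lines: Iterable[str]) -> List[str]:
    """Toy reader: the record name (columns 1-6) of each non-empty line."""
    return [line[:6].strip() for line in lines if line.strip()]


def get_reader(fmt: str) -> Reader:
    """Top-level entry point: look up the reader for the given format."""
    try:
        return _READERS[fmt.lower()]
    except KeyError:
        raise ValueError(f"unsupported structure format: {fmt!r}") from None
```

Adding cif or mmdb support would then mean registering one more reader,
without touching the top-level code.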
Kris,
--
Kris Boulez Tel: +32-9-241.11.00
AlgoNomics NV Fax: +32-9-241.11.02
Technologiepark 4 email: kris.boulez@algonomics.com
B 9052 Zwijnaarde http://www.algonomics.com/