[BioPython] Bio.PDB : loading Big PDB with segments
Peter (BioPython List)
biopython at maubp.freeserve.co.uk
Tue Aug 1 21:09:22 UTC 2006
Arturas Ziemys wrote:
> Hi,
>
> Those PDB files are generated by NAMD or VMD. NAMD is a molecular
> dynamics program and VMD is for structure manipulation and
> visualization. My modeled systems - and I believe the systems of others
> in MD - are big in the sense that these PDB files exceed the limits on
> resid or serials. For example, as far as I understand, unique
> identification of atoms in VMD is done with segment information and it
> has no problems with that.
>
> In my opinion those files follow the PDB format. At least I found no
> differences in the column structure or column content of the PDB. It
> seems that Bio.PDB just takes the segment identities as some record
> attached to the ATOM entry, but they play no role in making entries
> unique when records with the same serial are met in the PDB. After I
> tried to load those files, I got plenty of errors and the "duplicated"
> entries were just skipped.
It sounds like there is just too much data for the original column
widths to hold, and that Bio.PDB simply doesn't understand the
conventions being used.
Hopefully the file format will be extended officially, but I suspect
(without having looked at the data) that these NAMD/VMD files are not
following the strict PDB format.
That's not to say Bio.PDB shouldn't try and support them in permissive
mode. I think this might be a job for the module's author, Thomas
Hamelryck (who is subscribed to this mailing list).
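To make the column-width point concrete, here is a minimal sketch in plain Python (it does not use Bio.PDB). It assumes the usual fixed-width ATOM layout: serial in columns 7-11 (so at most 99999), residue number in columns 23-26 (at most 9999), chain in column 22 and, in older versions of the format, a segment ID in columns 73-76. The helper names make_atom_line, parse_atom_fields and count_duplicates are made up for illustration, and the (chain, residue number, atom name) key is only a rough stand-in for how a parser might decide two atoms are "the same", not Bio.PDB's actual logic.

```python
from collections import Counter


def make_atom_line(serial, name, resname, chain, resseq, segid):
    """Hypothetical helper: build a fixed-width ATOM line with dummy coordinates."""
    return ("ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f      %-4s"
            % (serial, name, resname, chain, resseq,
               0.0, 0.0, 0.0, 1.0, 0.0, segid))


def parse_atom_fields(line):
    """Slice the identifying fields out of one ATOM/HETATM line (0-based slices)."""
    line = line.rstrip("\n").ljust(80)
    return {
        "serial": line[6:11].strip(),    # columns 7-11, at most 99999
        "name": line[12:16].strip(),     # columns 13-16
        "resname": line[17:20].strip(),  # columns 18-20
        "chain": line[21].strip(),       # column 22
        "resseq": line[22:26].strip(),   # columns 23-26, at most 9999
        "segid": line[72:76].strip(),    # columns 73-76 (assumed segment ID field)
    }


def count_duplicates(lines, use_segid):
    """Count atoms whose identifying key was already seen earlier in the file."""
    keys = []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        fields = parse_atom_fields(line)
        key = (fields["chain"], fields["resseq"], fields["name"])
        if use_segid:
            # Prepending the segment ID separates atoms that otherwise collide.
            key = (fields["segid"],) + key
        keys.append(key)
    return sum(n - 1 for n in Counter(keys).values() if n > 1)
```

Two atoms with a blank chain, the same residue number and the same name, but different segment IDs, count as one duplicate with use_segid=False and as none with use_segid=True. That would be consistent with the "duplicated" entries you describe, although I have not checked it against your files.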
> I could do some "preprocessing" on the PDB, supplying a chain identifier
> for each segment each time I load PDB files and removing the supplied
> chain labels each time on exit. But I am interested: is there any other
> way?
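For what it's worth, the preprocessing you describe could be sketched roughly like this. It is only a sketch under assumptions: it assumes the segment ID sits in columns 73-76 and the chain ID in column 22 of each ATOM/HETATM line, it leaves every other record (including TER) untouched, and the function name assign_chains_from_segids is made up for this example. Chain IDs are a single character, so it gives up once there are more segments than available labels.

```python
import string

# Single-character labels available for the chain ID column (62 in total).
CHAIN_LABELS = string.ascii_uppercase + string.ascii_lowercase + string.digits


def assign_chains_from_segids(lines):
    """Return (new_lines, mapping) with column 22 set from the segment ID.

    Each distinct segment ID (columns 73-76) gets its own chain label,
    in order of first appearance. Non-atom lines are passed through as-is.
    """
    mapping = {}
    out = []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            out.append(line)
            continue
        newline = "\n" if line.endswith("\n") else ""
        padded = line.rstrip("\n").ljust(76)
        segid = padded[72:76].strip()
        if segid not in mapping:
            if len(mapping) >= len(CHAIN_LABELS):
                raise ValueError("more segments than single-character chain labels")
            mapping[segid] = CHAIN_LABELS[len(mapping)]
        # Overwrite column 22 (index 21) and keep every other column in place.
        out.append(padded[:21] + mapping[segid] + padded[22:] + newline)
    return out, mapping
```

The returned mapping records which label was given to which segment, so the chain column can be blanked again (or restored) when you write the structure back out.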
Can you output the data in a different file format? Does mmCIF suffer
from the same limits when dealing with large molecules?
You might also try Konrad Hinsen's Molecular Modelling Toolkit (MMTK).
In my experience it's fussier than Bio.PDB for non-standard PDB files,
but on the other hand many of its users may also use NAMD/VMD.
http://www.python.net/crew/hinsen/MMTK/
There is also the Python Macromolecular Library (mmLib) but I have never
tried it myself:
http://pymmlib.sourceforge.net/
> I could attach one as an example, but the compressed file is ~ 1 MB,
> uncompressed > 5 MB. If the size is OK, I can send a PDB
> file.
Please don't send the file to the mailing list - it would be a bit big.
I suggest you file a bug (include version numbers for Python, BioPython,
NAMD and VMD too), and then choose "create an attachment" and upload the
file - a standard compression like .zip or .tar.gz should be fine.
http://bugzilla.open-bio.org/
Thank you
Peter