[Biojava-l] PDBFileReader vs MMCIFFileReader - Parsing chain data into BioJava data structure

Phelelani Mpangase pmpangase at gmail.com
Tue Jul 2 08:13:50 UTC 2013


Hi all

I am using the MMCIFFileReader class to parse mmCIF files, which I
understand should be loaded into the same BioJava data structure as PDB
files. However, it seems that some of the data in the mmCIF files is not
loaded into the BioJava data structure.

I want to obtain some information for each of the chains in mmCIF files
(organism, expression system, etc.), but I noticed that when I call
getChains() on the structure parsed with the MMCIFFileReader and print
getHeader() for each chain, I always get "null". The code below shows the
same PDB ID parsed with both the PDBFileReader and the MMCIFFileReader:

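        // Imports needed by this snippet (BioJava 3 package names):
        // import java.util.List;
        // import org.biojava.bio.structure.Chain;
        // import org.biojava.bio.structure.Structure;
        // import org.biojava.bio.structure.io.MMCIFFileReader;
        // import org.biojava.bio.structure.io.PDBFileReader;
        // import org.biojava.bio.structure.io.StructureIOFile;

        String pdbid = "4hhb";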
        String dataDownload = "/tmp/";

        //pdb parser
        PDBFileReader pdbreader = new PDBFileReader();
        pdbreader.setAutoFetch(true);
        pdbreader.setPath(dataDownload);

        //mmCIF parser
        StructureIOFile cifreader = new MMCIFFileReader();
        cifreader.setAutoFetch(true);
        cifreader.setPath(dataDownload);

        //Parse pdb and cif structures into BioJava data structure
        Structure pdbstruc = pdbreader.getStructureById(pdbid);
        Structure cifstruc = cifreader.getStructureById(pdbid);

        System.out.println("Chain information for PDB structure " + pdbid + ":");

        //Get the chain information
        List<Chain> pdbChains = pdbstruc.getChains();
        for (Chain chain : pdbChains){
            System.out.println(chain.getHeader());
        }


        System.out.println("----------------------------------------------------");
        System.out.println("\nChain information for mmCIF Structure " + pdbid + ":");

        //Get the chain information
        List<Chain> cifChains = cifstruc.getChains();
        for (Chain chain : cifChains){
            System.out.println(chain.getHeader());
        }

        System.out.println("----------------------------------------------------");


Are there additional classes I need to use so that the chain information
in mmCIF files is parsed into the BioJava data structure in the same way
as it is for PDB files?

Regards,
Phele


