[Biojava-l] how to cancel download chemcomp when parser a PDB file

Fico wuuter at gmail.com
Mon Dec 20 05:31:24 UTC 2010


The ChemComp download problem is now solved, but I found a new problem while
testing BioJava3-Beta4. My program fragment:

        FileParsingParameters params = new FileParsingParameters();
        params.setLoadChemCompInfo(false);
        params.setHeaderOnly(false);
        // params.setParseCAOnly(true);
        params.setAlignSeqRes(true);
        params.setParseSecStruc(false);

        // loop file
        for (String file : getPdbFiles()) {

            PDBFileReader pdbreader = new PDBFileReader();
            pdbreader.setAutoFetch(false);
            pdbreader.setPath(getPdbDir());

            pdbreader.setFileParsingParameters(params);

            // pdbreader.setLoadChemCompInfo(true);
            Structure struc;
            try {
                struc = pdbreader.getStructure(getPdbDir() + "\\" + file);
            } catch (IOException e) {
                e.printStackTrace();
                // skip this file; otherwise struc would be null below
                continue;
            }

            String pdbid = struc.getPDBCode();

            for (int i = 0; i < struc.nrModels(); i++) {

                // loop chain
                for (Chain ch : struc.getModel(i)) {
                    System.out.println(pdbid + ">>>" + ch.getChainID()
                            + ">>>" + ch.getAtomSequence());
                    System.out.println(pdbid + ">>>" + ch.getChainID()
                            + ">>>" + ch.getSeqResSequence());
                    // Test the getAtomGroups() and getSeqResGroups() method
                    // List<Group> group = ch.getAtomGroups();
                    List<Group> group = ch.getSeqResGroups();
                    for (Group gp : group) {
                        System.out.println(gp.getResidueNumber() + ":"
                                + gp.getPDBName());
                    }
                }
            }
        }

My test PDB file is 1O1G.pdb, which has 45 modified residues in chain A.
When I use getAtomGroups(), I get the atom information for every residue,
such as ResidueNumber and PDBName:
797:PHE
798:LEU
799:MET
800:ARG
801:VAL
802:GLU
......
840:PRO
841:LEU
842:LEU
843:LYS

But when I use getSeqResGroups(), the last 45 residues are missing some
information, such as the ResidueNumber and the atom coordinates. The output
of the program is:
797:PHE
798:LEU
null:MET
null:ARG
null:VAL
null:GLU
......
null:PRO
null:LEU
null:LEU
null:LYS

In biojava3-Beta1 the two methods produce the same result, identical to the
getAtomGroups() output in Beta4. So is this a bug?
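
Until the cause is confirmed, the affected entries can at least be detected
with a null check before the residue number is used. The following is only a
sketch under stated assumptions: Residue is a hypothetical stand-in type, not
a BioJava class, and its residueNumber field plays the role of the value that
getResidueNumber() returned as null in the output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SeqResCheck {

    // Hypothetical stand-in for a parsed residue; NOT a BioJava class.
    static final class Residue {
        final String residueNumber; // null when no residue number was assigned
        final String name;          // three-letter residue name

        Residue(String residueNumber, String name) {
            this.residueNumber = residueNumber;
            this.name = name;
        }
    }

    // Returns the 0-based positions of residues that have no residue number.
    static List<Integer> unnumbered(List<Residue> seqRes) {
        List<Integer> missing = new ArrayList<>();
        for (int i = 0; i < seqRes.size(); i++) {
            if (seqRes.get(i).residueNumber == null) {
                missing.add(i);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<Residue> seqRes = Arrays.asList(
                new Residue("797", "PHE"),
                new Residue("798", "LEU"),
                new Residue(null, "MET"),
                new Residue(null, "ARG"));
        System.out.println(unnumbered(seqRes)); // prints [2, 3]
    }
}
```

With real data the same check would be applied to the list returned by
getSeqResGroups(), so that a report can name exactly which positions lost
their numbering.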

P.S.
    Could we add a new method that returns the full amino acid sequence,
including modified residues, directly? At the moment neither
getAtomSequence() nor getSeqResSequence() can do this. If I want the amino
acid sequence with modified residues, I have to call getAtomGroups() or
getSeqResGroups() first and then loop over each residue to build the
one-letter amino acid sequence.
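
The loop described in this P.S. might look roughly like the sketch below. It
is not BioJava code: it takes plain three-letter residue names (the kind of
value getPDBName() printed above) and the MODIFIED table is a small
hand-written assumption that maps a few common modified residues to their
parent amino acid. A real implementation would more likely take that mapping
from the chemical component dictionary.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OneLetterSequence {

    // Standard three-letter to one-letter amino acid codes.
    private static final Map<String, Character> STANDARD = new HashMap<>();

    // Assumption: illustrative table of modified residues and the one-letter
    // code of their parent residue. It is deliberately incomplete.
    private static final Map<String, Character> MODIFIED = new HashMap<>();

    static {
        String[] names = {"ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU",
                          "GLY", "HIS", "ILE", "LEU", "LYS", "MET", "PHE",
                          "PRO", "SER", "THR", "TRP", "TYR", "VAL"};
        String codes = "ARNDCQEGHILKMFPSTWYV";
        for (int i = 0; i < names.length; i++) {
            STANDARD.put(names[i], codes.charAt(i));
        }
        MODIFIED.put("MSE", 'M'); // selenomethionine
        MODIFIED.put("SEP", 'S'); // phosphoserine
        MODIFIED.put("TPO", 'T'); // phosphothreonine
        MODIFIED.put("PTR", 'Y'); // phosphotyrosine
    }

    // Builds a one-letter sequence; residues found in neither table become 'X'.
    static String toOneLetter(List<String> residueNames) {
        StringBuilder sb = new StringBuilder();
        for (String name : residueNames) {
            String key = name.trim().toUpperCase();
            Character code = STANDARD.get(key);
            if (code == null) {
                code = MODIFIED.get(key);
            }
            sb.append(code != null ? code : 'X');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("PHE", "LEU", "MSE", "ARG", "SEP", "UNK");
        System.out.println(toOneLetter(names)); // prints FLMRSX
    }
}
```

Falling back to 'X' rather than skipping unknown residues keeps the sequence
length equal to the number of groups, so positions still line up with the
group list.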





2010/12/17 Andreas Prlic <andreas at sdsc.edu>

> ok that behavior is fixed in SVN now. Now you can have setAlignSeqRes
> set to true and it will not download chemical components if
> loadChemComp is false. The drawback is that the data representation
> will not be as precise.
>
> Andreas
>
>
>
> On Thu, Dec 16, 2010 at 8:26 AM, Steve Darnell <darnells at dnastar.com>
> wrote:
> > The SeqRes to Atom record alignment forces the use of chemical
> > components to translate non-standard residues to their closest standard
> > counterpart for the sequence alignment.  I have to disable
> > setLoadChemCompInfo and setAlignSeqRes when I don't want to download
> > chemical component files from RCSB when parsing a PDB file.
> >
> > Regards,
> > Steve
> >
> > -----Original Message-----
> > From: biojava-l-bounces at lists.open-bio.org
> > [mailto:biojava-l-bounces at lists.open-bio.org] On Behalf Of Fico
> > Sent: Wednesday, December 15, 2010 8:46 PM
> > To: Biojava-l at lists.open-bio.org
> > Subject: [Biojava-l] how to cancel download chemcomp when parser a PDB
> > file
> >
> > Hi all,
> >
> > I have recently been using biojava3 beta1 to parse PDB files. My program
> > is:
> >
> >            PDBFileReader pdbreader = new PDBFileReader();
> >            pdbreader.setAutoFetch(false);
> >            pdbreader.setPath(pdbDirPath);
> >
> >            FileParsingParameters params = new FileParsingParameters();
> >            params.setLoadChemCompInfo(false);
> >            params.setHeaderOnly(false);
> >            params.setAlignSeqRes(true);
> >            params.setParseSecStruc(false);
> >            pdbreader.setFileParsingParameters(params);
> >
> >            Structure structure = null;
> >            try {
> >                structure = pdbreader.getStructure(
> >                        pdbDirPath + "\\" + file);
> >            } catch (IOException e) {
> >                e.printStackTrace();
> >            }
> >
> > When I run this program, it downloads files such as:
> >
> > creating directory D:\MyWorkspace\TestFiles\pdbFiles\chemcomp
> > downloading http://www.rcsb.org/pdb/files/ligand/35G.cif
> > downloading http://www.rcsb.org/pdb/files/ligand/GDP.cif
> >
> > But I do not want to download these files. How can I disable this?
> > Thanks.
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-l
> >
> >
>
>
>
> --
> -----------------------------------------------------------------------
> Dr. Andreas Prlic
> Senior Scientist, RCSB PDB Protein Data Bank
> University of California, San Diego
> (+1) 858.246.0526
> -----------------------------------------------------------------------
>


