[Biojava-l] how to cancel download chemcomp when parser a PDB file
Andreas Prlic
andreas at sdsc.edu
Tue Dec 21 23:55:39 UTC 2010
Hi Fico,
- you are right, this was a bug (some index was off). I committed a
patch for this to SVN.
- I also added new behaviour for downloading chem comp files: The
default chem comp provider will fetch the components.cif.gz file and
extract all definitions into small files, which will be used from then
on.
- I am not sure about your last question. I believe that is already
possible: you can use the getChemComp method to get the exact
definition for a group.
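
To make the second point more concrete, here is a minimal sketch of the extraction step, assuming only that each definition in components.cif starts with a "data_<ID>" line. The class name ChemCompSplitter and its split method are hypothetical helpers written for illustration; this is not the code that was committed to BioJava.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper for illustration only; not part of BioJava. */
class ChemCompSplitter {

    /**
     * Splits a gzipped multi-block CIF file into one small file per
     * "data_XXX" block and returns the component IDs in file order.
     */
    static List<String> split(Path gzippedCif, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        List<String> ids = new ArrayList<>();
        BufferedWriter out = null;
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzippedCif)),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("data_")) {
                    // A new component definition starts: close the previous file.
                    if (out != null) {
                        out.close();
                    }
                    String id = line.substring("data_".length()).trim();
                    ids.add(id);
                    out = Files.newBufferedWriter(
                            outDir.resolve(id + ".cif"), StandardCharsets.UTF_8);
                }
                // Lines before the first "data_" header are skipped.
                if (out != null) {
                    out.write(line);
                    out.newLine();
                }
            }
        } finally {
            if (out != null) {
                out.close();
            }
        }
        return ids;
    }
}
```

Calling ChemCompSplitter.split(Paths.get("components.cif.gz"), Paths.get("chemcomp")) would leave files such as chemcomp/GDP.cif behind, so later lookups can read one small local file instead of contacting the server. The file and directory names here are only examples.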
Andreas
On Sun, Dec 19, 2010 at 9:31 PM, Fico <wuuter at gmail.com> wrote:
> The ChemComp download problem is solved now, but I found a new problem when
> testing BioJava3-Beta4. My program fragment:
>
> FileParsingParameters params = new FileParsingParameters();
> params.setLoadChemCompInfo(false);
> params.setHeaderOnly(false);
> // params.setParseCAOnly(true);
> params.setAlignSeqRes(true);
> params.setParseSecStruc(false);
>
> // loop over files
> for (String file : getPdbFiles()) {
>
>     PDBFileReader pdbreader = new PDBFileReader();
>     pdbreader.setAutoFetch(false);
>     pdbreader.setPath(getPdbDir());
>
>     pdbreader.setFileParsingParameters(params);
>
>     // pdbreader.setLoadChemCompInfo(true);
>     Structure struc = null;
>     try {
>         struc = pdbreader.getStructure(getPdbDir() + "\\" + file);
>     } catch (IOException e) {
>         e.printStackTrace();
>         continue; // skip unreadable files; struc would still be null
>     }
>
>     String pdbid = struc.getPDBCode();
>
>     for (int i = 0; i < struc.nrModels(); i++) {
>
>         // loop over chains
>         for (Chain ch : struc.getModel(i)) {
>             System.out.println(pdbid + ">>>" + ch.getChainID() + ">>>"
>                     + ch.getAtomSequence());
>             System.out.println(pdbid + ">>>" + ch.getChainID() + ">>>"
>                     + ch.getSeqResSequence());
>             // Test the getAtomGroups() and getSeqResGroups() methods
>             // List<Group> group = ch.getAtomGroups();
>             List<Group> group = ch.getSeqResGroups();
>             for (Group gp : group) {
>                 System.out.println(gp.getResidueNumber() + ":"
>                         + gp.getPDBName());
>             }
>         }
>     }
> }
>
> My test PDB file is 1O1G.pdb, which has 45 modified residues in chain A.
> When I use .getAtomGroups() I get the atom information for all residues, such
> as ResidueNumber and PDBName:
> 797:PHE
> 798:LEU
> 799:MET
> 800:ARG
> 801:VAL
> 802:GLU
> ......
> 840:PRO
> 841:LEU
> 842:LEU
> 843:LYS
>
> But with .getSeqResGroups(), the last 45 residues are missing some information,
> such as ResidueNumber and atom coordinates. The output of the program is:
> 797:PHE
> 798:LEU
> null:MET
> null:ARG
> null:VAL
> null:GLU
> ......
> null:PRO
> null:LEU
> null:LEU
> null:LYS
>
> In biojava3-Beta1 the two methods produce the same result, matching what
> .getAtomGroups() returns in Beta4. So is this a bug?
>
> P.S.
> Could we add a new method that returns the complete amino acid sequence,
> including modified residues, directly? Currently neither getAtomSequence() nor
> getSeqResSequence() can do this. If I want the amino acid sequence with modified
> residues, I have to call .getAtomGroups() or .getSeqResGroups() first and then
> loop over each residue to build the one-letter sequence.
>
>
>
>
>
> 2010/12/17 Andreas Prlic <andreas at sdsc.edu>
>>
>> ok that behavior is fixed in SVN now. Now you can have setAlignSeqRes
>> set to true and it will not download chemical components if
>> loadChemComp is false. The drawback is that the data representation
>> will not be as precise.
>>
>> Andreas
>>
>>
>>
>> On Thu, Dec 16, 2010 at 8:26 AM, Steve Darnell <darnells at dnastar.com>
>> wrote:
>> > The SeqRes to Atom record alignment forces the use of chemical
>> > components to translate non-standard residues to their closest standard
>> > counterpart for the sequence alignment. I have to disable
>> > setLoadChemCompInfo and setAlignSeqRes when I don't want to download
>> > chemical component files from RCSB when parsing a PDB file.
>> >
>> > Regards,
>> > Steve
>> >
>> > -----Original Message-----
>> > From: biojava-l-bounces at lists.open-bio.org
>> > [mailto:biojava-l-bounces at lists.open-bio.org] On Behalf Of Fico
>> > Sent: Wednesday, December 15, 2010 8:46 PM
>> > To: Biojava-l at lists.open-bio.org
>> > Subject: [Biojava-l] how to cancel download chemcomp when parser a PDB
>> > file
>> >
>> > Hi, dear all:
>> >
>> > I have recently been using biojava3 beta1 to parse PDB files. My program is:
>> >
>> > PDBFileReader pdbreader = new PDBFileReader();
>> > pdbreader.setAutoFetch(false);
>> > pdbreader.setPath(pdbDirPath);
>> >
>> > FileParsingParameters params = new FileParsingParameters();
>> > params.setLoadChemCompInfo(false);
>> > params.setHeaderOnly(false);
>> > params.setAlignSeqRes(true);
>> > params.setParseSecStruc(false);
>> > pdbreader.setFileParsingParameters(params);
>> >
>> > Structure structure = null;
>> > try {
>> >     structure = pdbreader.getStructure(pdbDirPath + "\\" + file);
>> > } catch (IOException e) {
>> >     e.printStackTrace();
>> > }
>> >
>> > When I execute this program, it downloads files, for example:
>> >
>> > creating directory D:\MyWorkspace\TestFiles\pdbFiles\chemcomp
>> > downloading http://www.rcsb.org/pdb/files/ligand/35G.cif
>> > downloading http://www.rcsb.org/pdb/files/ligand/GDP.cif
>> >
>> > But I do not want to download these files. How can I disable this?
>> > Thanks.
>> > _______________________________________________
>> > Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>> >
>>
>>
>>
>> --
>> -----------------------------------------------------------------------
>> Dr. Andreas Prlic
>> Senior Scientist, RCSB PDB Protein Data Bank
>> University of California, San Diego
>> (+1) 858.246.0526
>> -----------------------------------------------------------------------
>
>