[Biojava-l] PDBFileParser question using PDBID 470D
Steve Darnell
darnells at dnastar.com
Fri Dec 3 23:35:21 UTC 2010
Andreas,
I've been using biojava to gather sequence data from structure files for an internal project. My intent was to test the limits of my work (hence files similar to 470D), and in doing so I came across this behavior in biojava.
It is not critical to obtain this particular mapping since it can be derived from the atom records. However, I didn't understand why the SEQRES list would be empty and was looking for clarification. Is it because the chain is RNA and the empty list prevents the unsupported alignment of RNA records?
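For reference, the SEQRES residue names can also be read straight from the file text, independent of the parser. The following is only a minimal sketch (Java 9 or later), not biojava code: the class name SeqresSketch, its two methods, and the ONE_LETTER table are hypothetical. The table is inferred from the one-letter ATOM sequence reported for 470D rather than taken from the chemical component dictionary, and the parsing assumes whitespace-separated SEQRES fields with a non-blank chain ID.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SeqresSketch {

    // Hypothetical lookup, inferred from the ATOM sequence reported for 470D.
    // A real tool would take parent residues from the chemical component dictionary.
    static final Map<String, Character> ONE_LETTER = Map.of(
            "C43", 'C', "G48", 'G', "A44", 'A', "U36", 'U');

    // Collects residue names per chain ID from whitespace-separated SEQRES lines.
    // Assumed field order: SEQRES, serial number, chain ID, residue count, residue names.
    public static Map<String, List<String>> parseSeqres(List<String> lines) {
        Map<String, List<String>> byChain = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 5 || !fields[0].equals("SEQRES")) {
                continue; // not a SEQRES record, or no residue names on the line
            }
            List<String> residues =
                    byChain.computeIfAbsent(fields[2], k -> new ArrayList<>());
            for (int i = 4; i < fields.length; i++) {
                residues.add(fields[i]);
            }
        }
        return byChain;
    }

    // Translates residue names to one-letter codes; unknown names become 'X'.
    public static String toOneLetter(List<String> residues) {
        StringBuilder sb = new StringBuilder();
        for (String name : residues) {
            sb.append(ONE_LETTER.getOrDefault(name, 'X'));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "SEQRES 1 A 12 C43 G48 C43 G48 A44 A44 U36 U36 C43 G48 C43 G48",
                "SEQRES 1 B 12 C43 G48 C43 G48 A44 A44 U36 U36 C43 G48 C43 G48");
        Map<String, List<String>> byChain = parseSeqres(lines);
        System.out.println(toOneLetter(byChain.get("A")));
    }
}
```

Run on the two SEQRES lines from 470D, this gives chain A twelve residue names, and the translated string can be compared directly against the one-letter ATOM sequence.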
Regards,
Steve
-----Original Message-----
From: andreas.prlic at gmail.com [mailto:andreas.prlic at gmail.com] On Behalf Of Andreas Prlic
Sent: Monday, November 29, 2010 6:36 PM
To: Steve Darnell
Cc: biojava-l at lists.open-bio.org
Subject: Re: [Biojava-l] PDBFileParser question using PDBID 470D
Hi Steve,
as you say, this is an "exotic" sequence, in the sense
that it is RNA. Alignment of the SEQRES records for RNA
is not currently supported. Can you explain a bit more what
you are doing and why you need this mapping in this case?
Thanks,
Andreas
On Mon, Nov 29, 2010 at 12:51 PM, Steve Darnell <darnells at dnastar.com> wrote:
> Greetings,
>
> After parsing PDBID 470D with biojava-3.0-alpha5, Chain A returns an
> empty SEQRES sequence (Chain.getSeqResSequence) and empty SEQRES group
> list (Chain.getSeqResGroups) but the one-letter ATOM sequence is
> properly translated and the ATOM group list contains the appropriate
> number of groups (LoadChemCompInfo set to true).
>
> This is an exotic sequence, but my expectation is that the SEQRES group
> list would have members in it (and one-letter sequence translated if
> LoadChemCompInfo is true). Am I mistaken, and is the current behavior
> the intended result?
>
> Best regards,
> Steve Darnell
>
> --
> SEQRES records exist in 470D:
>
> SEQRES 1 A 12 C43 G48 C43 G48 A44 A44 U36 U36 C43 G48 C43 G48
>
> SEQRES 1 B 12 C43 G48 C43 G48 A44 A44 U36 U36 C43 G48 C43 G48
>
>
>
> Sample println output (ln 1 record type, ln 2 get${TYPE}Sequence, ln 3
> get${TYPE}Groups):
>
> SEQRES
> ''
> []
>
> ATOM
> 'CGCGAAUUCGCG'
> [PDB: C43 1 trueatoms: 21, PDB: G48 2 trueatoms: 27, PDB: C43 3
> trueatoms: 24, ...]
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>
--
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------