[Biojava-dev] Why is the Residue Number a String?

Fri Mar 27 16:11:06 UTC 2009

Hi Paul,

The reason is a simple one: every amino acid in a PDB file is
identified uniquely by three things: the chain ID, the residue number
and the insertion code. To make sure one does not forget about the
insertion code, BioJava appends it to the residue number (for
example, 52A), so the combined value has to be a String.
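
To make the three-part key concrete, here is a minimal sketch, not
BioJava code. PdbResidueId and its methods are hypothetical names; the
column positions are the ones the PDB format assigns to the chain ID
(22), the residue sequence number (23-26) and the insertion code (27).

```java
// Hypothetical helper, not part of BioJava: splits the residue key
// out of a fixed-column PDB ATOM/HETATM line.
final class PdbResidueId {
    final char chainId;        // column 22
    final int seqNum;          // columns 23-26, may be negative
    final char insertionCode;  // column 27, ' ' when absent

    PdbResidueId(char chainId, int seqNum, char insertionCode) {
        this.chainId = chainId;
        this.seqNum = seqNum;
        this.insertionCode = insertionCode;
    }

    // Assumes the line is at least 27 characters long.
    static PdbResidueId fromAtomLine(String line) {
        char chain = line.charAt(21);
        int seq = Integer.parseInt(line.substring(22, 26).trim());
        char iCode = line.charAt(26);
        return new PdbResidueId(chain, seq, iCode);
    }

    // Residue number with the insertion code appended, as a String.
    String numberWithInsertionCode() {
        return insertionCode == ' '
                ? Integer.toString(seqNum)
                : seqNum + String.valueOf(insertionCode);
    }

    public static void main(String[] args) {
        // Truncated ATOM record: only the first 27 columns are needed here.
        String line = "ATOM    145  CA  SER A  52A";
        PdbResidueId id = fromAtomLine(line);
        System.out.println(id.chainId + " " + id.numberWithInsertionCode());
    }
}
```

Parsing only columns 23-26 as an int would make residues 52 and 52A
indistinguishable, which is the mistake the String form guards against.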

To add to this, residue numbers can be negative, non-consecutive
(with gaps) and even out of order. As such, it is often easiest to
treat them as public identifiers and, within your own code, to work
with the internal atom or group positions.
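
One way to follow that advice, sketched under assumptions: keep a
lookup from the public label to the internal position and do all
arithmetic on the position. ResidueIndex and the "chain:number" label
format are made up for this illustration and are not BioJava API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical lookup from public residue labels to 0-based positions.
final class ResidueIndex {
    private final Map<String, Integer> positionByLabel = new HashMap<>();

    // Labels are given in the order the residues appear in the file.
    ResidueIndex(List<String> labelsInFileOrder) {
        for (int i = 0; i < labelsInFileOrder.size(); i++) {
            positionByLabel.put(labelsInFileOrder.get(i), i);
        }
    }

    // Returns the internal position for a public label such as "A:52A".
    int positionOf(String label) {
        Integer pos = positionByLabel.get(label);
        if (pos == null) {
            throw new NoSuchElementException("Unknown residue: " + label);
        }
        return pos;
    }

    public static void main(String[] args) {
        // Negative number, an insertion code and a gap, all in one chain.
        ResidueIndex index = new ResidueIndex(
                List.of("A:-1", "A:1", "A:52", "A:52A", "A:53", "A:60"));
        // Neighbours are found by position, never by subtracting labels.
        int separation = index.positionOf("A:60") - index.positionOf("A:52");
        System.out.println(separation);
    }
}
```

Here 60 - 52 would suggest eight residues apart, while the positions
show the two residues are only three steps apart in the chain.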

Andreas

On Fri, Mar 27, 2009 at 8:52 AM,  <tallpaulinjax at yahoo.com> wrote:
>
> Hi,
>
> I was wondering why BioJava treats the ResidueNumber as a String while the PDB format specifies it as an Integer throughout the 3.2 specification? Perhaps it was a string in previous specs? The pertinent code in PDBFileParser is:
> String residueNumber = line.substring(22,27).trim();
> which is then later retrieved by a user via getPDBCode and setPDBCode within a Group, both as strings. BTW, it would seem a more obvious name would be residueNumber, with getter and setter getResidueNumber and setResidueNumber... I always mistake getPDBCode for a method returning the 4-character PDB code string!! :-).
>
> Thanks!
>
> Paul