[Biopython-dev] Code to submit: CRC64

Sebastian Bassi sbassi at gmail.com
Thu Jun 21 13:57:49 UTC 2007


On 6/21/07, Peter <biopython-dev at maubp.freeserve.co.uk> wrote:
> Please could you fill an enhancement bug, and attach the code to it -

By attach, do you mean to include it in the "description" field? Or
is there an attach option in the bug report form that I am missing?

> it makes keeping track of requests and patches much easier.
> Could you also give a couple of examples of how you might use this?

1) Check whether the data you have is the same as the data in a public
DB without downloading the whole sequences: download only the CRC
info, calculate the CRC of your local sequences, and compare them.
There is a chance of a random match, but it is very low.
2) You have your own sequences, want to store them in FASTA format,
and want to include the CRC64 in the description so you can retrieve
it later to check for consistency.
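
Both use cases can be sketched roughly as follows. This is not the
submitted code: it assumes the ISO 3309-style CRC64 variant
(polynomial x^64 + x^4 + x^3 + x + 1, initial value 0) commonly
associated with Swiss-Prot checksums, and the names crc64,
same_sequence and fasta_record are made up for illustration.

```python
# Assumption: reflected form of x^64 + x^4 + x^3 + x + 1, initial value 0.
POLY = 0xD800000000000000


def _build_table():
    """Precompute the CRC of every possible byte value."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc64(sequence):
    """Return the checksum of an ASCII sequence as 16 uppercase hex digits."""
    crc = 0
    for byte in sequence.encode("ascii"):
        crc = (crc >> 8) ^ _TABLE[(crc ^ byte) & 0xFF]
    return "%016X" % crc


def same_sequence(local_sequence, published_crc):
    """Use case 1: compare a local sequence against a downloaded checksum."""
    return crc64(local_sequence) == published_crc.upper()


def fasta_record(name, sequence):
    """Use case 2: store the checksum in the FASTA description line."""
    return ">%s CRC64=%s\n%s\n" % (name, crc64(sequence), sequence)
```

A mismatch proves the sequences differ; a match only makes it very
likely that they are the same.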

> In typical usage, does the case of the sequences matter? As it stands

Case matters. AA is checksummed in uppercase and DNA in lowercase. I
will see if I can force this for Seq objects (and leave the case alone
if the input is a plain string).
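
A minimal sketch of that rule, assuming the caller states the molecule
type explicitly. The function name normalize_for_checksum and the
"protein"/"dna" labels are hypothetical; a real version would read the
type from the Seq object instead.

```python
def normalize_for_checksum(sequence, molecule=None):
    """Apply the case convention before checksumming.

    molecule is a hypothetical label: "protein", "dna", or None for a
    plain string whose type is unknown.
    """
    if molecule is None:
        # Plain string: leave the case alone.
        return sequence
    if molecule == "protein":
        # Amino acids are checksummed in uppercase.
        return sequence.upper()
    if molecule == "dna":
        # DNA is checksummed in lowercase.
        return sequence.lower()
    raise ValueError("unknown molecule type: %r" % (molecule,))
```

For example, normalize_for_checksum("AcGt", "dna") gives "acgt", while
normalize_for_checksum("AcGt") returns the string unchanged.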

> Looking at the code, it looks like it would fail when used on
> sequences (Seq objects) where the "letters" are non single characters
> (e.g. sequences using the three letter amino acid codes). This is
> probably not a big problem.

CRC is always calculated in one letter code.

I will correct the other problems.
Best,
SB.


-- 
Bioinformatics news: http://www.bioinformatica.info
Lriser: http://www.linspire.com/lraiser_success.php?serial=318
