[Bioperl-l] SQ Line
James Gilbert
jgrg@sanger.ac.uk
Mon, 31 Jul 2000 17:34:46 +0100 (BST)
Lorenz,
The following bit of code generates SwissProt-style
CRC32 checksums. I think it is a bit-reversal of a
conventional CRC32 (just to be different, I guess).
I don't know about the molecular weight question
though.
James
{
    my( @crcTable );

    sub generateCRCTable {
        # 10001000001010010010001110000100
        # 32
        my $poly = 0xEDB88320;

        foreach my $i (0..255) {
            my $crc = $i;
            for (my $j=8; $j > 0; $j--) {
                if ($crc & 1) {
                    $crc = ($crc >> 1) ^ $poly;
                }
                else {
                    $crc >>= 1;
                }
            }
            $crcTable[$i] = $crc;
        }
    }

    sub crc32 {
        my( $str ) = @_;

        die "Argument to crc32() must be ref to scalar"
            unless ref($str) eq 'SCALAR';

        generateCRCTable() unless @crcTable;

        my $len = length($$str);

        my $crc = 0xFFFFFFFF;
        for (my $i = 0; $i < $len; $i++) {
            # Get upper case value of each letter
            my $int = ord uc substr $$str, $i, 1;
            $crc = (($crc >> 8) & 0x00FFFFFF) ^ $crcTable[ ($crc ^ $int) & 0xFF ];
        }
        #return sprintf "%X", $crc; # SwissProt format
        return $crc;
    }
}
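
As a rough cross-check, the sketch below repeats the same update rule bit by bit, without the lookup table, and compares it against the widely published CRC-32 check value for the string "123456789" (0xCBF43926). The helper name crc32_no_final_xor is hypothetical and not part of the code above. Assuming I have read the routine correctly, it starts from 0xFFFFFFFF and uses the reflected polynomial 0xEDB88320 like the common CRC-32, but skips the final XOR, so its result should be the bitwise complement of the usual value.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): bitwise form of the
# same update rule as crc32() above, with no lookup table.
sub crc32_no_final_xor {
    my ($str) = @_;
    my $crc = 0xFFFFFFFF;
    foreach my $byte (unpack('C*', uc($str))) {    # upper-case first, as above
        $crc ^= $byte;
        for (1 .. 8) {
            $crc = ($crc & 1) ? (($crc >> 1) ^ 0xEDB88320) : ($crc >> 1);
        }
    }
    return $crc;
}

my $raw      = crc32_no_final_xor('123456789');
my $standard = $raw ^ 0xFFFFFFFF;    # apply the usual final XOR

printf "%X\n", $raw;         # prints 340BC6D9
printf "%X\n", $standard;    # prints CBF43926
```

Because the input is upper-cased before hashing, 'acgt' and 'ACGT' give the same checksum.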
On Mon, 31 Jul 2000, L.Pollak wrote:
> (Ewan: Sorry, I wanted to send this to the list as well.)
>
> I have two questions about the SQ line in SwissProt:
>
> Does anyone know how the CRC used there is calculated?
> Do I really have to add a CRC to the SQ line?
>
> Does anyone know why the molecular weight in the "SQ" line
> of the "roa1.swiss" sample file is so different from what I get
> using Bio::Tools::SeqStats?
> (The file says 38715; SeqStats gives 45333.)
>
> Kind regards,
> Lorenz
James G.R. Gilbert
The Sanger Centre
Wellcome Trust Genome Campus
Hinxton
Cambridge Tel: 01223 494906
CB10 1SA Fax: 01223 494919