[Bioperl-l] string comparison mismatches and matches

Bernd Web bernd.web at gmail.com
Thu Feb 11 13:59:53 UTC 2010


Hi Mark,

Nice indeed.
Just one question: why is pack used? Is it faster? The ^ operator works
on strings too:

$mask = $in ^ $tgt;
$matches = $mask =~ tr/\x0/\x0/;

(btw, I had to remove the "" around \x0 in the tr)
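
A minimal sketch of the pack-free version, checked against a plain substr
loop. The sub names count_matches_xor and count_matches_loop are made up
for illustration. It assumes both inputs are plain byte strings of equal
length and that "use feature 'bitwise'" is not in effect (under that
feature ^ is numeric and string XOR is spelled ^. instead).

```perl
use strict;
use warnings;

# Hypothetical helper: count matching positions with string XOR.
sub count_matches_xor {
    my ($in, $tgt) = @_;
    my $mask = $in ^ $tgt;        # "\0" at every position where the bytes agree
    return ($mask =~ tr/\0//);    # empty replacement list: count only, no change
}

# Reference implementation: compare one character at a time.
sub count_matches_loop {
    my ($in, $tgt) = @_;
    my $n = 0;
    for my $i (0 .. length($in) - 1) {
        $n++ if substr($in, $i, 1) eq substr($tgt, $i, 1);
    }
    return $n;
}

my $in  = 'ACCTCCTCCTCGAGTATGTG';
my $tgt = 'TATCTTGCGCCGGAGATAAT';

printf "xor: %d, loop: %d\n",
    count_matches_xor($in, $tgt), count_matches_loop($in, $tgt);

# For these inputs pack("A*", ...) hands back the same string,
# so the XOR mask is identical with or without it.
print "pack leaves the string unchanged\n" if pack("A*", $in) eq $in;
```

Both counts should agree for these two sequences. Whether the pack-free
form is actually faster is something I have not measured.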

Regards.

On Thu, Feb 11, 2010 at 2:43 PM, Mark A. Jensen <maj at fortinbras.us> wrote:
> Perfectly described, Torsten. Yes, I confess a certain pride in this
> hack....
> Roopa reports that it sped up her script 3X. cheers MAJ
> ----- Original Message ----- From: "Torsten Seemann"
> <torsten.seemann at infotech.monash.edu.au>
> To: "Mark A. Jensen" <maj at fortinbras.us>
> Cc: "Roopa Raghuveer" <rtbio.2009 at gmail.com>; <bioperl-l at lists.open-bio.org>
> Sent: Thursday, February 11, 2010 6:52 AM
> Subject: Re: [Bioperl-l] string comparison mismatches and matches
>
>
>>> $in = 'ACCTCCTCCTCGAGTATGTG';
>>> $tgt = 'TATCTTGCGCCGGAGATAAT';
>>> $mask = pack("A*",$in)^pack("A*",$tgt);
>>> $matches = $mask =~ tr/"\x0"/"\x0"/;
>>
>> Impressive! Not often you see pack() let alone exclusive-or with a
>> scalar context tr// thrown in for good measure!
>>
>> For those who don't follow what it is doing, here is my (possibly
>> wrong) interpretation: The pack() is converting each of the two (equal
>> length) strings into a byte set. A bit-wise exclusive-or (XOR) is
>> performed between these two byte sets. This will create bytes of value
>> zero (0) where they were the same, and non-zero where they were
>> different. The tr// then counts how many of the bytes were zero (\x0
>> is the NUL byte, value 0, not the character '0').
>>
>> I'll just assume it is more efficient than for/substr/eq :-)
>>
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
>> University, AUSTRALIA
>>
>>
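
To make the description above concrete, here is a small sketch that prints
the XOR mask as hex for two short strings. The inputs 'ACGT' and 'ACCA' are
made up for illustration; unpack("H*", ...) is only used to make the
otherwise unprintable mask bytes visible.

```perl
use strict;
use warnings;

# Byte-wise string XOR: equal bytes give 0x00, different bytes give non-zero.
my $mask = 'ACGT' ^ 'ACCA';

# Each byte of the mask shown as two hex digits.
print unpack("H*", $mask), "\n";        # prints 00000415

# Count the NUL bytes, i.e. the positions where the strings agree.
my $zeros = ($mask =~ tr/\0//);
print "$zeros matching positions\n";    # prints 2 matching positions
```

The first two positions agree (00 00); G vs C gives 0x04 and T vs A gives
0x15, so only two bytes of the mask are zero.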
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


