Bioperl: relative-majority consensus, fast code sought

Andrew Dalke dalke@bioreason.com
Tue, 30 Mar 1999 09:25:22 -0800


Gustavo Glusman <bmgustav@bioinformatics.weizmann.ac.il> said:
> Shouldn't this be faster?
> $_ = $string;
> foreach $letter (qw/A T C G/) { $count{$letter} = tr/$letter//; }

Matthew Pocock already pointed out that since the string gets
shorter, Perl will have to rebuild the string at every deletion,
which will slow things down quite a bit.

(Though once the reference count of the string becomes 1, the
code can just shorten the string by memcpy'ing everything from pos+1
to the end, then reducing the length by one.)

Still, your code is much more maintainable than the others proposed.
Something like

| foreach $letter (qw/A T C G/) { $count{$letter} = tr/$letter/$letter/; }

(if valid Perl; I didn't test it) would mean you can support other
character sets much more easily than the old way, which listed
every character on a separate line embedded in a bunch of code
and duplicated each letter several times.
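
One caveat on the loop form: as far as I know, tr/// builds its
translation table at compile time and does not interpolate variables,
so tr/$letter/$letter/ would count the literal characters "$", "l",
"e", "t" and "r" rather than the current letter. perlop suggests a
string eval for this case. Below is a minimal, untimed sketch of the
loop under that assumption; count_letters is a hypothetical helper
name, not something from the thread or from Bioperl.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count each letter of an
# alphabet in a string, one tr/// pass per letter.
sub count_letters {
    my ($string, @alphabet) = @_;
    my %count;
    foreach my $letter (@alphabet) {
        # tr/// does not interpolate, so splice the letter in through
        # a string eval. quotemeta guards characters such as "-" or "/"
        # that are special inside a tr list.
        my $safe = quotemeta $letter;
        # With an empty replacement list and no /d, tr only counts;
        # the string itself is left unchanged.
        $count{$letter} = eval "\$string =~ tr/$safe//";
        die $@ if $@;
    }
    return %count;
}

my %count = count_letters('GATTACA', qw/A T C G/);
print "$_: $count{$_}\n" for qw/A T C G/;
```

Supporting another character set is then just a matter of passing a
different list, e.g. count_letters($protein, qw/A C D E F/). The eval
is compiled once per letter, not once per character, so its cost
should be small next to the scan itself, though I have not measured it.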

						Andrew
						dalke@bioreason.com