[Bioperl-l] deleting string fragments?

Andreas Kahari ak at ebi.ac.uk
Fri Apr 2 04:08:56 EST 2004


On Fri, Apr 02, 2004 at 10:33:27AM +0200, Jurgen Pletinckx wrote:
> If you know the location of your substring, rather than the 
> contents, the following bit is useful and fast:
> 
> substr($string,$location, $length) = '';
> 
> Yes, you are assigning an empty string to a substring.
> Yes, this looks evil. It's an established use of substr, 
> though.

Less evil?

    substr($string, $offset, $length, '');
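
For comparison, here is a small sketch of both forms side by side. The sample string and the offsets are made up for illustration. The four-argument form also returns the text it removed, which is sometimes handy:

```perl
use strict;
use warnings;

my $lvalue  = 'ACGTNNNNACGT';   # made-up sample data
my $fourarg = 'ACGTNNNNACGT';

# Lvalue form: assign the empty string to the slice.
substr($lvalue, 4, 4) = '';

# Four-argument form: the replacement is the last argument,
# and the removed text is returned.
my $removed = substr($fourarg, 4, 4, '');

print "$lvalue $fourarg $removed\n";   # prints "ACGTACGT ACGTACGT NNNN"
```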

> But if you need to lookup the location first (say, with
> index($string, $substring)), you had better go ahead
> and use the regular expression approach.
> 
> In that case, you don't need to first check whether the
> substring exists, and then edit it out: 
> 
> $string =~ s/$substring//g;
> 
> will happily remove any and all occurrences of the
> substring, and will do nothing if the substring is not
> present.
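
One caveat with the substitution quoted above: $substring is interpolated as a pattern, so any regular expression metacharacters in it are live. Wrapping it in \Q...\E makes the match literal. A minimal sketch, with a made-up fragment chosen to show the difference:

```perl
use strict;
use warnings;

my $substring = 'A.C';          # made-up fragment; '.' is a regex metacharacter
my $unquoted  = 'AGC-A.C-ATC';
my $quoted    = 'AGC-A.C-ATC';

$unquoted =~ s/$substring//g;       # '.' matches any character
$quoted   =~ s/\Q$substring\E//g;   # matches the literal text only

print "$unquoted\n";   # prints "--"
print "$quoted\n";     # prints "AGC--ATC"
```

For plain sequence fragments with no metacharacters the two behave the same.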

The following loop is 10% to 40% faster than the regular expression
approach (on the string that the original poster gave as an
example, with varying substrings):

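    my $offset = 0;
    # index() returns -1 when nothing is found (assuming $[ is left
    # at its default of 0, which it should be).
    while (($offset = index($string, $substring, $offset)) != -1) {
        substr($string, $offset, length $substring, '');
    }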

If speed is important, then I would vote for this approach.
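
To check the timing on your own data, something along these lines should do. It is only a sketch: the helper names and the test string are made up, and the numbers will vary with the Perl version, the string length and how often the substring occurs. It uses the core Benchmark module, and it guards against an empty substring, which would otherwise make the index() loop spin forever.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical helper names; the test data below is made up.
sub remove_with_index {
    my ($string, $substring) = @_;
    return $string if $substring eq '';   # an empty needle would loop forever
    my $offset = 0;
    while (($offset = index($string, $substring, $offset)) != -1) {
        substr($string, $offset, length $substring, '');
    }
    return $string;
}

sub remove_with_regex {
    my ($string, $substring) = @_;
    $string =~ s/\Q$substring\E//g;
    return $string;
}

my $string    = 'ACGTTTGACA' x 1000;
my $substring = 'TTG';

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    index_loop => sub { remove_with_index($string, $substring) },
    regex      => sub { remove_with_regex($string, $substring) },
});
```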



Andreas

-- 
|()()| Andreas Kähäri      EMBL, European Bioinformatics Institute
| () |                     Wellcome Trust Genome Campus
|()()| DAS Project Leader  Hinxton, Cambridgeshire, CB10 1SD
| () | Ensembl Developer   United Kingdom

