What is the correct way to replace all occurrences of a substring in JavaScript?

38

What is the correct way to replace all occurrences of a substring in JavaScript?

This is how I do it these days:

var i = 0;
while ((i = str.indexOf("_", i)) != -1) {
    str = str.replace("_", " ");
}

Or even:

str = str.split("_").join(" ");

Neither seems to be the most appropriate way.

4 answers

57
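
The approach the comments below refer to is a replace with a regular expression carrying the global flag. A minimal sketch of that idea, assuming the underscore-to-space replacement from the question (the sample string is made up for illustration):

```javascript
// Hypothetical sample input; the question only names the variable `str`.
var str = 'one_two_three';

// The `g` (global) flag makes replace() act on every match, not just the first.
var result = str.replace(/_/g, ' ');

console.log(result); // "one two three"
```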


  • I tested all three approaches and found that this one really is the fastest. split is very fast too, but regex is better.

  • @Danielt.Sobrosa Yes, regex is a good tool to have on your belt. =) Btw, thanks for the edit. I had to apply it manually because the revision interface glitched. I don't know if it was something local; if it shows up again, I'll report it.

  • 4

    Just be careful if your string contains characters with special meaning in regular expressions (., +, *, [ etc.). In this solution, such characters must be "escaped" with a backslash (\). See http://stackoverflow.com/questions/3561493/is-there-a-regexp-escape-function-in-javascript

  • @rodrigorgs Yes, in these cases it is good to test the expression in Regexpal, so I've added the link. =)

  • Okay, thank you very much.

  • It's not standard, but it's worth remembering that some engines accept regular expression flags as a third parameter of the replace method. For example, str = str.replace('_', ' ', 'g'); works like the example in the answer, but browsers that don't support this parameter will not show the expected behavior (there is a shim to fix this).


15

Here is a way:

String.prototype.replaceAll = String.prototype.replaceAll || function(needle, replacement) {
    return this.split(needle).join(replacement);
};

Just include it before any script that uses replaceAll. Use it as follows:

var novaString = 'foo bar foo'.replaceAll('foo', 'baz');
console.log(novaString); // "baz bar baz"

While putting the function in the prototype is very convenient, there are a few reasons not to do so: if a library, script or new ECMAScript specification defines another String.prototype.replaceAll with a different signature or behavior, there would be a conflict. We can convert this into a simple function to be more future-proof:

function replaceAll(str, needle, replacement) {
    return str.split(needle).join(replacement);
}
console.log( replaceAll('foo bar foo', 'foo', 'baz') ); //"baz bar baz"
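
One property of the split/join version worth illustrating is that both arguments are taken literally, so a needle containing regex metacharacters needs no escaping. A short sketch, with the helper repeated so the snippet runs on its own (the sample strings are made up):

```javascript
// Same split/join helper as above, repeated so the snippet is self-contained.
function replaceAll(str, needle, replacement) {
    return str.split(needle).join(replacement);
}

// "." and "+" are regex metacharacters, but split() takes a string needle literally.
console.log(replaceAll('a.a', '.', ','));      // "a,a"
console.log(replaceAll('1+1', '+', ' plus ')); // "1 plus 1"

// The replacement is literal too: "$&" has no special meaning for join().
console.log(replaceAll('a_b', '_', '$&'));     // "a$&b"
```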

What’s wrong with using regex?

Nothing, really. However, I believe it is much easier to work with strings. If you want to create a regex from an arbitrary string, you have to escape all metacharacters. Another method, using regex:

String.prototype.replaceAll || (function() {
    var regMetaChars = /[-\\^$*+?.()|[\]{}]/g;
    String.prototype.replaceAll = function(needle, replacement) {
        return this.replace(new RegExp(needle.replace(regMetaChars, '\\$&'), 'g'), replacement);
    };
}());

But isn't it easier to write a literal regex?

This will depend on your use case. If you don't know much about regex, you might run into syntax errors or unexpected results if you don't escape the metacharacters properly. For example:

'a.a'.replace(/./g, ','); // What is the result?

A user with no regex experience would expect "a,a", but since the dot is a metacharacter that matches any character (except line breaks), the result of the above expression is ",,,". A correct regex would be /\./g, which replaces only the . character.

Even if you know every metacharacter that needs to be escaped, another important point is when the text to be replaced (the needle) is a variable whose content may be unknown. It is then necessary to escape all possible metacharacters through one more replace before passing the needle to the RegExp constructor (since it is not possible to place a variable inside the literal syntax of RegExp objects).

It is therefore easier to use a function when your use case requires something more complex.
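
To make the variable-needle case concrete, here is a minimal sketch that escapes the needle before handing it to the RegExp constructor. The helper name escapeRegExp and the sample values are assumptions for illustration; the character class is the one used in the regex-based replaceAll above.

```javascript
// Hypothetical helper: prefixes every regex metacharacter with a backslash
// so that the resulting pattern matches the text literally.
function escapeRegExp(text) {
    return text.replace(/[-\\^$*+?.()|[\]{}]/g, '\\$&');
}

// The needle comes from a variable, so a regex literal cannot be used.
var needle = '.';

// Escaped: only the literal dot is replaced.
var escaped = 'a.a'.replace(new RegExp(escapeRegExp(needle), 'g'), ',');
console.log(escaped); // "a,a"

// Unescaped: the dot matches every character.
var unescaped = 'a.a'.replace(new RegExp(needle, 'g'), ',');
console.log(unescaped); // ",,,"
```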


Wouldn't it be easier just to loop while the needle is found?

There is a problem with the OP's code. Let's put it in a function and analyze it:

function replaceAll(str, needle, replacement) {
    var i = 0;
    while ((i = str.indexOf(needle, i)) != -1) {
        str = str.replace(needle, replacement);
    }
    return str;
}

indexOf resumes from i, but replace always scans the string from its beginning, so the two can act on different occurrences. This generates bugs in certain cases:

replaceAll('bbaaXba', 'ba', ''); // What is the result?

We would expect the method to find the two needles (highlighted in square brackets) b[ba]aX[ba], replace each with an empty string and return baX. However, the first replacement leaves baXba: indexOf, resuming from i = 1, finds the needle at index 3, while replace, scanning from the beginning, removes the newly formed ba at index 0, and the return is Xba. In addition, i is never advanced past the inserted text, so a replacement that contains the needle, as in replaceAll('a_b', '_', '__'), loops forever.

To fix this, the replacement has to happen at the index found by indexOf(). Since replace() does not have a fromIndex parameter, substring/substr/slice are needed to simulate it, which makes the code a little more complex, but also functional:

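String.prototype.replaceAll = String.prototype.replaceAll || function(needle, replacement) {
    var str = String(this), // work on a primitive copy of the receiver
        i = 0,
        l = needle.length;
    // ~x is 0 (falsy) only when x is -1, i.e. when indexOf finds nothing
    while (~(i = str.indexOf(needle, i))) {
        // splice the replacement in at the index indexOf found
        str = str.substr(0, i) + replacement + str.substr(i + l);
        // skip past the inserted text so it is never rescanned
        i += replacement.length;
    }
    return str;
};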
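
A standalone sketch of this index-tracking idea, which inserts the replacement at the index found by indexOf() and then moves the index past it. The function name replaceAllFromIndex is made up here, it uses slice instead of substr, and it assumes a non-empty needle:

```javascript
// Hypothetical standalone variant; does not touch String.prototype.
function replaceAllFromIndex(str, needle, replacement) {
    var i = 0;
    // Assumes a non-empty needle; an empty one would match at every index.
    while ((i = str.indexOf(needle, i)) !== -1) {
        // Rebuild the string around the match found at index i.
        str = str.slice(0, i) + replacement + str.slice(i + needle.length);
        // Continue scanning after the inserted text, never inside it.
        i += replacement.length;
    }
    return str;
}

// A needle formed only by the removal itself is not removed again.
console.log(replaceAllFromIndex('bbaa', 'ba', ''));  // "ba"

// A replacement that contains the needle does not cause an endless loop.
console.log(replaceAllFromIndex('a_b', '_', '__'));  // "a__b"
```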
  • 1

    What’s this "p(r)olyfill" story? And welcome to the site!

  • 2

    @bfavaretto Well, you know that a polyfill is code that implements standardized functionality the browser has not implemented natively, whereas a prolyfill is similar, except that the functionality has not been standardized yet. :) A prolyfill serves more to promote standardization, but really, I think I pushed it a little here. I'll rewrite that part later. And thank you for the welcome. =]

  • 1

    Hmmm... ignorance about regex doesn't have much to do with "use case". For those who don't know regex, it is worth learning: it will be useful for the rest of your life, and gives much more power in complex cases. On escaping metacharacters, an interesting tip is that most characters function as literals within [] (the exceptions include ^, -, ] and \).

  • I agree! I can't imagine myself without regex nowadays (especially since I also work on the back end). I believe that many front-end-focused developers don't see much use for regex, although it has its uses even on the front end. The use case I had in mind was using an arbitrary string (a variable with unknown content) as the needle: you can't write a literal regex with it and you have to escape the metacharacters, which is much more complex than the case where the needle is fixed and a simple literal regex is enough.

  • @Fabríciomatté Got it! To replace a dynamic pattern it really makes more sense to use a traditional string replacement.

3

The most correct (and recommended) way is the one in @Lias's answer, via regex.

As a curiosity, the Firefox engine (SpiderMonkey) implements an extra parameter in the replace method, called flags, which accepts the same flags normally used with regex.

Example:

>>> "Ah é natal... Feliz Natal!".replace("natal", "ano novo", "gi")  
"Ah é ano novo... Feliz ano novo!"

I do not recommend using this method, since it is not standard. It probably won't work in Internet Explorer or Chrome (V8).
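
For comparison, the portable way to get the same effect is to put the flags on a regex literal. A minimal sketch using the same strings as the example above (the variable names are made up):

```javascript
// Same input as the Firefox-only example, with the flags moved onto the regex.
var greeting = 'Ah é natal... Feliz Natal!';

// g = replace every match, i = ignore case (so "Natal" matches too).
var updated = greeting.replace(/natal/gi, 'ano novo');

console.log(updated); // "Ah é ano novo... Feliz ano novo!"
```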

0

A simple way would be:

var boia1 = 'Posição+Projeto+-+Boia+8';
while (boia1.includes('+')) {
    boia1 = boia1.replace('+', ' ');
}

// Result: 'Posição Projeto - Boia 8'
  • 3

    Didn’t you confuse the sites? This is the [en.so]

  • Comment changed to English.
