Regex with lookbehind does not work in Firefox

I created an Angular project and am using the following regular expression:

export const INTERFACE_REGEX = new RegExp(/(?<=.*\/)(.*?)(?=@|-.+)/gi);

In Google Chrome the project works normally, but when I try to open it in Firefox I get this error:

Error: syntax invalid regex group

I ran several tests and found that if I remove the ?<= part, the code compiles in Firefox.

What would be an equivalent of the expression (?<=.*\/)?

1 answer

Note: when this question was asked, Firefox did not support lookbehind (hence the error). It now does, so the error no longer occurs.


First of all, one detail: you don’t have to build the regex this way. When you use slashes (the literal notation for regular expressions), you are already creating a RegExp, so passing it to the constructor is redundant. That is, the 4 forms below are equivalent:

regex = /expression/gi;
regex = new RegExp('expression', 'gi');
regex = new RegExp(/expression/gi);
regex = new RegExp(/expression/, 'gi');

But I would only use the first 2: the first if the expression is fixed, and the second if you have a string that represents the expression (provided appropriate care is taken, such as escaping backslashes). The last 2 are redundant; the last one may be useful when the flags are dynamic, but remember that it is only valid from ECMAScript 6 onward.
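To make the equivalence concrete, here is a small sketch that builds the same regex in all four ways and compares the resulting source and flags. The pattern a+b is an arbitrary placeholder, not something from the question, and the last line illustrates one kind of care the string form needs (doubling backslashes).

```javascript
// Four ways to build the same regex; 'a+b' is just a placeholder pattern.
const forms = [
  /a+b/gi,
  new RegExp('a+b', 'gi'),
  new RegExp(/a+b/gi),
  new RegExp(/a+b/, 'gi'), // literal plus flags argument: ES2015 or later
];

// Every form ends up with the same pattern source and the same flags.
const summary = forms.map(r => `${r.source} ${r.flags}`);
console.log(summary); // [ 'a+b gi', 'a+b gi', 'a+b gi', 'a+b gi' ]

// In the string form, a backslash must be doubled to reach the regex engine.
const sameSource = new RegExp('\\d+').source === /\d+/.source;
console.log(sameSource); // true
```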

That said, let’s get to the problem itself:


The error occurs because (?<=.*\/) is a lookbehind, and at the time this answer was written Firefox did not support it. It does now, so you no longer need the solution below unless you have to support a browser that still lacks lookbehind; in any case, the alternative stays documented here.
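If you need to decide at run time whether the workaround is necessary, one possible approach is a feature test. This is only a sketch: supportsLookbehind is a hypothetical helper name, not part of the original answer or of any library.

```javascript
// Hypothetical helper: returns true when the engine can compile a lookbehind.
function supportsLookbehind() {
  try {
    // Built from a string so that an engine without lookbehind throws here,
    // at run time, instead of failing to parse the whole script.
    new RegExp('(?<=a)b');
    return true;
  } catch (e) {
    return false;
  }
}

const hasLookbehind = supportsLookbehind();
console.log(hasLookbehind);
```

Note that a regex literal containing a lookbehind would still be a syntax error in an engine without support, so the lookbehind version also has to be created through the constructor when both code paths live in the same file.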

Anyway, there is a way to simulate this in any environment that does not support lookbehind. The idea of a lookbehind is to check whether something exists before the current match. So just break the regex in two: the part that comes before, and the rest. When I find a match, I check whether what comes before it satisfies the lookbehind. Something like this:

let r_match = /(.?)(?=@|-.+)/gi;
let lookbehind = /.*\/$/; // simulates the lookbehind
let match;
let results = [];
while (match = r_match.exec('./a@ ./x-fd')) { // testing with an arbitrary string
  // an empty match does not advance lastIndex, so move it forward manually
  if (match.index === r_match.lastIndex) r_match.lastIndex++;
  // get the substring from zero up to the index where the match occurs
  let leftContext = match.input.substring(0, match.index);
  if (lookbehind.test(leftContext)) { // simulate the lookbehind
    results.push(match[1]);
  }
}
console.log(results); // [ 'a', 'x' ]

So the idea is to first check whether there is a match. Then I take the substring from the beginning of the string up to the point where the match was found, and check whether it ends with the text the lookbehind requires. For that I added the anchor $, which means the end of the string. In this specific case the regex could simply be /\/$/ (ends with /), because .* means "zero or more characters" and here it makes no difference: "ends with zero or more characters followed by /" is the same thing as "ends with /".
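As a quick check of that last claim, the sketch below runs both versions of the simulated lookbehind over a few left contexts. The sample strings are made up for illustration; endsWith is included only as a third point of comparison.

```javascript
// Sample left contexts (illustrative values, not from the question).
const contexts = ['./', './a', './a@ ./', '', 'no slash here'];

const verbose = /.*\/$/; // "zero or more characters, then / at the end"
const short = /\/$/;     // "/ at the end"

const comparison = contexts.map(c => ({
  context: c,
  verbose: verbose.test(c),
  short: short.test(c),
  endsWith: c.endsWith('/'),
}));
console.log(comparison);

// True when all three checks agree for every sample context.
const allAgree = comparison.every(
  r => r.verbose === r.short && r.short === r.endsWith
);
console.log(allAgree); // true
```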

If you want, you can push the entire match object instead (it contains more information, such as the index at which the match occurred). Here I chose to take only the text captured by (.?).
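For illustration, this variant of the loop keeps the whole match object and then reads the captured text and the index from it. It is a sketch using the same test string; the names found and details are mine.

```javascript
const r_match = /(.?)(?=@|-.+)/g;
const lookbehind = /\/$/; // simulated lookbehind: left context ends with /
const found = [];
let match;
while ((match = r_match.exec('./a@ ./x-fd')) !== null) {
  // an empty match does not advance lastIndex, so move it forward manually
  if (match.index === r_match.lastIndex) r_match.lastIndex++;
  if (lookbehind.test(match.input.substring(0, match.index))) {
    found.push(match); // keep the whole match object, not just group 1
  }
}

const details = found.map(m => ({ text: m[1], index: m.index }));
console.log(details); // [ { text: 'a', index: 2 }, { text: 'x', index: 7 } ]
```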

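The check that compares match.index with r_match.lastIndex works around a problem with zero-width matches: an empty match does not advance lastIndex, so without the increment exec would keep returning the same match forever (explained in detail here).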
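The zero-width case is easier to see in isolation. The sketch below uses a deliberately simple regex (x*, which can match the empty string) and an arbitrary input; neither comes from the question.

```javascript
const re = /x*/g; // can match the empty string
const input = 'axb';
const seen = [];
let m;
while ((m = re.exec(input)) !== null) {
  // After an empty match, lastIndex still equals m.index. Without this
  // increment the next exec call would return the same empty match again
  // and the loop would never end.
  if (m.index === re.lastIndex) re.lastIndex++;
  seen.push([m.index, m[0]]);
}
console.log(seen); // [ [ 0, '' ], [ 1, 'x' ], [ 2, '' ], [ 3, '' ] ]
```

Methods such as matchAll advance past empty matches on their own, so the manual increment is only needed when you drive exec in a loop yourself.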


Another alternative is to avoid lookbehind altogether and read only the capture group that holds the information you want:

let regex = /([^\/]*\/)(.?)(?=@|-.+)/gi;
let s = './a@ ./x-fd';
console.log([...s.matchAll(regex)].map(m => m[2])); // [ 'a', 'x' ]

Since (.?) is now the second pair of parentheses in the expression, it is group 2, so I used m[2] (but you could drop the map if you want an array with the full match objects).

And I changed the dot to [^\/] (any character other than /), because otherwise the regex may end up consuming more characters than it should (including the slash itself) and give incorrect results (such as returning only the x, for example).
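To see the difference, this sketch runs both versions against the same test string. The only change between them is the first group: the greedy .* runs up to the last slash, so the first item is swallowed.

```javascript
const s = './a@ ./x-fd';

// Greedy dot: the first group extends to the LAST slash in the string.
const withDot = [...s.matchAll(/(.*\/)(.?)(?=@|-.+)/g)].map(m => m[2]);

// Negated class: the first group stops at the first slash it reaches.
const withNegatedClass = [...s.matchAll(/([^\/]*\/)(.?)(?=@|-.+)/g)].map(m => m[2]);

console.log(withDot);          // [ 'x' ]
console.log(withNegatedClass); // [ 'a', 'x' ]
```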

In this case, though, you don’t need the first pair of parentheses (so the information I want will be in group 1):

let regex = /[^\/]*\/(.?)(?=@|-.+)/gi;
let s = './a@ ./x-fd';
console.log([...s.matchAll(regex)].map(m => m[1])); // [ 'a', 'x' ]

Finally, the i flag makes the regex case insensitive (it does not distinguish between upper and lower case letters). But since your regex contains no letters, this flag is unnecessary: you can keep only the g, and in your case it will make no difference.
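A small sketch of that point: with a test string that mixes upper and lower case (my own variation of the earlier one), the results are the same with and without i. The g flag, on the other hand, has to stay, because matchAll throws a TypeError when the regex is not global.

```javascript
const s = './A@ ./x-FD'; // mixed-case variation of the earlier test string

const withI = [...s.matchAll(/[^\/]*\/(.?)(?=@|-.+)/gi)].map(m => m[1]);
const withoutI = [...s.matchAll(/[^\/]*\/(.?)(?=@|-.+)/g)].map(m => m[1]);
console.log(withI);    // [ 'A', 'x' ]
console.log(withoutI); // [ 'A', 'x' ]

// Without g, matchAll refuses to run.
let errorName = null;
try {
  s.matchAll(/[^\/]*\/(.?)(?=@|-.+)/);
} catch (e) {
  errorName = e.name;
}
console.log(errorName); // TypeError
```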


See more details on "Negative lookbehind only works in Chrome, there is an alternative to other browsers?".
