Complementing the other answers, another alternative is:
const url = "https://teste.teste.pt/sites/teste/Normativo/NormasDeProcedimentos/Documents/Histórico/";
const regex = /\/Normativo\/([^\/]+)/;
console.log(regex.exec(url)[1]); // NormasDeProcedimentos
The fragment \/Normativo\/ checks for the word "Normativo" between two slashes.
Then I use [^\/]+:
- [^\/]: a ^ right after the opening bracket means "any character that is not listed inside the brackets". Here the only character listed is the slash (escaped with \ so that it is not confused with the regex delimiters). So this expression means "any character other than /"
- the quantifier + means "one or more occurrences".
This whole part is wrapped in parentheses to form a capture group. Since it is the first pair of parentheses, whatever it matches is captured in group 1.
Then I use the exec method, which returns the match, and take position 1, which corresponds to the first capture group. The result is "NormasDeProcedimentos".
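One detail worth keeping in mind: exec returns null when there is no match, so reading [1] directly would throw a TypeError in that case. A minimal sketch of a guarded version follows; the function name firstSegmentAfterNormativo is a hypothetical helper used only for illustration, not something from the question.

```javascript
// Hypothetical helper (not from the question): returns the path segment
// right after "/Normativo/", or null when the input does not contain one.
function firstSegmentAfterNormativo(url) {
  const match = /\/Normativo\/([^\/]+)/.exec(url);
  // exec returns null when nothing matches, so check before reading group 1
  return match === null ? null : match[1];
}

console.log(firstSegmentAfterNormativo(
  "https://teste.teste.pt/sites/teste/Normativo/NormasDeProcedimentos/Documents/Histórico/"
)); // NormasDeProcedimentos
console.log(firstSegmentAfterNormativo("https://teste.teste.pt/sites/teste/")); // null
```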
Going a little further...
Using .+? instead of [^\/]+ also works. It only starts to make a difference when the URL does not satisfy the expression.
For example, it was unclear whether the URL could be just: https://teste.teste.pt/sites/teste/Normativo/NormasDeProcedimentos
The expression .*\/Normativo\/(.+?)\/.* makes a slash mandatory after NormasDeProcedimentos, so it fails for the URL above. However, .+? means "one or more occurrences of any character", where the ? means "as few characters as possible while still satisfying the expression".
This means the regex will test many possibilities before failing (since . means "any character", there is an enormous number of possibilities to test).
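To make this concrete, here is a small sketch that runs both expressions against the shorter URL. The variable names are mine; the expressions are the ones discussed above.

```javascript
const shortUrl = "https://teste.teste.pt/sites/teste/Normativo/NormasDeProcedimentos";

// Requires a slash after the captured part, so the short URL does not match
const lazy = /.*\/Normativo\/(.+?)\/.*/;
console.log(lazy.exec(shortUrl)); // null

// No slash is required after the capture, so the short URL matches
const negated = /\/Normativo\/([^\/]+)/;
console.log(negated.exec(shortUrl)[1]); // NormasDeProcedimentos
```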
I tested this regex on regex101.com, and if you enter debug mode you will see the regex go back and forth through the string several times, checking several possibilities at several different positions. On that screen you can use the keyboard (right and left arrows to step forward and back) to see what the regex does at each step. When a red arrow pointing to the left appears, it represents backtracking, that is, the regex going back a few positions in the string to try new possibilities.
On that same page, note the left-hand side: it indicates that the regex took more than 4500 steps to conclude that the string does not satisfy the expression. This is due to .+?, and also to the .* at the beginning and end of the expression. Since the dot means "any character" and the quantifiers + and * have no upper limit, the regex tries all the possibilities (with 1, 2, 3... n characters) until it concludes that no match can be found.
On the other hand, let’s see what happens if we use \/Normativo\/([^\/]+)\/ (note that I added a slash at the end, only to make it mandatory so that the regex fails for the URL https://teste.teste.pt/sites/teste/Normativo/NormasDeProcedimentos).
I also tested it on regex101.com, and it needs far fewer steps (about 90) to conclude that there is no match. That’s because I removed the .* from the beginning and end (since I am only interested in what comes after "/Normativo/") and stated explicitly what I want ([^\/]+, that is, anything but the character /).
This difference happens because lazy quantifiers (such as .+?), although very useful in cases like this, have their price. Using the dot is very tempting, but it is not always what you need. The dot means "any character", but you don’t want any character, you want "any character other than /", so it is best to state explicitly what you want and what you don’t want.
Of course, for small programs where the regex runs only a few times, and especially for cases where a match is found, the difference in performance will be irrelevant. But it’s important to keep these details in mind, because there are cases where they can make a difference.
Also, remember that the exact number of steps depends on the engine and on the input strings. But the difference between the expressions remains more or less the same (the version with [^\/] will always be faster than .+?).
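For anyone who wants to check this in their own environment, here is a rough timing sketch. The input string, the repetition counts and the timeExec helper are all assumptions of mine, and the numbers will vary by engine and machine, so treat it only as a starting point for measuring.

```javascript
// Hypothetical input: nothing but "a" characters after "/Normativo/",
// so there is no later slash and both expressions below fail to match.
const input = "https://teste.teste.pt/sites/teste/Normativo/" + "a".repeat(200);

const lazyRegex = /.*\/Normativo\/(.+?)\/.*/;
const negatedRegex = /\/Normativo\/([^\/]+)\//;

// Runs regex.exec(text) `times` times and returns the elapsed milliseconds
function timeExec(regex, text, times) {
  const start = Date.now();
  for (let i = 0; i < times; i++) {
    regex.exec(text);
  }
  return Date.now() - start;
}

console.log("lazy:   ", timeExec(lazyRegex, input, 200), "ms");
console.log("negated:", timeExec(negatedRegex, input, 200), "ms");
```

Date.now has only millisecond resolution, so for finer measurements a dedicated benchmarking tool is a better fit.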
Why not validate the URL?
Since the input is a URL, you could use the URL object and get only the pathname:
let url = 'https://teste.teste.pt/sites/teste/Normativo/NormasDeProcedimentos/Documents/Histórico/';
let path = new URL(url).pathname;
console.log(path); // "/sites/teste/Normativo/NormasDeProcedimentos/Documents/Hist%C3%B3rico/" (non-ASCII characters come back percent-encoded)
// use the regex on the path string (instead of on the whole URL)
const regex = /\/Normativo\/([^\/]+)/;
console.log(regex.exec(path)[1]); // NormasDeProcedimentos
With this, you validate that the input is a valid URL and also get a smaller string for the regex to evaluate, making it run a little faster. Again, for a few executions the difference will be irrelevant, but the validation done by new URL may be worthwhile, because then you do not simply accept any string. It is up to you whether to use it or not.
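As for the validation itself: new URL throws a TypeError when the string cannot be parsed as a URL, so in practice the call needs a try/catch. A minimal sketch, assuming you want null for invalid input (pathnameOrNull is a hypothetical helper name):

```javascript
// Hypothetical helper: returns the pathname of a valid URL, or null otherwise.
function pathnameOrNull(input) {
  try {
    return new URL(input).pathname;
  } catch (e) {
    // new URL throws a TypeError when the input cannot be parsed as a URL
    return null;
  }
}

console.log(pathnameOrNull("https://teste.teste.pt/sites/teste/Normativo/NormasDeProcedimentos/"));
// "/sites/teste/Normativo/NormasDeProcedimentos/"
console.log(pathnameOrNull("not a url")); // null
```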
In the question you said you want to return NormasDeProcedimentos/ (with the slash at the end). For that, you just need to include the slash in the regex (inside the parentheses, so that it is already part of the capture group).
const url = "https://teste.teste.pt/sites/teste/Normativo/NormasDeProcedimentos/Documents/Histórico/";
const regex = /\/Normativo\/([^\/]+\/)/;
console.log(regex.exec(url)[1]); // NormasDeProcedimentos/
Do you need to capture the URL, or is it static?
– Marconi
I have this code to remove
var str = window.location.href;
– KmDroid
By removing, do you mean removing that part from the string, or returning only that part?
– fernandosavio
In this case it would take everything from "Normativo/" onward up to the next "/", which here would be "NormasDeProcedimentos/"
– KmDroid
@fernandosavio Return only "NormasDeProcedimentos/"
– KmDroid