Your logic is almost right. I say almost because it is missing one small detail of interpretation.
With a regex, keep in mind that a match can start and end anywhere in the text, unless you explicitly define how it should behave.
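As a minimal sketch of that point: the question does not name a language, so Python's re module is an assumption here. The sample string is the one from the question. An unanchored pattern is free to match in the middle of the text, while an anchored one is not:

```python
import re

text = "<td>Preço<br/>Unit.</td>"

# Unanchored: the engine may start the match at any position in the text.
unanchored = re.search(r"Unit\.", text)

# Anchored: ^ and $ force the match to span the whole text, so it fails here.
anchored = re.search(r"^Unit\.$", text)

print(unanchored.span())  # (14, 19)
print(anchored)           # None
```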
Analyzing what happens
Legend
^
Beginning of the text to be interpreted
$
End of text to be interpreted
Analysis 1
<td>Preço<br/>Unit.</td>
^
$
Note that in this attempt the interpreted text is only <, so the regex does not match.
Analysis 2
<td>Preço<br/>Unit.</td>
^      $
Note that in this attempt the interpreted text is <td>Preç, so the regex does not match.
Analysis 3
<td>Preço<br/>Unit.</td>
         ^        $
Note that in this attempt the interpreted text is <br/>Unit.
If the regex is the first one,
( ?<br\/?> ?)(Unit.)
it matches perfectly and finds the result. With the second one,
(?! ?<br\/?> ?)(Unit.)
the negative lookahead blocks the match.
Analysis 4
<td>Preço<br/>Unit.</td>
              ^   $
Note that in this attempt the interpreted text is Unit.
If the regex is the first one,
( ?<br\/?> ?)(Unit.)
no result is found, because the leading
 ?<br\/?> ?
is missing. With the second one,
(?! ?<br\/?> ?)(Unit.)
it matches perfectly, because the negative lookahead only requires that
 ?<br\/?> ?
not appear at the position where the match starts. Here the text at that position is Unit., so the assertion holds and the match is returned as a valid result.
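The analyses above can be reproduced with a short sketch. Python's re module is my assumption here, since the question does not name a language. It runs both patterns against the same string and shows where each match starts:

```python
import re

text = "<td>Preço<br/>Unit.</td>"

first = re.search(r"( ?<br\/?> ?)(Unit.)", text)
second = re.search(r"(?! ?<br\/?> ?)(Unit.)", text)

# First pattern: the match starts at <br/> and includes it.
print(first.group(0), first.start())    # <br/>Unit. 9

# Second pattern: the start at <br/> is rejected by the lookahead,
# but the engine moves on and matches where Unit. itself begins.
print(second.group(0), second.start())  # Unit. 14
```

This is why the second pattern still returns Unit.: the lookahead only rules out one starting position, not the whole text.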
Possible solution
Use the m flag so that ^ and $ match at the start and end of each line, making each line (delimited by \n) a new text to be interpreted. You can change the regex to:
/^(?!.* ?<br\/?> ?Unit\..*)(.*Unit\..*)$/gm
See on REGEX101
Explanation
^...$
- The sentence to be analyzed runs from the beginning to the end of the line.
(?!.* ?<br\/?> ?Unit\..*)
- If the line matches .* ?<br\/?> ?Unit\..* it must not be captured.
(.*Unit\..*)
- The content to be captured.
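A hedged sketch of how the anchored pattern behaves line by line. Python's re module is an assumption: re.MULTILINE plays the role of the m flag, and findall stands in for g. Only the first sample line comes from the question; the other two are made up for illustration.

```python
import re

# The first line comes from the question; the other two are made-up samples.
text = "\n".join([
    "<td>Preço<br/>Unit.</td>",
    "<td>Preço Unit.</td>",
    "<td>Unit.</td>",
])

pattern = re.compile(r"^(?!.* ?<br\/?> ?Unit\..*)(.*Unit\..*)$", re.MULTILINE)

# The line containing <br/>Unit. is rejected as a whole;
# the other two lines are captured in full.
matches = pattern.findall(text)
for line in matches:
    print(line)
```

Because the pattern is anchored at ^, the engine can no longer skip ahead to a later starting position inside the same line, so the lookahead decides for the entire line.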
Addendum
- The best way to think about a negative lookahead (in my view) is to imagine the exact sentence that it should match.
- You used Unit. in your pattern. If you want to match a literal . you must escape it (Unit\.), otherwise the pattern will also accept UnitG, Unit#, or Unit followed by any other character.
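The effect of the unescaped dot can be checked with a small sketch. Python's re module is assumed, and the candidate strings are illustrative:

```python
import re

candidates = ["Unit.", "UnitG", "Unit#", "Unit "]

# Unescaped dot: matches any single character after "Unit".
loose = [s for s in candidates if re.fullmatch(r"Unit.", s)]

# Escaped dot: matches only a literal "." after "Unit".
strict = [s for s in candidates if re.fullmatch(r"Unit\.", s)]

print(loose)   # ['Unit.', 'UnitG', 'Unit#', 'Unit ']
print(strict)  # ['Unit.']
```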
I got it, Guilherme. I really had the wrong idea of how the lookahead works.
– Marcelo de Andrade
@Marcelodeandrade, I was worried that the explanation had not come out well, but I’m glad you understood :D
– Guilherme Lautert