Regular expression to replace src attribute contents of html <img> tag

3

Hello,

I know almost nothing about regular expressions, so I would like help with an expression that replaces only what is inside the double quotes of the src attribute of an HTML <img> tag, that is, the contents of that attribute. Something like this:

TRegEx.Replace(Str, '(?<=<img\s+[^>]*?src=(?<q>[""]))(?<url>.+?)(?=\k<q>)', 'Nova string');

I took this expression from a question on the same topic for C#, but it does not work in Delphi.

I look forward to any help.

2 answers

1

I think the expression you seek is:

\<img(.|\n|\r)+src="[^"]*$1"(.|\r|\n)+\>

Where:

  • \<img is self-explanatory;
  • (.|\n|\r) means any character. The dot means any character that is not a line break. The other two are line-break characters. The pipe character (|) is the logical operator "or";
  • The plus sign means the preceding item repeated at least once, with no upper limit;
  • [^"]* matches any characters other than a double quote, zero or more times;
  • $1 is a way to identify what you want to find. I'm rusty on Delphi, so if anyone knows the right way and it's not this one, please edit my answer :)

Thanks so much for the help and explanations, but it didn't work :-(

1

From what I have seen, it depends on which regex library you are using.

Assuming you are using PCRE:

REGEX

pattern : (<img.*src=")([^"]*)(".*>)
replace : $1URL CHANGE$3

See in regex101

Note

  • URL CHANGE is your replacement string; it takes the place of group 2, which is the current URL. Groups 1 and 3 already contain the surrounding double quotes.

A regex such as (?<=<img.*src=")([^"]*)(?=".*>) cannot be used, because in PCRE a lookbehind (?<=...) must match a fixed length and so cannot contain a quantifier such as *.
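
In Delphi, the capture-group approach could look roughly like the sketch below. It is not tested against any particular Delphi version and assumes the System.RegularExpressions unit (XE2 or later). The sample HTML, the NewUrl value and the ImgSrcPattern constant are made up for illustration. I also swapped .* for [^>]*? so the match stays inside a single tag, and wrote the groups as ${1} and ${3}, which as far as I know TRegEx accepts; plain $1 and $3 should do the same job as long as the new URL does not start with a digit.

```delphi
program ReplaceImgSrc;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.RegularExpressions;

const
  // Group 1: everything from "<img" up to and including the opening quote of src.
  // Group 2: the current URL, which is dropped. Group 3: the closing quote.
  // [^>]*? (instead of .*) keeps the match from running past the end of the tag.
  ImgSrcPattern = '(<img\b[^>]*?\bsrc=")([^"]*)(")';

var
  Html, NewUrl, Output: string;
begin
  // Hypothetical sample input and replacement URL.
  Html := '<p><img alt="logo" src="old/logo.png" width="32"></p>';
  NewUrl := 'new/logo.png';

  // ${1} and ${3} put back the text around the URL; the braces keep the
  // group number apart from a URL that happens to begin with a digit.
  Output := TRegEx.Replace(Html, ImgSrcPattern,
    '${1}' + NewUrl + '${3}', [roIgnoreCase]);

  Writeln(Output);
end.
```

Note that a $ or \ inside NewUrl would be read as replacement syntax, so those characters would need escaping first. The pattern also only handles src values in double quotes, as in the question.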
