How to capture textarea tag with new line?




How can I get the contents of a <textarea> with a regex, including line breaks?

I have the following expression to get a textarea:


Online example.

The problem is that if the textarea contains a line break, the expression cannot capture it.

Hence my question: how do I capture a textarea that contains line breaks?

  • Handling HTML with regex is difficult, and sometimes regex is the wrong tool. What environment/language are you working in?

  • @Sergio I’m using C#, but my question is only about the expression itself. I don’t use regex to manipulate HTML; I use the HTML Agility Pack. We already talked about it in chat, I just wanted to bring the question to the site too.

  • Randrade, out of curiosity, was there a bug in my answer?

  • @Sergio, I replied in chat, but I’ll answer here too: sorry, I meant to explain the reason in chat and forgot. Your answer is perfect; there is nothing wrong with it. I changed the accepted answer only because Guilhermelautert’s answer has a more didactic explanation, which I thought would help people who search for this. I’m just waiting for the time limit to offer a bounty.

  • Okay, fine. His answer is longer because the regex is more complex :) I didn’t know you wanted to separate tags and content. Good content ends up here in the question and answers, nice.

4 answers


The problem with this regex is that, by default, . does not match \n, so you have to work around that limitation, for example with a negated character class [^...], which matches any character that is not listed in the class.

For your case you can use: <(textarea)([^>]*)>([^%]*?)</\1>.

See working in REGEX101


  • <(textarea) - matches a literal < and creates a capture group containing the literal textarea, which is reused later as a backreference.
  • ([^>]*)> - matches all of the tag’s attributes. Attributes do not contain >, so I used a negated class to take everything up to the > that ends the opening tag.
  • ([^%]*?) - this is the content to be captured. I negated % because I assume it will not appear in the content; if it does, just switch to another character, for example ¬. Because the class is negated, it matches any character that is not listed in it, including \n.
  • </\1> - finally, matches the closing tag, reusing the tag name through the backreference to group 1, \1.
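
The breakdown above can be checked with a small sketch. It is written in JavaScript, assuming a Node.js or browser console; the sample HTML string and the variable names are made up for illustration, and the / in the closing tag is escaped because JavaScript regex literals use / as the delimiter.

```javascript
// Hypothetical sample input with a line break inside the textarea.
const html = '<textarea name="msg" rows="3">first line\nsecond line</textarea>';

// Without the s flag, `.` does not match `\n`, so this pattern finds nothing.
const dotOnly = /<(textarea)([^>]*)>(.*?)<\/\1>/;

// The negated class [^%] matches any character except `%`, including `\n`.
const negated = /<(textarea)([^>]*)>([^%]*?)<\/\1>/;

const dotResult = dotOnly.exec(html);
const negatedResult = negated.exec(html);

console.log(dotResult);                        // null
console.log(negatedResult[1]);                 // textarea
console.log(JSON.stringify(negatedResult[2])); // " name=\"msg\" rows=\"3\""
console.log(JSON.stringify(negatedResult[3])); // "first line\nsecond line"
```

Group 2 keeps the leading space of the attribute list, so it may need trimming before use.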


You can also use the s flag, which allows . (dot) to match \n, changing the regex to <(textarea)([^>]*)>(.*?)</\1>.

Remember that the s flag must be applied for this version to work.

Example JS

string.match(/<(textarea)([^>]*)>(.*?)<\/\1>/gs); // here the `/` had to be escaped as `<\/\1>` so it is not interpreted as the end of the regex.
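
A slightly fuller sketch of the same idea, assuming a JavaScript engine that supports the s (dotAll) flag and String.prototype.matchAll; the sample input and the names html, pattern, and contents are illustrative only. It collects the attributes and the content of each textarea separately.

```javascript
// Hypothetical input: two textareas, the first one spanning two lines.
const html = [
  '<textarea id="a">one',
  'two</textarea>',
  '<p>ignored</p>',
  '<textarea id="b">three</textarea>',
].join('\n');

// g: find every match; s: let `.` match line breaks as well.
const pattern = /<(textarea)([^>]*)>(.*?)<\/\1>/gs;

// matchAll yields one match object per textarea, with its capture groups.
const contents = [...html.matchAll(pattern)].map((m) => ({
  attributes: m[2].trim(), // group 2: everything between the tag name and `>`
  content: m[3],           // group 3: the text between the tags
}));

console.log(contents);
```

With plain match and the g flag, only the full matches are returned and the groups are lost, which is why matchAll is used in this sketch.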

See working in REGEX101

  • Can you post an example of the regex working here?

  • I didn’t understand the downvote. Guilherme sent me the test link and it worked perfectly (it’s just that Guilherme used ~ instead of / as the delimiter).

  • Nice, @Guilhermelautert. I was just looking, but this way it doesn’t match everything from <textarea up to the closing /textarea>, is that right?

  • @Sergio, I don’t know if I understood correctly, but it captures everything until the tag closes; it just also splits the match into groups, so the content is separated from the attributes.

  • Man, you deserve +100 for the explanation of the s flag. That’s a big help; I’ll put a bounty on it later :)


You can do it like this:



The important part is [\s\S]+?, which matches any character at all (whitespace or non-whitespace, so line breaks too) one or more times; the ? makes the quantifier lazy, so the match ends at the first opportunity it finds.
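
To illustrate the difference the ? makes, here is a minimal JavaScript sketch. The patterns around [\s\S] are an assumption made for the demonstration, not necessarily the expression from this answer, and the sample input is hypothetical.

```javascript
// Hypothetical input with two textareas; the first contains a line break.
const html = '<textarea>a\nb</textarea><p>x</p><textarea>c</textarea>';

// Lazy: [\s\S]+? stops at the first closing tag it can reach.
const lazy = /<textarea[^>]*>([\s\S]+?)<\/textarea>/g;

// Greedy: [\s\S]+ runs on to the last closing tag in the input.
const greedy = /<textarea[^>]*>([\s\S]+)<\/textarea>/g;

const lazyMatches = [...html.matchAll(lazy)].map((m) => m[1]);
const greedyMatches = [...html.matchAll(greedy)].map((m) => m[1]);

console.log(lazyMatches.length);   // 2
console.log(greedyMatches.length); // 1
console.log(JSON.stringify(lazyMatches[0])); // "a\nb"
```

Since [\s\S] needs no flag, this form should carry over to engines that lack a dot-matches-newline option.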

  • I was trying here: the problem is that it was including everything from the opening of the first to the closing of the last. What part of your expression is stopping this from happening?

  • @Miguel, adding a ? does it.

  • Ah, nice... Thanks for the clarification and the alternative.

  • That +? is what solves my problem with [\s\S]; now I think it will be easier to port my regex to different engines, such as JS, Java, C#, and preg. +1


The other answers are correct and excellent; I only made a few modifications:

  • I changed @Sergio’s example to:



    This is to avoid matching things like <textarea>abc<textarea> (note that the slash is missing, yet @Sergio’s original regex still matched it).

  • If you need to match the attributes and the content separately, do this:


  • I changed @Guilhermelautert’s example to:



    The answer works perfectly, but if you use / as the regex delimiter it will not work as written because of the unescaped / in </\1>. Of course this varies between languages; it only matters in that specific situation.

Note: I had also written an example of my own, though with more limited knowledge; here is the regex anyway:



However, the other answers show better and simpler approaches; this is just an alternative to study.

  • I was looking at your change to my regex: in cases of invalid HTML (which I think is what you want to handle), it matches across two textareas. Is that the idea?

  • @Sergio yes that’s it


Hello, I managed to do it this way:




  • <textarea\b[^>]*> - matches the opening tag. The word boundary \b ensures the tag name is exactly textarea, and [^>]* matches all characters except >, which prevents the tag from containing two >.
  • ((\n*|.)*) - capture group for the tag content. It matches either any run of line breaks (\n*) or (|) any other character (.), repeated.
  • <\/textarea> - ends the match at the closing tag.
  • It worked perfectly in all the tests I ran, +1; it was just missing an explanation of the regex.
