HttpUtility.HtmlDecode(): converting HTML to plain text (at least), can anyone help?

I'm retrieving a record whose content is the HTML below:

 
 
<p style="text-align: justify;"><span style="font-family: times new     roman,times;"><span style="font-size: medium;"><strong>EDECPJE N&ordm;    </strong> <strong>0800141-19.2014.4.05.0000 - AGTR</strong></span></span></p>
(...)

But I need to save this as plain text (at the very least; ideally I would convert the formatting and save everything).

I happily tried HttpUtility.HtmlDecode(), but to my disappointment it only dealt with the &nbsp; and a few other insignificant entities, leaving the tags in place.

Any idea how I can do this properly?

Thank you all in advance.

  • Your question is not very clear, which makes it hard to get an answer that solves your problem. Please add the expected result for the HTML snippet you posted. Are you aware that &nbsp; represents a non-breaking space?

1 answer

OK, friends, thank you for your efforts. I found a satisfactory solution, shown below:

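// Regex.Replace returns a new string, so keep the result (text is the HTML input).
string plainText = System.Text.RegularExpressions.Regex.Replace(text, "<(.|\n)*?>", string.Empty);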

This call does not convert any formatting; it just removes all tags (HTML/XML) quickly and simply. If I find a more elaborate solution (one that removes the tags but keeps the formatting), I will post it here.
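For anyone who also wants the entities turned into real characters (so that N&ordm; becomes Nº), here is a minimal sketch that combines the tag-stripping regex with entity decoding. The class name HtmlToText and the method name Convert are made up for illustration, and I am assuming System.Net.WebUtility.HtmlDecode is acceptable in place of HttpUtility.HtmlDecode. It is still a regex and not an HTML parser, so comments, scripts, or attribute values containing ">" may not come out right.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any library.
public static class HtmlToText
{
    public static string Convert(string html)
    {
        // 1. Strip every tag with the same pattern used above.
        string noTags = Regex.Replace(html, "<(.|\n)*?>", string.Empty);

        // 2. Decode entities such as &ordm; and &nbsp; into real characters.
        string decoded = WebUtility.HtmlDecode(noTags);

        // 3. Turn non-breaking spaces into ordinary spaces, then collapse
        //    runs of whitespace left behind by the removed tags.
        string normalized = Regex.Replace(decoded.Replace('\u00A0', ' '), @"\s+", " ");

        return normalized.Trim();
    }
}
```

Calling HtmlToText.Convert on the snippet from the question should give a single line, "EDECPJE Nº 0800141-19.2014.4.05.0000 - AGTR". Tags are removed before decoding so that an escaped "&lt;" in the text is not mistaken for the start of a tag.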

Thank you again.
