Convert HTML to readable text using MVC C#

I have text stored in the database like this (example only):

<ul>
    <li><strong>&nbsp;asdsdasdadsdadsa <em>asdsd asdsdasdasdad</em></strong>
    <table border="1" cellpadding="1" cellspacing="1" style="width:500px">
        <tbody>
            <tr>
                <td>asdas</td>
                <td>&nbsp;</td>
            </tr>
            <tr>
                <td>&nbsp;</td>
                <td>&nbsp;</td>
            </tr>
            <tr>
                <td>&nbsp;</td>
                <td>&nbsp;</td>
            </tr>
        </tbody>
    </table>
    </li>
</ul>

The text is stored with its HTML tags. I want to retrieve it and display it on the page in a readable way, so that the tags disappear and only the text is shown. I used this code:

public ActionResult Index(int id)
{
    QuemSomos model = _repositorio.BuscarPorId(id);
    var quemSomosMapper = Mapper.Map<QuemSomos, QuemSomosViewModel>(model);

    ViewBag.conversao = HttpUtility.HtmlDecode(quemSomosMapper.Texto.ToString());

    return View(ViewBag.conversao);
}

But it’s not working. I’d like some help if it’s possible.

  • Alysson, do you want to remove the HTML tags, or have the view render the HTML?

1 answer

Try the options below:

1

String result = Regex.Replace(htmlDocument, @"<[^>]*>", String.Empty);

2

// Requires the HtmlAgilityPack package.
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(Properties.Resources.HtmlContents);
// SelectNodes returns null when nothing matches, e.g. a fragment with no <body>.
var nodes = doc.DocumentNode.SelectNodes("//body//text()");
StringBuilder output = new StringBuilder();
if (nodes != null)
{
    foreach (HtmlNode node in nodes)
    {
        output.AppendLine(node.InnerText);
    }
}
string textOnly = HttpUtility.HtmlDecode(output.ToString());

3

string html;
// obtain some arbitrary html....
using (var client = new WebClient()) {
    html = client.DownloadString("https://stackoverflow.com/questions/2038104");
}
// use the html agility pack: http://www.codeplex.com/htmlagilitypack
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
StringBuilder sb = new StringBuilder();
foreach (HtmlTextNode node in doc.DocumentNode.SelectNodes("//text()")) {
    sb.AppendLine(node.Text);
}
string final = sb.ToString();
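
As a rough sketch of how the first option might be applied to markup like the one in the question: the helper below is hypothetical (`HtmlText` and `StripTags` are names made up for illustration, not part of any library). It assumes you also want entities such as `&nbsp;` decoded and runs of whitespace collapsed. Unlike option 1 as written, it replaces each tag with a space instead of an empty string so that the text of adjacent table cells does not run together.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class HtmlText
{
    public static string StripTags(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }

        // Option 1 from the answer, but inserting a space where each tag was.
        string noTags = Regex.Replace(html, @"<[^>]*>", " ");

        // Turn entities such as &nbsp; into real characters.
        string decoded = WebUtility.HtmlDecode(noTags);

        // Collapse whitespace runs; in .NET, \s also matches the
        // non-breaking space (U+00A0) that &nbsp; decodes to.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

For example, `HtmlText.StripTags("<p>Hello&nbsp;<b>world</b></p>")` returns `Hello world`. In the controller from the question, the result could be assigned to `ViewBag.conversao` in place of the bare `HtmlDecode` call.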

Note: the first option does not handle CDATA sections correctly. If your HTML contains none, you can use it; otherwise, use the second option.
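
To make the CDATA problem concrete, here is a small sketch (the class name `CdataGapDemo` and the sample input are made up for illustration). Because the pattern stops at the first `>`, a `>` inside a CDATA section ends the match early and part of the section leaks into the output.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class for illustration only.
public static class CdataGapDemo
{
    // Option 1 from the answer, unchanged.
    public static string StripWithRegex(string html)
    {
        return Regex.Replace(html, @"<[^>]*>", String.Empty);
    }

    public static void Main()
    {
        // The '>' inside the CDATA section ends the match too early.
        string input = "<p>x</p><![CDATA[ a > b ]]>";
        Console.WriteLine(StripWithRegex(input)); // prints "x b ]]>"
    }
}
```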

Source: https://stackoverflow.com/questions/787932/using-c-sharp-regular-expressions-to-remove-html-tags
