What is the best way to convert HTML entities with JavaScript?

2

Through research on the English Stack Overflow, I learned to decode and encode HTML entities as follows:

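var wm = (function(wm){
    wm.encodeHTML = function (html) {
        var t = document.createElement('textarea');
        // textContent stores the raw text; reading innerHTML returns it escaped
        t.textContent = html;
        return t.innerHTML;
    };
    wm.decodeHTML = function (html) {
        var t = document.createElement('textarea');
        // innerHTML parses the entities; value returns the decoded text
        t.innerHTML = html;
        return t.value;
    };
    return wm; // without this, the IIFE returns undefined and wm is never set
}({}));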

I would like a more elegant solution (a regular expression or anything else) to convert HTML entities, instead of creating a textarea and returning its value.

Can anyone help?

  • Would something like this work? http://stackoverflow.com/questions/1229518/javascript-regex-replace-html-chars

  • 1

    Look, "more elegant" is subjective. RegEx can help a lot, but it does not suit every purpose. In your own example you used the DOM (createElement), which is perfectly acceptable, since HTML entities are also part of the DOM, and I believe your code will perform acceptably ("perhaps" better than any other technique). Since HTML entities are part of the DOM, I see no harm in using the DOM itself to convert them. Another thing: written this way the code is much smaller, and in my opinion it is indentation combined with small, organized code that makes for elegance :)

  • Just one more detail, Wallace: your function wm.decodeHTML = function (html) { is not closed; a } is missing after the return.

3 answers

3


I use two functions I found on the internet; give them a try:

//https://stackoverflow.com/questions/1219860/html-encoding-in-javascript-jquery
function htmlEncode(value) {
    // Create an in-memory div and set its inner text (which jQuery automatically encodes),
    // then grab the encoded contents back out. The div never exists on the page.
    return $('<div/>').text(value).html();
}

function htmlDecode(value) {
    return $('<div/>').html(value).text();
}

Or, as the asker preferred, with a regex:

var replaceHtmlEntites = (function() {
    var translate_re = /&(nbsp|amp|quot|lt|gt);/g,
        translate = {
            'nbsp': String.fromCharCode(160), 
            'amp' : '&', 
            'quot': '"',
            'lt'  : '<', 
            'gt'  : '>'
        },
        translator = function($0, $1) { 
            return translate[$1]; 
        };

    return function(s) {
        return s.replace(translate_re, translator);
    };
})();

Source: https://stackoverflow.com/questions/1229518/javascript-regex-replace-html-chars
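
The regex function above only decodes. As a rough sketch of the reverse direction (not part of the linked answer), the same table can be inverted so that a single replace call encodes the same five characters. The name escapeHtmlEntities is hypothetical and chosen only for this example:

```javascript
// Hypothetical counterpart to replaceHtmlEntites: encodes the same five characters.
var escapeHtmlEntities = (function () {
    var escape_re = /[\u00A0&"<>]/g,
        escapes = {
            '\u00A0': '&nbsp;',
            '&': '&amp;',
            '"': '&quot;',
            '<': '&lt;',
            '>': '&gt;'
        };

    return function (s) {
        // Single pass over the input, so the "&" of a freshly written
        // entity is never escaped a second time.
        return String(s).replace(escape_re, function (ch) {
            return escapes[ch];
        });
    };
})();

console.log(escapeHtmlEntities('<a href="x">Tom & Jerry</a>'));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Because it needs no DOM, this version should also run outside the browser, for example in Node.js.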

  • 1

    It’s the same thing I did, but with jQuery. I am looking for other solutions (such as regex).

  • I don’t understand the downvote, since you said "...regexp or anything else" (MAXTERS, 2015).

  • I also see no need for this downvote; it seems a bit harsh. I think it happened because you posted code with the same functionality he already had, while he asked for something with RegEx instead of creating a textarea (he presumably meant avoiding the DOM in general). I think you should just edit the answer :)

  • @Guilhermenascimento Thank you, I did just that! :)

  • Just one thing to help, Rafael: the second function only "decodes", and I think the asker needs both. :) But even so, +1

  • I removed the -1 since the answer was changed. Very good! It looks a bit like the Underscore source code :)

  • Yes, it would need a small adaptation for the reverse case, but I believe it is already a positive step towards the solution the asker wants. Thank you, William and Wallace.

3

I don’t quite understand, but are you trying to turn HTML into text and text into HTML? If so, you can do it this way using replace:

var string="<div>oi</div>";
string.replace(/</g,"&lt;").replace(/>/g,"&gt;"); // &lt;div&gt;oi&lt;/div&gt;

And to get back to HTML:

var string="&lt;div&gt;oi&lt;/div&gt;";
string.replace(/&lt;/g,"<").replace(/&gt;/g,">"); // <div>oi</div>
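
The chained replace calls above cover only < and >. If the ampersand is added to the chain, the order of the calls starts to matter. The following is a minimal sketch of that point; the function names encodeChained and decodeChained are hypothetical and exist only for this illustration:

```javascript
// Hypothetical helpers; the names are illustrative only.
function encodeChained(text) {
    return String(text)
        .replace(/&/g, '&amp;')   // first, so the "&" of "&lt;" / "&gt;" is not re-escaped
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

function decodeChained(text) {
    return String(text)
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&amp;/g, '&');  // last, so "&amp;lt;" becomes "&lt;" and not "<"
}

var original = '<p>R&D &lt; fun</p>';
var encoded = encodeChained(original);
console.log(encoded);                             // &lt;p&gt;R&amp;D &amp;lt; fun&lt;/p&gt;
console.log(decodeChained(encoded) === original); // true
```

Handling & first when encoding and last when decoding is what keeps the round trip lossless for text that already contains entity-like sequences.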

1

This can also be done (code taken from mustache.js):

var entityMap = { // List of entities
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': '&quot;',
    "'": '&#39;',
    "/": '&#x2F;'
};

function escapeHtml(str) {
  return String(str).replace(/[&<>"'\/]/g, function (s) {
    return entityMap[s];
  });
}

The expression [&<>"'\/] matches any single character from the list & < > " ' /. For each match, the callback returns the corresponding converted value from entityMap.

To go the other way, just swap the keys and values of the map and adjust the expression:

var entityMap = { // List of entities
    '&amp;': '&',
    '&lt;': '<',
    '&gt;': '>',
    '&quot;': '"',  // trailing ";" added so it is not left behind in the output
    '&#39;': "'",
    '&#x2F;': '/'
};

function unescapeHtml(str) {
  return String(str).replace(/&amp;|&lt;|&gt;|&quot;|&#39;|&#x2F;/g, function (s) {
    return entityMap[s];
  });
}
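
The map above only knows the exact spellings &#39; and &#x2F;. As a sketch of how numeric references could be handled generically (this is not taken from mustache.js), the callback can parse the number and build the character with String.fromCodePoint, which requires an ES2015 environment. The names namedEntities and decodeEntities are hypothetical, the named-entity table is deliberately tiny, and the code does not try to follow the full HTML parsing rules:

```javascript
// Hypothetical helper: decodes numeric references plus a small named-entity table.
var namedEntities = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: '\u00A0' };

function decodeEntities(str) {
    return String(str).replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z]+);/gi, function (match, body) {
        if (body.charAt(0) === '#') {
            var isHex = body.charAt(1).toLowerCase() === 'x';
            var code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
            // Leave out-of-range code points untouched instead of throwing.
            return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
        }
        // Names missing from the table are returned unchanged.
        return Object.prototype.hasOwnProperty.call(namedEntities, body)
            ? namedEntities[body]
            : match;
    });
}

console.log(decodeEntities('&#39;a&#x2F;b&#39; &amp; &unknown;'));
// 'a/b' & &unknown;
```

For complete coverage of named entities, the DOM-based approach from the question remains the simpler option in a browser, since the browser already carries the full table.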
