In the comments, you said you want to remove the formatting (the string "𝑴𝒆𝒖 𝑻𝑰𝑴𝑬 𝑨𝒏𝒐𝒏𝒊𝒎𝒐" should become "Meu TIME Anonimo").
The problem is that the string you are using is not exactly formatted text, at least not in the sense of having HTML tags applied to it. In fact, the text uses characters other than the ordinary letters of our alphabet:
let s = "𝑴𝒆𝒖 𝑻𝑰𝑴𝑬 𝑨𝒏𝒐𝒏𝒊𝒎𝒐";
// print the string's code points
console.log(Array.from(s).map(c => c.codePointAt(0).toString(16)));
In the code above, I am printing the code points. To understand code points in detail, I suggest this (long) read, but briefly: every existing character (letters, numbers, spaces, punctuation marks, mathematical symbols, etc.) has a unique numerical value, defined by Unicode.
If you run the code above, you will see that the first elements of the array are the code points "1d474" and "1d486" (the values are printed in hexadecimal). The first corresponds to the character "MATHEMATICAL BOLD ITALIC CAPITAL M", that is, an uppercase "M" "stylized" in bold italic. The second is the character "MATHEMATICAL BOLD ITALIC SMALL E".
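As a quick check of those values, the following sketch builds the two characters from their code points and compares them with the plain letters. The variable names are only illustrative.

```javascript
// Build the stylized characters from the code points mentioned above
const boldItalicM = String.fromCodePoint(0x1d474); // MATHEMATICAL BOLD ITALIC CAPITAL M
const boldItalicE = String.fromCodePoint(0x1d486); // MATHEMATICAL BOLD ITALIC SMALL E

// They are not equal to the plain ASCII letters
console.log(boldItalicM === "M"); // false
console.log(boldItalicE === "e"); // false

// Each one lies outside the Basic Multilingual Plane, so it takes
// two UTF-16 code units even though it is a single code point
console.log(boldItalicM.length);             // 2
console.log(Array.from(boldItalicM).length); // 1
```

This is also why the code above uses Array.from(s) rather than indexing the string directly: iterating by code point keeps each stylized letter in one piece.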
These are different characters from the letters "M" (whose code point is U+004D) and "e" (code point U+0065). The characters "𝑴" and "𝒆", although visually similar to the letters "M" and "e", are not the same characters, since they have different code points. More importantly, as much as they look like a formatted "M" and "e", they are not exactly that: if you write "𝑴𝒆" without any formatting, they already appear in "bold italic", and if you then apply that formatting, they become even "more bold and italic" (see this example in Google Docs):
Same example in HTML:
<!-- ASCII characters -->
<p>Meu TIME Anonimo</p>
<p><b><i>Meu TIME Anonimo</i></b></p>
<!-- Unicode characters (Mathematical Letters) -->
<p>𝑴𝒆𝒖 𝑻𝑰𝑴𝑬 𝑨𝒏𝒐𝒏𝒊𝒎𝒐</p>
<p><b><i>𝑴𝒆𝒖 𝑻𝑰𝑴𝑬 𝑨𝒏𝒐𝒏𝒊𝒎𝒐</i></b></p>
Notice how they are rendered differently. The Unicode characters are already "bold and italic" even without the <b> and <i> tags, and with the tags they become even "more bold and italic" (thicker and more slanted).
As such, you don’t exactly want to "remove the formatting", but rather convert these characters to their ASCII equivalents. To do this, you can use the normalize method:
let s = "𝑴𝒆𝒖 𝑻𝑰𝑴𝑬 𝑨𝒏𝒐𝒏𝒊𝒎𝒐".normalize('NFKC');
console.log(s);
// print the string's code points
console.log(Array.from(s).map(c => c.codePointAt(0).toString(16)));
Note that the text is now printed with ASCII characters ("unformatted"): Meu TIME Anonimo, and the first code points are "4d" and "65", which correspond to the letters "M" and "e".
You can apply normalization after removing the emojis (using the solution proposed by Mauroalmeida), so your text will be "clean" the way you need it.
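The two steps could be combined roughly as in the sketch below. Note that cleanText is a hypothetical helper, and the emoji regex is an assumption of mine based on the Extended_Pictographic Unicode property; it is not necessarily the same approach used in the other answer, and it also matches a few symbols such as ©.

```javascript
// Hypothetical helper: strips emojis, then folds compatibility
// characters (such as the mathematical letters) to their plain forms.
function cleanText(text) {
  return text
    // Assumption: emojis are matched via the Extended_Pictographic property,
    // plus the variation selector and zero-width joiner used in emoji sequences
    .replace(/[\p{Extended_Pictographic}\u{FE0F}\u{200D}]/gu, "")
    .normalize("NFKC")
    .trim();
}

// "Meu" written with mathematical bold italic letters, followed by an emoji
const input = String.fromCodePoint(0x1d474, 0x1d486, 0x1d496) + " \u{1F600}";
console.log(cleanText(input)); // Meu
```

If the text can be changed before it is stored, a function like this could be applied at that point, so the database only ever holds the "clean" version.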
To understand a little more about Unicode normalization, read here, here and here. More details can be found in the Unicode document that describes normalization.
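One detail worth illustrating is why the "K" (compatibility) form is used here. The mathematical letters only have a compatibility mapping to the plain letters, so the canonical forms (NFC and NFD) leave them untouched, while NFKC and NFKD replace them. A minimal sketch, with an illustrative variable name:

```javascript
const stylized = String.fromCodePoint(0x1d474); // MATHEMATICAL BOLD ITALIC CAPITAL M

// Print the first code point (in hexadecimal) after each normalization form
for (const form of ["NFC", "NFD", "NFKC", "NFKD"]) {
  const result = stylized.normalize(form);
  console.log(form, result.codePointAt(0).toString(16));
}
// NFC 1d474
// NFD 1d474
// NFKC 4d
// NFKD 4d
```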
Yes! Without the emojis and respecting the font-family tag.
– Cristiano Gilberto João
If it is possible to change before storing in the database it would be good.
– Cristiano Gilberto João
@Mauroalmeida I made some changes to the question to try to make it clearer.
– Cristiano Gilberto João
I’ve answered below, I hope it helps ;-)
– MauroAlmeida