How can WhatsApp put emojis in a URL?

Viewed 4,131 times

21

If you search for "WhatsApp" on Google, the result is this:

WhatsApp Web: https://web.whatsapp.com/🌐/pt-br

Apparently this "🌐" in the URL is a feature used for the internationalization of the web application. After accessing this page, the visitor is redirected to the site in the correct language.

What I’m curious about is: how do you use an emoji to identify a resource instead of text like /do?localizacao=pt? Is it a server configuration? URL rewriting?

  • 6

    Given that URLs can be written with Unicode, it’s easy, right?

  • 3

    This is very interesting! I thought Unicode was just those "ALT + 1..." codes that show a little face, but it seems emojis have now been standardized too. I thought this was cool and found this page: http://apps.timwhitlock.info/unicode/inspect?s=%F0%9F%8C%90 Note that "%F0%9F%8C%90" is the globe symbol you mentioned in the question. If you open the page, it shows the globe along with its UTF-8 representation.

  • 2

    About "Apparently this "🌐" in the URL is a feature used for the internationalization of the web application": the globe is a character like any other. They could just as well have chosen a "Y".

  • 1

    A quick and simple example: http://ninja.net.br/ - All it took was an HTML file with this name in the root (I didn’t post it in my answer because I won’t keep this URL online indefinitely).

2 answers

21


Nothing complicated is needed. Just have your routing script redirect the emoji to the right place, or even create a file or directory whose name is the emoji itself.

A "🌐.php" is no different from an "index.php" or a "♥.php"; they are all just characters. In fact, if the font used by your OS supports it, you will see the little globe in the system’s file explorer or shell.

A silly example in PHP, assuming you use friendly URLs:

// $caminho holds the path requested through the friendly URL
if( $caminho == "blog" ) {
    header( "Location: //example.com/blog.php" );
    exit();
} elseif( $caminho == "😀" ) {
    header( "Location: //example.com/postagens_felizes.php" );
    exit();
} elseif( $caminho == "🍎" ) {
    header( "Location: //example.com/stevejobs.php" );
    exit();
} elseif( $caminho == "♥" ) {
    echo "I love this site";
...

Did you notice the "♥" in the code? It’s the same thing with the globe. I used it as an example because it is a far more widespread character than the little globe, so it’s easier for everyone to read the example.

It is worth noting that, until some time ago, the globe appeared in color in the search results because of a filter that swapped the characters for image versions, not only to ensure compatibility but also to please the new "internet" generation. Nowadays the vast majority of browsers ship their own set of colored icons natively.

This has happened with other big market players too, and not only in URLs (for example, Gmail has had these annoying things, er, nice little pictures in email subject lines for a while now).

As a curiosity, these blocks in particular tend to be rendered with colored symbols in virtually all "modern" implementations:

https://unicode-table.com/en/blocks/emoticons/

https://unicode-table.com/en/blocks/miscellaneous-symbols-and-pictographs/

Try copying some and pasting in the browser’s address bar.


Beware of the filesystem!

If your OS filesystem uses a different encoding than the HTTP server, a conversion almost always solves it (as long as the resulting name does not clash with characters that are special to the filesystem). For example, the UTF-8 bytes of ♫, read as ISO-8859-1, come out as a sequence of unrelated characters. I will not go into too much detail, as this is a mere implementation detail and is not directly related to the question. Besides, anyone using this feature will most likely handle the emoji in the programming language rather than with files and folders.
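
The encoding mismatch can be reproduced in a few lines. This is only a sketch in Python (not the PHP used above), and the variable names are made up for illustration: it encodes ♫ as UTF-8, misreads the bytes as ISO-8859-1, and then undoes the mistake with the reverse conversion.

```python
note = "\u266B"                       # the ♫ character
raw = note.encode("utf-8")            # bytes a UTF-8 HTTP server would see

print(raw.hex(" ").upper())           # E2 99 AB

# Misreading the same bytes as ISO-8859-1 yields three unrelated characters.
misread = raw.decode("latin-1")
print(len(misread))                   # 3

# Converting back (ISO-8859-1 bytes, then UTF-8 text) recovers the original.
repaired = misread.encode("latin-1").decode("utf-8")
print(repaired == note)               # True
```

The round trip works because ISO-8859-1 maps every byte to exactly one character, so no information is lost by the wrong decoding.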

11

🌐 is a Unicode symbol

U+1F310 = 🌐 | GLOBE WITH MERIDIANS
11110000 10011111 10001100 10010000
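
For anyone who wants to check those four octets, here is a small Python sketch (the variable names are just for illustration) that derives them from the code point:

```python
import unicodedata

globe = chr(0x1F310)                  # the code point quoted above
utf8 = globe.encode("utf-8")          # its UTF-8 byte sequence

print(unicodedata.name(globe))        # GLOBE WITH MERIDIANS
print(utf8.hex(" ").upper())          # F0 9F 8C 90
print(" ".join(f"{b:08b}" for b in utf8))
# 11110000 10011111 10001100 10010000
```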

Using that symbol forces user agents, such as Internet Explorer 5 through 8, to encode the HTTP request in UTF-8. Something like /do?localizacao=pt would not have that effect.

RFC 3986 requires non-ASCII symbols to be encoded in UTF-8 and then expressed in ASCII in percent-encoded form.

🌐 is a non-ASCII symbol, and 4 octets are required to encode it in UTF-8. However, it does not appear in the URI as %F0%9F%8C%90 but in its original form.

This is because the Google results page shows the symbol in its original form to the user. The link performs the following HTTP request:

GET https://web.whatsapp.com/%F0%9F%8C%90/pt-br HTTP/1.1
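
The step from the displayed form to the request line can be sketched with Python’s standard library. This assumes the path "/🌐/pt-br" taken from the request above; `quote` encodes the text as UTF-8 and percent-encodes every non-ASCII byte, leaving "/" untouched by default.

```python
from urllib.parse import quote, unquote

path = "/\U0001F310/pt-br"            # the path as displayed to the user
encoded = quote(path)                 # UTF-8 encode, then percent-encode

print(encoded)                        # /%F0%9F%8C%90/pt-br
print(unquote(encoded) == path)       # True: decoding restores the emoji
```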

The browser can also decode the percent-encoded symbols in the URI and display them to the user in their original form. Chromium does this, for example. It is merely cosmetic; if the user copies the URI, they get the encoded form.

The same happens on the server side. If the path of the HTTP request can be compared directly with Unicode symbols, it is because the software decodes the symbols before or during the comparison.
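
A minimal sketch of that server-side step, in Python and under invented assumptions: the `ROUTES` table, its targets and the `resolve` helper are hypothetical and say nothing about how WhatsApp actually does it. The point is only that the raw path is percent-decoded before it is compared with strings containing the emoji.

```python
from urllib.parse import unquote

# Hypothetical route table; the entries are made up for illustration.
ROUTES = {
    "/\U0001F310/pt-br": "Portuguese (Brazil) landing page",
    "/\U0001F310/en": "English landing page",
}

def resolve(raw_path: str) -> str:
    """Decode the percent-encoded request path, then look it up."""
    decoded = unquote(raw_path)       # "%F0%9F%8C%90" becomes the globe
    return ROUTES.get(decoded, "404 Not Found")

print(resolve("/%F0%9F%8C%90/pt-br"))  # Portuguese (Brazil) landing page
print(resolve("/blog"))                # 404 Not Found
```

Most web frameworks perform this decoding for you, which is why a route can usually be written with the emoji directly.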
