The question has two parts: an objective one, whether there is a technical impediment, and a more subjective one, the impact on the user experience.
Starting with the simplest: is there a technical impediment?
Not today. As I mentioned in a comment, I have heard that in the past it was not possible (or at least not feasible) to use special characters in file names (an older user can confirm this), but that is an old problem. Certain characters are still reserved, for example / on Linux, but accents and even emojis are allowed.
It is not a limitation, but it does call for care: if the encoding is not configured correctly, you may receive or save files with characters such as �, ?, Ã©, etc. in place of the accented letter, which can lead to bugs and is of course bad for the user.
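The symptoms above can be reproduced with a mismatched encode/decode pair. This is a minimal Python sketch, not a diagnosis of any particular system; the variable names are made up for illustration.

```python
# Illustrative only: how "é" turns into the debris described above
# when the writer and the reader disagree about the encoding.
original = "é"

# UTF-8 bytes read back as Latin-1: each of the two bytes becomes its own character.
as_latin1 = original.encode("utf-8").decode("latin-1")   # "Ã©"

# Latin-1 bytes read back as UTF-8: the lone byte 0xE9 is not valid UTF-8,
# so the decoder substitutes U+FFFD, the replacement character.
as_utf8 = original.encode("latin-1").decode("utf-8", errors="replace")  # "\ufffd"

# Forcing the text into ASCII: the unrepresentable character becomes "?".
as_ascii = original.encode("ascii", errors="replace").decode("ascii")   # "?"
```

All three results come from the same single character, which is why a misconfigured charset is usually the first thing to check when these symbols show up in file names.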
Even when the browser's address bar shows the special characters (not all of them), behind the scenes they are encoded before the request is made. A URL like pt.stackoverflow.com/questions/27177/o-que-é-callback is not valid as written, so the browser encodes it as pt.stackoverflow.com/questions/27177/o-que-%C3%A9-callback. Browsers do this conversion for you (although IE has some problems), and in some cases the URL shown to the user is the encoded one (IE again). Even among the major browsers there are differences in what the address bar shows: for example, Firefox shows the space character ( ), while Chrome shows it encoded (%20).
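The encoding step can be sketched with Python's standard urllib.parse module. This only mirrors the conversion described above; it is not what any particular browser runs internally, and the path variable is just the example URL from this answer.

```python
from urllib.parse import quote, unquote

path = "/questions/27177/o-que-é-callback"

# quote() percent-encodes the UTF-8 bytes of non-ASCII characters
# and leaves "/" untouched by default.
encoded = quote(path)
# encoded == "/questions/27177/o-que-%C3%A9-callback"

# A space becomes %20, the form Chrome is said to display above.
with_spaces = quote("o que é callback")
# with_spaces == "o%20que%20%C3%A9%20callback"

# unquote() reverses the encoding, roughly what an address bar does for display.
decoded = unquote(encoded)
# decoded == path
```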
Some parsers, like Stack Overflow's, which identifies a chunk of text as a URL and renders it as a link (for example, https://[...] is transformed into <a href="https://[...]" [...]>https://[...]</a>), may treat certain special characters as a delimiter of the URL, so the link is not displayed as intended, as in pt.stackoverflow.com/questions/27177/o-que-é-callback (here I am forcing it with Markdown).
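To see how a link detector can cut a URL short, here is a deliberately naive sketch. The NAIVE_URL pattern is hypothetical and is not Stack Overflow's actual parser; it only assumes that the detector accepts ASCII URL characters and stops at anything else.

```python
import re

# Hypothetical, deliberately naive link detector: ASCII-only URL characters.
NAIVE_URL = re.compile(r"[A-Za-z0-9./_%-]+")

raw = "pt.stackoverflow.com/questions/27177/o-que-é-callback"
encoded = "pt.stackoverflow.com/questions/27177/o-que-%C3%A9-callback"

# The match stops at "é", so only part of the raw URL would become a link.
linked_raw = NAIVE_URL.match(raw).group(0)
# linked_raw == "pt.stackoverflow.com/questions/27177/o-que-"

# The percent-encoded form contains only ASCII, so it is matched whole.
linked_encoded = NAIVE_URL.match(encoded).group(0)
# linked_encoded == encoded
```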
So even if there is no concrete problem, there are some precautions to take that can make removing the accents a sensible choice.
Moving on to the more subjective part: is there an impact on the user experience?
Removing the accent while keeping the unaccented letter has little or no impact. We are intelligent beings (or at least we should be), so someone who reads "o-que-e-callback" understands that the "e" there is actually "é", since the context helps.
But removing the character entirely is bad, especially in short words. Even with context, many people may struggle or simply not realize that in the URL filmes.com/categorias/ao, "ao" is actually "ação" (action).
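The difference between the two approaches can be sketched in Python with the standard unicodedata module. The helper names drop_non_ascii and strip_accents are made up for this example; real slug libraries may behave differently.

```python
import unicodedata

def drop_non_ascii(text: str) -> str:
    """Discard every non-ASCII character (the 'complete removal' case)."""
    # Compose first so the result does not depend on how the input was typed.
    composed = unicodedata.normalize("NFC", text)
    return composed.encode("ascii", errors="ignore").decode("ascii")

def strip_accents(text: str) -> str:
    """Keep the base letter and discard only the combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

removed = drop_non_ascii("ação")             # "ao"
kept = strip_accents("ação")                 # "acao"
slug = strip_accents("o-que-é-callback")     # "o-que-e-callback"
```

Note that strip_accents only handles letters that decompose into a base letter plus a mark; characters such as ß pass through unchanged and would need separate handling.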
Depending on your users it will make no difference: illiterate people (for example, preschool-age children) will obviously not read the link; they will just click to find out (or not) what it is about.
In some specific cases it can cause confusion, similar to misplaced or missing commas. For example, noticias.com/pais-sem-dinheiro can be "pais sem dinheiro" (parents without money, i.e. legal guardians) or "país sem dinheiro" (country without money, i.e. homeland, nation). Other specific cases may require you to keep the accent: for example, an online dictionary will have the URLs dicionario.com/esta and dicionario.com/está, which refer to totally different words.
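One way to find out whether a given site is affected is to check its titles for slug collisions before dropping accents. The sketch below assumes a hypothetical slugify helper and a made-up list of titles taken from the examples above.

```python
import unicodedata
from collections import defaultdict

def slugify(title: str) -> str:
    """Hypothetical slug helper: lowercase, accents stripped, spaces to hyphens."""
    decomposed = unicodedata.normalize("NFD", title.lower())
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "-".join(plain.split())

titles = ["pais sem dinheiro", "país sem dinheiro", "esta", "está"]

# Group the original titles by the slug they would produce.
by_slug = defaultdict(list)
for title in titles:
    by_slug[slugify(title)].append(title)

# Any slug shared by more than one title is ambiguous once accents are gone.
collisions = {slug: group for slug, group in by_slug.items() if len(group) > 1}
# collisions == {"pais-sem-dinheiro": ["pais sem dinheiro", "país sem dinheiro"],
#                "esta": ["esta", "está"]}
```

If the collisions dictionary comes back empty for your content, stripping accents is unlikely to create this kind of ambiguity; a dictionary site would fail the check immediately.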
I’m new to the field, but I believe that in the past there were problems naming files with special characters. Although the browser shows the special character, the URL is encoded and "%C3%A9" is sent instead of "é". Even today, if you copy the URL you get the encoded form. You may also have problems if the charset is not configured correctly, the same problem that occurs when you see a question mark (?) instead of the accented letters: instead of saving an "é-[...]" file, it is saved as "?-[...]".
– Costamilam
Probably the decision to swap á for a, or to remove it, comes down to encoding: á is encoded as %E1 in Latin-1 and as %C3%A1 in UTF-8, so you would have to sort that out every time. Also take into account that part of the URL may be used in a query against a document or a database, and with a database the result depends on the encoding. Another example of an equivalence would be ß and ss, which could conflict with other things on an "international" website (one that uses more than one language) ...– Guilherme Nascimento
.... not working with accents is always easier. It’s not just a matter of UX (for multi-language websites); technically it also makes things easier inside the site's own system (provided, of course, that you know what you’re doing). There is of course the fad of saying that we should always use UTF-8 because it is better (a lie told by some folks out there); on a site that does not need emoji, only accents, latin1 (and equivalents) works very well.
– Guilherme Nascimento
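The two encodings of á mentioned in the comments can be checked with Python's standard urllib.parse module. This is only a sketch of the mismatch. The casefold() line illustrates the ß/ss equivalence using Python's own rule, which is an assumption here: database collations define their own equivalences.

```python
from urllib.parse import quote, unquote

latin1_form = quote("á", encoding="latin-1")   # "%E1"
utf8_form = quote("á")                         # "%C3%A1" (UTF-8 is the default)

# Decoding with the wrong assumption does not round-trip.
# 0xE1 alone is not valid UTF-8, so it becomes U+FFFD.
wrong_utf8 = unquote(latin1_form)                      # "\ufffd"
# The two UTF-8 bytes read as Latin-1 become two characters.
wrong_latin1 = unquote(utf8_form, encoding="latin-1")  # "Ã¡"

# Python's case folding maps ß to ss, one example of two spellings
# that a comparison may treat as equal.
eszett_equals_ss = "ß".casefold() == "ss"              # True
```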