If your page encoding is declared correctly, for example by sending the HTTP header:
Content-Type: text/html; charset=utf-8
or by specifying it in the HTML itself:
<meta charset="utf-8" />
or
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
then in principle you don’t have to do anything: the String class already works with Unicode characters (it is represented internally as UTF-16, if I am not mistaken), and the web server itself should be able to convert that string into bytes in the desired encoding (just make sure the encoding you declare is the same one the server is actually using).
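To illustrate the conversion from a String to bytes, here is a minimal sketch using only the standard library. The class name EncodingDemo and its helper methods are made up for this example; a real web server or framework performs the equivalent step for you when it writes the response.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that the same String yields different bytes
// depending on the charset used to encode it.
class EncodingDemo {
    // Encode as UTF-8, matching a page declared with charset=utf-8.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Encode as ISO-8859-1, for comparison only.
    static byte[] toLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "ação" written with Unicode escapes so the source file stays ASCII.
        String word = "a\u00e7\u00e3o";
        System.out.println(toUtf8(word).length);   // 6: the two accented letters take 2 bytes each
        System.out.println(toLatin1(word).length); // 4: one byte per character
    }
}
```

If the bytes are produced with one charset while the page declares another, the browser decodes them incorrectly, which is why the declared encoding and the one actually used must match.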
Other encodings besides UTF-8 can be used (Cp1252 / Windows-1252, ISO-8859-1 / ISO Latin-1, or some other), but they are not recommended: UTF-8 is quite universal, and should be understood by even the most prehistoric browser.
Finally, a comment on URLEncoder: what it does is encode a string so that it can be used in a URL (more precisely, as form data or a query-string value). This is not the same thing as encoding it for use as the content of an HTML page. If you use this method and then include the result in a page, the user will see the "strange" characters. If you really need to encode your string in ASCII, you need to transform the Unicode characters into HTML entities, which is a distinct process, and I don’t know how to do it in Java. But in principle this should not be necessary, and if it is, it will considerably increase the size of the generated page.
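For what it is worth, the difference between the two kinds of encoding can be sketched as follows. This is only one possible approach, not an established library method: the class EscapeDemo and the helper toNumericEntities are hypothetical names, and the helper only rewrites non-ASCII characters as numeric character references (it does not escape <, > or &). The URLEncoder.encode(String, Charset) overload used here requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class contrasting URL encoding with HTML numeric entities.
class EscapeDemo {
    // Percent-encodes the text for use in a query string or form data.
    static String forUrl(String text) {
        return URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    // Replaces every non-ASCII code point with a numeric character reference,
    // so the result is pure ASCII and still renders correctly in HTML.
    // Assumption: HTML-special characters are handled elsewhere.
    static String toNumericEntities(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String word = "a\u00e7\u00e3o"; // "ação"
        System.out.println(forUrl(word));            // a%C3%A7%C3%A3o
        System.out.println(toNumericEntities(word)); // a&#231;&#227;o
    }
}
```

The first result is what a browser would display literally if it were pasted into a page; the second is displayed as the original word, at the cost of several extra bytes per accented character.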
Your idea worked. I created a file and wrote it in UTF-8.
– absentia