It is perfectly possible to work with several encodings on the same server. I have been doing this with Apache and PHP for a long time without any problems.
Note that to run two instances of Apache you would need two different IPs, or non-standard ports.
I really like to exploit the advantages of ISO-8859-1, which is more than adequate for the Portuguese language and has an extremely simple and effective design, and which does not interfere with other applications that use UTF-8 on the same server.
So, I have listed some points for you to observe when implementing your "multi-charset" solution. You can bring almost all of these concepts together in a single include at the beginning of all your pages, which makes maintenance easier.
Here we go:
Defining the general default
Usually the settings that determine the default in Apache and PHP are
AddDefaultCharset UTF-8
and
default_charset = "utf-8"
As the name already says in both cases, this is the default setting, and it only counts if the programmer does not explicitly ask for something else.
Since you do not want to use a single encoding everywhere, which is the heart of the question, you can at least make things easier by deciding which encoding is the rule and which is the exception. On the other hand, nothing prevents you from declaring the encoding explicitly in every application, whether UTF-8, ISO-8859-1, or any other.
But let's face it: if you have 30 UTF-8 applications and two in another encoding, and future applications will default to UTF-8, you should leave UTF-8 as the default and customize only the ISO-8859-1 ones (and vice versa, if your working base is ISO-8859-1 and the UTF-8 applications are the exceptions).
I will not get into the merits of which is best, because anyone who says that A or B is the best encoding is lying (even if unconsciously). The truth is that whoever knows the advantages and disadvantages of each one (both of those mentioned have their share) will use the most suitable one for each application.
Making the encoding explicit
Once you have defined your default, you have to consider the various points where you must explicitly specify an encoding that differs from it.
Headers and meta tags
The encoding setting configured in Apache is used to send page headers in this format:
Content-Type: text/html; charset=utf-8
To change the encoding via PHP, the solution is this:
header('Content-Type: text/html; charset=iso-8859-1');
This way, you overwrite the original header with the new value. You can also specify the encoding in the HTML itself:
<meta http-equiv="Content-type" content="text/html; charset=iso-8859-1" />
It is important to know that you do not need the meta tag if the header has already been set. Normally, the meta tag is used when you cannot change the original header (a static HTML file without PHP, for example), or when you want the page to display correctly if the user saves a local copy (after all, when opening the local copy, usually no header is sent).
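To keep the header and the meta tag in agreement, both can be generated from a single charset value. The following is only a sketch: the function names `content_type_value`, `send_charset_header` and `charset_meta_tag` are hypothetical helpers invented for illustration, not part of PHP.

```php
<?php
// Hypothetical helpers (names are assumptions, not PHP built-ins).

/** Build the Content-Type value for an HTML page in the given charset. */
function content_type_value(string $charset): string
{
    return 'text/html; charset=' . strtolower($charset);
}

/** Send the header, replacing the server default, if that is still possible. */
function send_charset_header(string $charset): bool
{
    if (headers_sent()) {
        return false; // too late: output has already started
    }
    header('Content-Type: ' . content_type_value($charset));
    return true;
}

/** Build the matching meta tag, useful for locally saved copies. */
function charset_meta_tag(string $charset): string
{
    return '<meta http-equiv="Content-Type" content="'
        . content_type_value($charset) . '" />';
}
```

A page in the "exception" encoding would call `send_charset_header('ISO-8859-1')` before any output and, if wanted, echo `charset_meta_tag('ISO-8859-1')` inside the `<head>`.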
IDE/code editor encoding
It is essential that, when saving source code (whether PHP, HTML, or JS) that contains special characters, the document preferences are set to the correct encoding. In some editors this is one of the "Save as..." options; in others you change it directly in the status bar or through the menus. The important thing is that it is correct.
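If you are unsure how a file was actually saved, a rough check can be done from PHP itself. This is a minimal sketch, assuming the only candidates are UTF-8 and a single-byte encoding such as ISO-8859-1; `is_valid_utf8` is a hypothetical helper name. Keep in mind that pure-ASCII content is valid in both encodings, so the check only tells you something when the file contains accented characters.

```php
<?php
/**
 * Return true when the bytes form valid UTF-8.
 * Hypothetical helper; relies on PCRE's /u modifier, which validates
 * the subject string as UTF-8 before matching.
 */
function is_valid_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

$as_utf8   = "usu\xC3\xA1rio"; // "usuário" saved as UTF-8
$as_latin1 = "usu\xE1rio";     // "usuário" saved as ISO-8859-1

$utf8_ok   = is_valid_utf8($as_utf8);   // true
$latin1_ok = is_valid_utf8($as_latin1); // false: lone 0xE1 byte
```

In practice you would pass the result of `file_get_contents()` on the source file in question.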
Sending data to the page
Normally, to merge data coming from DB with the page, something like this is done:
echo 'Nome do usuário: '.htmlentities( $user_name );
or even
Nome do usuário: <?= htmlentities( $user_name ) ?>
In your case, since there will be systems in different encodings, we need to consider the full signature of htmlentities:
htmlentities( $string, $quote_style, $charset, $double_encode )
If $charset is omitted, it assumes whatever has been set in PHP's default_charset. So, in your case, it is one more thing to make explicit. Example:
htmlentities( $user_name, ENT_COMPAT|ENT_HTML401, 'UTF-8' )
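To illustrate why the charset argument matters, here is a small sketch that feeds the same word to htmlentities in both encodings. The variable names are only for illustration; the byte strings are written with escapes so the example does not depend on how this file itself is saved.

```php
<?php
$flags = ENT_COMPAT | ENT_HTML401;

$latin1 = "usu\xE1rio";     // "usuário" as ISO-8859-1 bytes
$utf8   = "usu\xC3\xA1rio"; // "usuário" as UTF-8 bytes

// Charset matches the data: both produce the same entity.
$ok_latin1 = htmlentities($latin1, $flags, 'ISO-8859-1');
$ok_utf8   = htmlentities($utf8, $flags, 'UTF-8');

// Charset does not match the data: the lone 0xE1 byte is not valid UTF-8,
// and without ENT_SUBSTITUTE or ENT_IGNORE an empty string is returned.
$broken = htmlentities($latin1, $flags, 'UTF-8');
```

An unexpectedly empty output from htmlentities or htmlspecialchars is therefore a typical symptom of a charset mismatch.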
Alternatively, instead of touching every occurrence of htmlentities or htmlspecialchars (or any other function that depends on the encoding), you can put this at the beginning of the pages:
ini_set( 'default_charset', 'UTF-8' ); // or 'ISO-8859-1', as appropriate
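Putting the pieces together, the single include mentioned at the start could look roughly like the sketch below. The file name `charset.php` and the function `use_charset` are assumptions made for the example, and it relies on htmlentities falling back to default_charset when no charset argument is given.

```php
<?php
// charset.php: hypothetical include placed at the top of every page.

function use_charset(string $charset): void
{
    // Default for htmlentities(), htmlspecialchars() and similar functions.
    ini_set('default_charset', $charset);

    // Replace the Content-Type header that would otherwise be sent.
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=' . strtolower($charset));
    }
}

// This application is one of the ISO-8859-1 exceptions.
use_charset('ISO-8859-1');

// "café" as ISO-8859-1 bytes; no explicit charset argument is needed now.
$escaped = htmlentities("caf\xE9");
```

Each application then differs only in the value passed to `use_charset`, so switching one of them to another encoding later is a one-line change.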
Interoperability
It is convenient for both the database connections and the data themselves to use the same encoding as the page, but in many situations this is not possible.
Furthermore, when using external data, such as JSON, it is normal to need the data in UTF-8 even in an ISO-8859-1 application.
For this, PHP has these functions:
utf8_encode(), which specifically converts from ISO-8859-1 to UTF-8;
utf8_decode(), which specifically converts from UTF-8 to ISO-8859-1.
Important: these functions only make sense in interoperability contexts between two different systems. If they are used within a single application, it could be a sign that something is fundamentally wrong.
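As a sketch of that interoperability boundary, the example below converts ISO-8859-1 data to UTF-8 just before producing JSON. Note that utf8_encode() and utf8_decode() are deprecated as of PHP 8.2, so this sketch swaps in iconv() instead, assuming the iconv extension is available; the wrapper names `latin1_to_utf8` and `utf8_to_latin1` are hypothetical.

```php
<?php
// Hypothetical wrappers around iconv(); assumes the iconv extension is loaded.

function latin1_to_utf8(string $s): string
{
    $out = iconv('ISO-8859-1', 'UTF-8', $s);
    if ($out === false) {
        throw new RuntimeException('Conversion to UTF-8 failed');
    }
    return $out;
}

function utf8_to_latin1(string $s): string
{
    // Fails for characters that do not exist in ISO-8859-1.
    $out = iconv('UTF-8', 'ISO-8859-1', $s);
    if ($out === false) {
        throw new RuntimeException('Conversion to ISO-8859-1 failed');
    }
    return $out;
}

$name_latin1 = "Jo\xE3o"; // "João" as stored by an ISO-8859-1 application

// json_encode() expects UTF-8, so convert only at the boundary.
$json = json_encode(['nome' => latin1_to_utf8($name_latin1)]);

// Passing the ISO-8859-1 bytes directly makes json_encode() return false.
$json_failed = json_encode(['nome' => $name_latin1]);
```

Data arriving from the other system goes through `utf8_to_latin1` once, on the way in, and the rest of the application keeps working in its own encoding.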
File system
This is a recommendation I would make even if the question did not deal with encoding: do not use accented characters in filenames. "Just because the system allows it" is usually not a legitimate reason.
It is one thing to save a document such as "Marcos Resume.docx", which will be used by people, from a text editor. It is quite another to want to do this:
include('ÁlbumDeFotos.php');
In your case it is even more important to avoid this, because the line above will inevitably be interpreted differently by each application, and would depend entirely on the current OS configuration, no longer on PHP or Apache.
You would end up needing the conversion functions mentioned above and, worse, if you move your application to another server, you run the risk of having to revise the code.
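If you want to enforce this rule in code, for example before including or creating a file, a simple check is enough. This is only a sketch: `is_ascii_filename` is a hypothetical helper, and the allowed character set (letters, digits, dot, underscore, hyphen) is an assumption you may want to adjust.

```php
<?php
/**
 * Return true when the name uses only unaccented ASCII characters
 * that behave the same regardless of the application's encoding.
 * Hypothetical helper; the allowed set is an assumption.
 */
function is_ascii_filename(string $name): bool
{
    return preg_match('/\A[A-Za-z0-9._-]+\z/', $name) === 1;
}

$good       = is_ascii_filename('AlbumDeFotos.php');           // true
$bad_latin1 = is_ascii_filename("\xC1lbumDeFotos.php");        // false
$bad_utf8   = is_ascii_filename("\xC3\x81lbumDeFotos.php");    // false
```

The two rejected names are the same "ÁlbumDeFotos.php" written in ISO-8859-1 and in UTF-8, which is exactly the ambiguity the recommendation avoids.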
Related posts
Doubt with charset=iso-8859-1 and utf8
How to set charset=utf-8 in Mysql?
Problem with accentuation in the FPDF
https://en.wikipedia.org/wiki/UTF-8#/media/File:Utf8webgrowth.svg
– Lacobus