mb_convert_encoding vs utf8_encode()

After updating PHP on the server, I noticed that my data was no longer being encoded as UTF-8. The first thing I checked was the connection class I use, which in this case is ADOdb.

In my connection class I convert columns that are in another encoding to UTF-8 like this:

 $dados[$i]  = mb_convert_encoding($dados[$i],"UTF-8");

This worked normally on PHP 5.2, but after updating to PHP 5.6.9 it "stopped working". I checked the documentation: it is unchanged and the function is not deprecated.

To solve the problem I used utf8_encode instead:

 $dados[$i]  = utf8_encode($dados[$i]);

For the record, I use ADOdb with Firebird.

Questions:

1. What is the difference between these functions, and why did mb_convert_encoding stop working?

2. My database returns ASCII. Is it possible to simplify this conversion from UTF-8 to ASCII and from ASCII to UTF-8?

My php.ini comes with default_charset = "UTF-8" set by default (line 680).

Testing:

$f[$i] = mb_detect_encoding($f[$i]); //  ASCII

$f[$i] = mb_convert_encoding($f[$i], "HTML-ENTITIES", "UTF-8");
SUÉLLEM // Input
SU�LLEM // Output

$f[$i] = mb_convert_encoding($f[$i], "ISO-8859-1", "UTF-8");
SUÉLLEM // Input
SU?LLEM // Output

$f[$i] = mb_convert_encoding($f[$i], 'UTF-8', 'ISO-8859-1');
SUÉLLEM // Input
SUÉLLEM // Output

The third test works, which is strange!

Example of ADOdb's _fetch method:

   function _fetch() {
     $f = @ibase_fetch_row($this->_queryID);
     if ($f === false) {
       $this->fields = false;
       return false;
     }
     // OPN stuff start - optimized
     // fix missing nulls and decode blobs automatically

     global $ADODB_ANSI_PADDING_OFF;
     //$ADODB_ANSI_PADDING_OFF=1;
     $rtrim = !empty($ADODB_ANSI_PADDING_OFF);
     for ($i = 0, $max = $this->_numOfFields; $i < $max; $i++) {
       if ($this->_cacheType[$i] == "BLOB") {
         if (isset($f[$i])) {
           $f[$i] = $this->connection->_BlobDecode($f[$i]);
         } else {
           $f[$i] = null;
         }
       } else {
         if (!isset($f[$i])) {
           $f[$i] = null;
         } else if ($rtrim && is_string($f[$i])) {
           $f[$i] = rtrim($f[$i]);
         }
       }

       // my change: convert every field to UTF-8
       $f[$i] = utf8_encode($f[$i]);

     }
     // OPN stuff end

     $this->fields = $f;
     if ($this->fetchMode == ADODB_FETCH_ASSOC) {
       $this->fields = $this->GetRowAssoc(ADODB_ASSOC_CASE);
     } else if ($this->fetchMode == ADODB_FETCH_BOTH) {
       $this->fields = array_merge($this->fields, $this->GetRowAssoc(ADODB_ASSOC_CASE));
     }
     return true;
   }

1 answer

In short:

mb_convert_encoding converts a string from encoding X to encoding Y.

utf8_encode converts a string from ISO-8859-1 to UTF-8.

So the two differ considerably: utf8_encode handles only one fixed pair of encodings, while mb_convert_encoding handles any pair that mbstring supports.
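
To make the relationship concrete, here is a minimal sketch, assuming the Firebird column really holds ISO-8859-1 bytes (as the third test in the question suggests). With the source encoding passed explicitly, mb_convert_encoding performs the same ISO-8859-1 to UTF-8 conversion that utf8_encode does. The string is written with byte escapes so the result does not depend on how the script file itself is saved. utf8_encode is deliberately not called here because it is deprecated as of PHP 8.2.

```php
<?php
// "SUÉLLEM" as ISO-8859-1 bytes: É is the single byte 0xC9.
$latin1 = "SU\xC9LLEM";

// Naming the source encoding makes the conversion independent of PHP settings.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// In UTF-8, É is the two-byte sequence 0xC3 0x89.
var_dump($utf8 === "SU\xC3\x89LLEM"); // bool(true)

// Converting back restores the original single-byte string.
$roundTrip = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
var_dump($roundTrip === $latin1); // bool(true)
```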

A small example of converting from UTF-8 to HTML-ENTITIES with mb_convert_encoding:

$str = 'É assim que você faz, na programação';

$converted = mb_convert_encoding($str, "HTML-ENTITIES", "UTF-8");

var_dump($converted, $str);

The output is:

string(63) "&Eacute; assim que voc&ecirc; faz, na programa&ccedil;&atilde;o"

string(40) "É assim que você faz, na programação"

Settings for default_charset

I don’t know if this is relevant, but the default_charset setting in my php.ini affected how the previous test result was displayed.

Look at what happened with default_charset set to ISO-8859-1:

ini_set('default_charset', 'ISO-8859-1');

$str = 'É assim que você faz, na programação';

$converted = mb_convert_encoding($str, "HTML-ENTITIES", "UTF-8");


var_dump($converted, $str);

The output was:

string(63) "&Eacute; assim que voc&ecirc; faz, na programa&ccedil;&atilde;o"
string(40) "É assim que você faz, na programação"

So, to check whether this is the problem described in your first question, try setting default_charset like this:

ini_set('default_charset', 'UTF-8');
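
A likely explanation for the version difference, offered as a sketch rather than a confirmed diagnosis: when the third argument is omitted, mb_convert_encoding takes the source encoding from mbstring's internal encoding, and PHP 5.6 changed the default of default_charset to UTF-8 and made mbstring fall back to it. The example below imitates both situations with mb_internal_encoding(). The assumption that the old server's internal encoding was ISO-8859-1 is mine and is not stated in the question.

```php
<?php
$latin1   = "SU\xC9LLEM";     // ISO-8859-1 bytes, as in the question's third test
$expected = "SU\xC3\x89LLEM"; // the same text in UTF-8

// Case 1: internal encoding is ISO-8859-1, so the implicit source matches the data.
mb_internal_encoding('ISO-8859-1');
$implicitLatin1 = mb_convert_encoding($latin1, 'UTF-8');

// Case 2: internal encoding is UTF-8, so the lone 0xC9 byte is treated as broken UTF-8.
mb_internal_encoding('UTF-8');
$implicitUtf8 = mb_convert_encoding($latin1, 'UTF-8');

var_dump($implicitLatin1 === $expected); // bool(true)
var_dump($implicitUtf8 === $expected);   // bool(false)
```

Passing the source encoding as the third argument avoids depending on either setting.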

Note: I ran these tests on PHP 5.5.9-1ubuntu4.11.
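
For the per-field conversion in the question's _fetch loop, something along these lines may behave the same on both PHP versions. It is only a sketch: toUtf8Row is a hypothetical helper, not part of ADOdb, and the ISO-8859-1 default is an assumption about the database's character set. It leaves nulls and non-string values untouched and skips strings that are already valid UTF-8, so data is not encoded twice.

```php
<?php
// Hypothetical helper, not part of ADOdb.
function toUtf8Row(array $row, $sourceEncoding = 'ISO-8859-1')
{
    foreach ($row as $i => $value) {
        // Convert only strings that are not already valid UTF-8.
        if (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
            $row[$i] = mb_convert_encoding($value, 'UTF-8', $sourceEncoding);
        }
    }
    return $row;
}

// One ISO-8859-1 string, a null, an integer, a UTF-8 string and plain ASCII.
$row = toUtf8Row(array("SU\xC9LLEM", null, 42, "SU\xC3\x89LLEM", 'plain'));
var_dump($row);
```

The validity check is a heuristic: an ISO-8859-1 string whose bytes happen to form valid UTF-8 would be left unconverted, and binary BLOB fields should not be passed through this function at all.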

  • Your answer doesn’t fully address the question. Could you add something about why the PHP update affected mb_convert_encoding even though it is not deprecated?

  • I agree about adding more information, but I don’t know how the update affected its behavior, and that wasn’t asked in the question anyway.

  • It was asked, in the first question.

  • Ah, but he only said "it’s not working". And maybe that statement isn’t even correct.

  • I’ll answer that here, hang on. Haha.

  • I don’t understand why either. I have two PHP versions: 5.2, which converts ASCII to UTF-8 with mb_convert_encoding, and 5.6, which does not. I only changed the version, so I had to use utf8_encode to go from ASCII to UTF-8. Strange!

  • I still don’t understand why it doesn’t work like before, but I now have a solution that works on both versions. Thank you, these tests helped a lot.
