I get different results when I convert from Character to Decimal in PHP and Java

When I convert á to decimal I get the result 225 with this code in Java:

public static int charToDec(String text){return (int) text.charAt(0);}

When I convert á to decimal I get the result 195 with this PHP code:

function charToDec($text){return ord($text);}

How can I get the same value in both languages?

Note that this only happens with "special" (non-ASCII) characters.

1 answer

This happens because ord() is not UTF-8 aware: it returns the value of the first byte of the string, not the code point of the first character. There are two ways to make the values match.

The following snippet shows what is happening:

$hex = unpack('H*', 'á')[1];
// = "c3a1", i.e. the bytes \xC3\xA1

echo hexdec($hex[0] . $hex[1]);
// = 195

So the first byte (\xC3) is 195, and that is what ord() returns. Here PHP holds á in its UTF-8 encoding, which is two bytes (\xC3\xA1), the first of which is 195. Java's charAt() returns a UTF-16 code unit instead, which for á is 225 (U+00E1).
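To see the same difference from the Java side, here is a minimal sketch. The class and method names are made up for illustration, and the character is written as the escape \u00E1 so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question or answer.
final class FirstByteDemo {

    // Mimics what PHP's ord() sees for a UTF-8 string:
    // the first byte of the UTF-8 encoding, as an unsigned value.
    static int firstUtf8Byte(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return bytes[0] & 0xFF; // Java bytes are signed; mask to get 0..255
    }

    // What the question's Java code returns: the first UTF-16 code unit.
    static int firstCodeUnit(String text) {
        return text.charAt(0);
    }

    public static void main(String[] args) {
        String s = "\u00E1"; // á
        System.out.println(firstUtf8Byte(s)); // 195
        System.out.println(firstCodeUnit(s)); // 225
    }
}
```

The two numbers describe different things: 195 is a byte of one particular encoding, while 225 is the character's Unicode value.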


The first option is to convert from UTF-8 to ISO-8859-1, for example:

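function charToDec($text)
{
    // Convert to ISO-8859-1, where every character is exactly one byte
    $text = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
    // Keep only the first byte
    $text = mb_substr($text, 0, 1, '8bit');

    // Read that byte as an unsigned integer
    return unpack('C', $text)[1];
}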

Thus:

charToDec('á'); //= 225

charToDec('a'); //= 97

I believe this is enough for characters that exist in ISO-8859-1, but it will not match Java in every case: a character outside that range cannot be represented in a single ISO-8859-1 byte.
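As a rough illustration of where the two approaches part ways, the sketch below applies the same ISO-8859-1 idea in Java and compares it with charAt(). The class and method names are hypothetical. Java's getBytes() replaces characters that ISO-8859-1 cannot encode with '?' (63); I would expect PHP's conversion to substitute something similar, but treat that as an assumption and check it on your setup.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for comparing the two approaches.
final class Latin1Demo {

    // Java analogue of the PHP function above:
    // encode as ISO-8859-1 and read the first byte as an unsigned value.
    static int latin1Value(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return bytes[0] & 0xFF;
    }

    public static void main(String[] args) {
        String aAcute = "\u00E1"; // á, exists in ISO-8859-1
        String euro = "\u20AC";   // €, does not exist in ISO-8859-1

        System.out.println(latin1Value(aAcute));    // 225
        System.out.println((int) aAcute.charAt(0)); // 225, same value

        System.out.println(latin1Value(euro));      // 63, the '?' replacement
        System.out.println((int) euro.charAt(0));   // 8364, no longer the same
    }
}
```

So á gives 225 both ways, while € is an example of a character whose value in Java (8364) cannot come out of a single-byte conversion.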


The other option is to use UTF-8 on both sides. This requires changes in both Java and PHP: in PHP you could use unpack as in the example shown above, and in Java some equivalent method.
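One possible Java counterpart is sketched below. It is only an illustration with made-up names: it returns every UTF-8 byte of the string as an unsigned value, which should correspond to what unpack('C*', $text) yields in PHP for a UTF-8 string.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not taken from the original answer.
final class Utf8Bytes {

    // Returns the UTF-8 bytes of the text as unsigned values (0..255).
    static int[] utf8Bytes(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF; // Java bytes are signed; mask to unsigned
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(utf8Bytes("\u00E1"))); // [195, 161]
        System.out.println(Arrays.toString(utf8Bytes("a")));      // [97]
    }
}
```

If what you actually want is the character's Unicode value rather than its bytes, comparing code points is another route: codePointAt(0) in Java, and mb_ord() in PHP 7.2 or later.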

  • Your answer helped me a lot, but could you add an example of a case that would not give the same result in Java? Thanks ;D
