How to convert an int to Base64 in PHP?

11

Base64 stores 6 bits in each character. An int64 or uint64 uses 64 bits, so it can be represented in 11 characters (64/6 ≈ 10.67, rounded up).

I tried to answer this question, but PHP fails to convert values correctly.

$int = 5460109885665973483;
echo base64_encode($int);

Returns:

NTQ2MDEwOTg4NTY2NTk3MzQ4Mw==

This is not what I want: it takes 26 characters (plus two padding characters) to represent 64 bits! This is insane. I do understand the reason: base64_encode treats the value as a string, not as an int. The decimal string is 19 bytes long, so PHP emits (19*8)/6 characters, rounded up to 26.
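To make the arithmetic concrete, here is a small sketch of the character counts. The variable names are only illustrative, and it assumes a 64-bit PHP build.

```php
<?php
// Sketch: why base64_encode($int) yields 26 characters plus padding.
$int = 5460109885665973483;

$asString = (string) $int;               // base64_encode() works on the string form
$bytes    = strlen($asString);           // 19 decimal digits, one byte each
$unpadded = (int) ceil($bytes * 8 / 6);  // characters needed without padding
$padded   = 4 * (int) ceil($bytes / 3);  // Base64 output is padded to a multiple of 4

echo $bytes, PHP_EOL;                            // 19
echo $unpadded, PHP_EOL;                         // 26
echo $padded, PHP_EOL;                           // 28
echo strlen(base64_encode($asString)), PHP_EOL;  // 28
echo (int) ceil(64 / 6), PHP_EOL;                // 11, the count for 8 raw bytes
```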

Other languages, such as Go, work at the byte level:

bt := make([]byte, 8)
binary.BigEndian.PutUint64(bt, 5460109885665973483)

fmt.Print(base64.StdEncoding.EncodeToString(bt))

Returns:

S8Y1Axm4FOs=

S8Y1Axm4FOs= is exactly 11 characters (ignoring the padding), which is exactly what 64 bits require. In this case you can recover the value by decoding the Base64 and then calling binary.BigEndian.Uint64.


How can I get the same result as the Go code in PHP?

3 answers

11


The best way to do this in PHP is with pack. This function lets you serialize the integer as raw bytes in big-endian byte order.

<?php

$byte_array = pack('J*', 5460109885665973483);
var_export( base64_encode($byte_array) );

// Output: 'S8Y1Axm4FOs='

To reverse this process, you can use the opposite function, unpack:

<?php

$encoded = "S8Y1Axm4FOs=";

$decoded = base64_decode($encoded);

var_export( unpack("J*", $decoded) );

// Output: [ 1 => 5460109885665973483 ]

The J format code stands for an unsigned 64-bit integer in big-endian byte order; the * repeater applies it to every remaining argument.
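As a minimal sketch, the two steps can be wrapped in a pair of helpers. The function names int64_to_base64 and base64_to_int64 are hypothetical, not part of PHP, and the length check is an assumption about how strict you want the decoder to be. Note that PHP has no unsigned integers, so values at or above 2^63 come back from unpack as negative numbers.

```php
<?php
// Hypothetical helpers; the names are illustrative only.
function int64_to_base64(int $value): string
{
    // 'J' = unsigned 64-bit, big-endian: always produces exactly 8 bytes.
    return base64_encode(pack('J', $value));
}

function base64_to_int64(string $encoded): int
{
    $bytes = base64_decode($encoded, true);
    if ($bytes === false || strlen($bytes) !== 8) {
        throw new InvalidArgumentException('Expected Base64 for exactly 8 bytes');
    }
    // unpack() returns a 1-indexed array for unnamed format codes.
    return unpack('J', $bytes)[1];
}

echo bin2hex(pack('J', 5460109885665973483)), PHP_EOL; // 4bc6350319b814eb
echo int64_to_base64(5460109885665973483), PHP_EOL;     // S8Y1Axm4FOs=
echo base64_to_int64('S8Y1Axm4FOs='), PHP_EOL;          // 5460109885665973483
```

The bin2hex line shows the 8 bytes that both PHP and the Go code feed into Base64, which is why the two outputs match.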

  • pack saves the day, as always. This is the first time I have seen a use for pack's other format codes besides C and H; I'll investigate them further. :D

3

The answer from @Valdeir Psr answers the question and solves the problem. However, I had a completely different idea for solving it, using bitwise operations.

I thought of simply splitting the value into 6-bit groups and then encoding each group as a Base64 character. This would not be resistant to side-channel attacks (just like PHP's native implementation), but I believe it is sufficient for the purpose.

I tried the idea and... it worked. So I'm sharing it here, although I will use pack.


So just do:

function base64_encode_int64(int $int) : string {
    $alfabeto = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';
    $base64 = '';

    for($l = 58; $l > 0; $l -= 6){
        $base64 .= $alfabeto[($int >> $l) & 0x3F];
    }
    $base64 .= $alfabeto[($int << 2) & 0x3F];

    return $base64;
}

The last shift goes in the opposite direction: only 4 bits remain, but a Base64 character needs 6. Two zero bits have to be appended at the end, which is why the value is shifted to the left instead of to the right.
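A small sketch of just that final step, using the same value and alphabet as above ($low4 and $group are illustrative names):

```php
<?php
// Sketch: the 4 lowest bits become the top 4 bits of the last 6-bit group.
$alfabeto = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';
$int = 5460109885665973483;      // lowest byte is 0xEB = 0b11101011

$low4  = $int & 0x0F;            // 0b1011   = 11, the bits not yet encoded
$group = ($int << 2) & 0x3F;     // 0b101100 = 44, same bits plus two zero bits

echo $low4, PHP_EOL;             // 11
echo $group, PHP_EOL;            // 44
echo $alfabeto[$group], PHP_EOL; // s
```

The resulting "s" is the last character of S8Y1Axm4FOs.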

To decode, we combine the groups with bitwise OR (|), which I believe is the simplest solution.

function base64_decode_int64(string $base64) {
    $alfabeto = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';
    $int = 0;

    if(mb_strlen($base64, '8bit') !== 11){
        return false;
    }

    for($l = 58; $l > -3; $l -= 6){
        $letra = strpos($alfabeto, $base64[(58-$l)/6]);
        if($letra === false) {
            return false;
        }
        if($l > 0){
            $int |= ($letra) << $l;
        }else{
            $int |= ($letra) >> 2;
        }
    }


    return $int;
}

I don’t believe strpos is the best option, and the number of ifs bothers me a little. They are necessary because the input ($base64) must use the same alphabet and be exactly 11 characters long, so the function has to return false on error.

I moved the if($l > 0){ inside the for, but I don’t believe it’s ideal. I did it so I wouldn’t have to create a new condition outside the loop (duplicating the if($letra)), but I believe there must be a way to make this "universal", maybe by doing some shifts beforehand (in the other direction), I don’t know.
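One possible way to avoid both strpos and the special case, sketched here as an assumption rather than a drop-in replacement: swap the manual loop for built-ins by mapping the URL-safe alphabet back to the standard one and letting base64_decode and unpack do the work. The name base64url_decode_int64 is hypothetical. Unlike the loop above, this gives up any attempt at controlling the lookup yourself, so it is no better regarding side channels.

```php
<?php
// Hypothetical alternative decoder built on PHP's own functions.
function base64url_decode_int64(string $base64)
{
    // Exactly 11 characters from the URL-safe alphabet, otherwise false.
    if (preg_match('/\A[A-Za-z0-9_-]{11}\z/', $base64) !== 1) {
        return false;
    }
    // Map '-' and '_' back to '+' and '/', and restore the padding character.
    $bytes = base64_decode(strtr($base64, '-_', '+/') . '=');
    // 'J' reads the 8 bytes as a big-endian 64-bit integer.
    return unpack('J', $bytes)[1];
}

echo base64url_decode_int64('S8Y1Axm4FOs'), PHP_EOL; // 5460109885665973483
var_dump(base64url_decode_int64('too short'));       // bool(false)
```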


Now the tests:

echo $int = 5460109885665973483;
echo PHP_EOL;
echo $b64 = base64_encode_int64($int);
echo PHP_EOL;
echo base64_decode_int64($b64);

Returns:

5460109885665973483
S8Y1Axm4FOs
5460109885665973483

-1

I did this, but I don’t know if it’s acceptable... I took the method for converting binary to hexadecimal/octal and applied it to Base64 (i.e., converting each 6-bit group into the corresponding character), and to decode I used positional conversion.

This method does not apply padding, but the b64_decode function exactly reverses the effect of b64_encode. So even if it is not completely correct according to the Base64 standard (which I do not know... I just extended the hexadecimal conversion method), it should serve to save space and encode integers reliably (I hope).

<?php
  function b64_encode(int $number):string {
    $binary = decbin($number);
    $length = strlen($binary);
    $digits = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    $result = "";

    // Each iteration converts one 6-bit group into one character from the $digits string
    for ($i = $length - 6; $i > -6; $i -= 6) {
      $l = $i >= 0 ? 6 : 6 + $i;
      if ($i < 0) $i = 0;
      $sixBits = substr($binary, $i, $l);
      $result = $digits[bindec($sixBits)] . $result;
    }

    return $result;
  }

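  function b64_decode(string $number): int {
    $digits = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    $length = strlen($number);
    $result = 0;

    // Positional conversion, most significant digit first: shift the
    // accumulator by 6 bits and add the next digit. Integer shifts avoid the
    // float overflow that 64 ** n causes for the 11-character strings
    // produced by negative inputs.
    for ($i = 0; $i < $length; $i++) {
      $result = ($result << 6) | strpos($digits, $number[$i]);
    }

    return $result;
  }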
