The problem is that the function str_pad assumes that each character occupies one byte. When you use characters that take more than one byte (such as ã), the function pads to the wrong width.
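A small sketch of the mismatch, assuming the string is UTF-8 and the mbstring extension is loaded. The word and the target width are only illustrative:

```php
<?php
// "São" written with an escape so the result does not depend on the file encoding.
$word = "S\u{00E3}o";                    // 3 characters, 4 bytes in UTF-8

$bytes = strlen($word);                  // counts bytes
$chars = mb_strlen($word, 'UTF-8');      // counts characters

// str_pad pads up to 6 *bytes*, so it adds 2 dashes instead of 3.
$padded = str_pad($word, 6, '-');
$paddedChars = mb_strlen($padded, 'UTF-8');

var_dump($bytes, $chars, $padded, $paddedChars);
```

The padded string ends up with 5 characters rather than the 6 requested, which is exactly the "one character short" symptom.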
On the English Stack Overflow there is a question about this, and it has four answers. Judging by the comments, two of the answers have problems (including the accepted one) and the other two should be adequate (I did not test them, however). All of the answers there consist of writing a different function capable of handling multibyte characters.
Here is Wes's solution:
function mb_str_pad($str, $pad_len, $pad_str = ' ', $dir = STR_PAD_RIGHT, $encoding = NULL)
{
    $encoding = $encoding === NULL ? mb_internal_encoding() : $encoding;
    $padBefore = $dir === STR_PAD_BOTH || $dir === STR_PAD_LEFT;
    $padAfter = $dir === STR_PAD_BOTH || $dir === STR_PAD_RIGHT;
    $pad_len -= mb_strlen($str, $encoding);
    $targetLen = $padBefore && $padAfter ? $pad_len / 2 : $pad_len;
    $strToRepeatLen = mb_strlen($pad_str, $encoding);
    $repeatTimes = ceil($targetLen / $strToRepeatLen);
    $repeatedString = str_repeat($pad_str, max(0, $repeatTimes)); // safe if used with valid unicode sequences (any charset)
    $before = $padBefore ? mb_substr($repeatedString, 0, floor($targetLen), $encoding) : '';
    $after = $padAfter ? mb_substr($repeatedString, 0, ceil($targetLen), $encoding) : '';
    return $before . $str . $after;
}
Here is Jack's solution:
function mb_str_pad($input, $pad_length, $pad_string = ' ', $pad_type = STR_PAD_RIGHT, $encoding = 'UTF-8')
{
    $input_length = mb_strlen($input, $encoding);
    $pad_string_length = mb_strlen($pad_string, $encoding);
    if ($pad_length <= 0 || ($pad_length - $input_length) <= 0) {
        return $input;
    }
    $num_pad_chars = $pad_length - $input_length;
    switch ($pad_type) {
        case STR_PAD_RIGHT:
            $left_pad = 0;
            $right_pad = $num_pad_chars;
            break;
        case STR_PAD_LEFT:
            $left_pad = $num_pad_chars;
            $right_pad = 0;
            break;
        case STR_PAD_BOTH:
            $left_pad = floor($num_pad_chars / 2);
            $right_pad = $num_pad_chars - $left_pad;
            break;
    }
    $result = '';
    for ($i = 0; $i < $left_pad; ++$i) {
        $result .= mb_substr($pad_string, $i % $pad_string_length, 1, $encoding);
    }
    $result .= $input;
    for ($i = 0; $i < $right_pad; ++$i) {
        $result .= mb_substr($pad_string, $i % $pad_string_length, 1, $encoding);
    }
    return $result;
}
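For the common case where the padding is a single-byte character such as a space or a dash, a shorter workaround may be enough. This is only a sketch and is not taken from either answer above. The helper name pad_chars is hypothetical, and it assumes UTF-8 input, a single-byte pad string, and the mbstring extension:

```php
<?php
// Hypothetical helper, not from the answers quoted above.
// Assumes $input is valid UTF-8 and $pad contains only single-byte characters.
function pad_chars(string $input, int $length, string $pad = ' ', int $type = STR_PAD_RIGHT): string
{
    // Surplus bytes contributed by multibyte characters in the input.
    $extraBytes = strlen($input) - mb_strlen($input, 'UTF-8');

    // str_pad counts bytes, so raise its target length by that surplus.
    return str_pad($input, $length + $extraBytes, $pad, $type);
}

$left = pad_chars("S\u{00E3}o", 6, '-', STR_PAD_LEFT);
$both = pad_chars("S\u{00E3}o", 6, '-', STR_PAD_BOTH);

var_dump($left, mb_strlen($left, 'UTF-8'));
var_dump($both, mb_strlen($both, 'UTF-8'));
```

If the pad string itself can contain multibyte characters, this shortcut no longer holds and a function like the two above is needed.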
@Victorstafusa do you know what the problem could be?
– Hugo Borges
What is the file encoding? The thing is that str_pad() handles bytes, not characters; accented characters take up 2 or more bytes, so the final string ends up one character short. I ran a test here applying utf8_decode() to the first argument and it worked. There should be a better way to solve this. – rray