str_pad() does not work with multi-byte characters because it operates on bytes: it copies the pad string one byte at a time, not one whole character at a time.
The relevant part of the original str_pad source is:
    switch (pad_type_val) {
        case STR_PAD_RIGHT:
            left_pad = 0;
            right_pad = num_pad_chars;
            break;
        case STR_PAD_LEFT:
            left_pad = num_pad_chars;
            right_pad = 0;
            break;
        case STR_PAD_BOTH:
            left_pad = num_pad_chars / 2;
            right_pad = num_pad_chars - left_pad;
            break;
    }
    /* First we pad on the left. */
    for (i = 0; i < left_pad; i++)
        ZSTR_VAL(result)[ZSTR_LEN(result)++] = pad_str[i % pad_str_len];
    /* Then we copy the input string. */
    memcpy(ZSTR_VAL(result) + ZSTR_LEN(result), ZSTR_VAL(input), ZSTR_LEN(input));
    ZSTR_LEN(result) += ZSTR_LEN(input);
    /* Finally, we pad on the right. */
    for (i = 0; i < right_pad; i++)
        ZSTR_VAL(result)[ZSTR_LEN(result)++] = pad_str[i % pad_str_len];
    ZSTR_VAL(result)[ZSTR_LEN(result)] = '\0';
    RETURN_NEW_STR(result);
Source.
Note the i % pad_str_len: each iteration appends a single byte, so the padding can stop in the middle of a multi-byte character and leave an incomplete byte sequence behind. Also, if you are using chr(160), that byte is the non-breaking space in Latin1, not in UTF-8.
In Latin1, the byte A0 represents the non-breaking space. In UTF-8 the same character takes two bytes, C2 A0. If one of them is cut off, for example leaving C2 on its own, it is displayed as a ?.
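The truncation described above can be reproduced with a short script. This is only a sketch: the input 'ab' and the target length 5 are arbitrary values chosen so that the padding ends in the middle of a character, and it assumes the mbstring extension is available.

```php
<?php
// Pad string: a UTF-8 non-breaking space, which is two bytes (C2 A0).
$nbsp = "\xc2\xa0";

// str_pad() counts bytes: 'ab' is 2 bytes, so 3 pad bytes are appended.
$padded = str_pad('ab', 5, $nbsp);

// The three pad bytes are C2 A0 C2: the last character is cut in half.
echo bin2hex($padded), PHP_EOL;                      // 6162c2a0c2
var_dump(mb_check_encoding($padded, 'UTF-8'));       // bool(false)

// chr(160) is the single byte A0: valid Latin1, but not valid UTF-8 on its own.
var_dump(mb_check_encoding(chr(160), 'UTF-8'));      // bool(false)
var_dump(mb_check_encoding(chr(160), 'ISO-8859-1')); // bool(true)
```

The dangling C2 at the end of the result is the byte that shows up as a ?.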
If you want a "new version" of str_pad, we could create an mb_str_pad():
    const STR_PAD_INSERT_ALL = 4;

    function mb_str_pad(string $input, int $pad_length, string $pad_string, int $pad_type, string $pad_encoding = 'utf8') : string {
        $result = '';
        // Default mode: copy one character per iteration, cycling through $pad_string.
        $pad_insert_all = 0;
        $pad_insert_limit = 1;
        // Lengths are measured in characters, not bytes.
        $pad_str_len = mb_strlen($pad_string, $pad_encoding);
        $input_len = mb_strlen($input, $pad_encoding);

        if ($pad_length < 0 || $pad_length <= $input_len) {
            return $input;
        }

        if (($pad_type & STR_PAD_INSERT_ALL) === STR_PAD_INSERT_ALL) {
            // Insert-all mode: copy the whole pad string on every iteration.
            $pad_insert_all = PHP_INT_MAX;
            $pad_insert_limit = null;
            $pad_type -= STR_PAD_INSERT_ALL;
        }

        if ($pad_str_len === 0) {
            trigger_error("Padding string cannot be empty", E_WARNING);
            return $input;
        }

        if ($pad_type < STR_PAD_LEFT || $pad_type > STR_PAD_BOTH) {
            trigger_error("Padding type has to be STR_PAD_LEFT, STR_PAD_RIGHT, or STR_PAD_BOTH", E_WARNING);
            return $input;
        }

        $num_pad_chars = $pad_length - $input_len;

        if ($num_pad_chars >= PHP_INT_MAX) {
            trigger_error("Padding length is too long", E_WARNING);
            return $input;
        }

        switch ($pad_type) {
            case STR_PAD_RIGHT:
                $left_pad = 0;
                $right_pad = $num_pad_chars;
                break;
            case STR_PAD_LEFT:
                $left_pad = $num_pad_chars;
                $right_pad = 0;
                break;
            case STR_PAD_BOTH:
                // intdiv() keeps the result an int (floor() would return a float).
                $left_pad = intdiv($num_pad_chars, 2);
                $right_pad = $num_pad_chars - $left_pad;
                break;
        }

        // "& ~$pad_insert_all" leaves the offset unchanged in the default mode
        // and forces it to 0 in insert-all mode.
        for ($i = 0; $i < $left_pad; $i++) {
            $result .= mb_substr($pad_string, ($i % $pad_str_len) & ~$pad_insert_all, $pad_insert_limit, $pad_encoding);
        }

        $result .= $input;

        for ($i = 0; $i < $right_pad; $i++) {
            $result .= mb_substr($pad_string, ($i % $pad_str_len) & ~$pad_insert_all, $pad_insert_limit, $pad_encoding);
        }

        return $result;
    }
This requires PHP 7+.
This version is closely based on the original PHP implementation shown above, with some changes.
It supports multi-byte characters, so you can do:
mb_str_pad($nome, 30, "\xc2\xa0", STR_PAD_BOTH, 'utf8');
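To see why counting characters instead of bytes matters for a call like this one, here is a small sketch. The name "João", the width 6 and the '-' pad character are illustrative values only, and the mbstring extension is assumed to be loaded.

```php
<?php
// "João" written with explicit bytes so the example does not depend on
// how this file itself is saved.
$name = "Jo\xc3\xa3o";

echo strlen($name), PHP_EOL;             // 5 (bytes)
echo mb_strlen($name, 'UTF-8'), PHP_EOL; // 4 (characters)

// Asking str_pad() for a width of 6 adds only one '-', because the
// input already occupies 5 bytes.
$byte_padded = str_pad($name, 6, '-');
echo mb_strlen($byte_padded, 'UTF-8'), PHP_EOL; // 5, not 6

// Counting characters instead gives the intended width.
$missing = 6 - mb_strlen($name, 'UTF-8');
$char_padded = $name . str_repeat('-', $missing);
echo mb_strlen($char_padded, 'UTF-8'), PHP_EOL; // 6
```

So even with a single-byte pad character, str_pad() produces a string that is too short whenever the input itself contains multi-byte characters.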
Differences from the original version:
PS: Assuming I have not introduced any bugs.
Multi-byte support:
It supports characters that require multiple bytes. You can specify the encoding to use; UTF-8 is the default.
A new "STR_PAD_INSERT_ALL":
If the pad string has more than one character (for example "abc"), you can have the entire string inserted at every step instead of cycling through its characters one at a time. This has a side effect: the number of inserted characters is no longer measured, so the result can be longer than the requested length. To use it, pass STR_PAD_BOTH | STR_PAD_INSERT_ALL, but that is not necessary in your case.
Return value in case of error:
Even when a warning is issued, the function returns the original string, which differs from the behavior of the original function.
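The STR_PAD_INSERT_ALL side effect described above can be illustrated in isolation. The two helpers below, cycle_pad() and insert_all_pad(), are hypothetical names introduced only for this sketch; they mimic what the padding loops in mb_str_pad() do in each mode and are not part of PHP or of the function itself.

```php
<?php
// Hypothetical helper: default behaviour, one character per step,
// cycling through the pad string.
function cycle_pad(string $pad, int $count, string $encoding = 'UTF-8'): string
{
    $len = mb_strlen($pad, $encoding);
    $out = '';
    for ($i = 0; $i < $count; $i++) {
        $out .= mb_substr($pad, $i % $len, 1, $encoding);
    }
    return $out;
}

// Hypothetical helper: STR_PAD_INSERT_ALL behaviour, the whole pad
// string per step.
function insert_all_pad(string $pad, int $count): string
{
    return str_repeat($pad, $count);
}

echo cycle_pad('abc', 4), PHP_EOL;      // abca
echo insert_all_pad('abc', 4), PHP_EOL; // abcabcabcabc
```

With four padding steps, the cycling mode yields exactly four characters, while the insert-all mode yields twelve, which is why the final length is no longer guaranteed.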
I tested it on ideone.com and it worked. Is the file being saved with UTF-8 encoding?
– Marcos Xavier
Marcos Xavier, I'm using Notepad++, and yes, I saved it with the "UTF-8 without BOM" option. But I can't see the spaces: only one is shown and the others are suppressed in the view. PS: using str_replace() as suggested by Isac works fine, but I would like to solve it with str_pad() alone.
– Fernandes