Introduction
I'm working on a mailbox reader that will later have to filter messages by sender. The problem is the encoding of some subjects.
I connect to the mail server with the imap_open function:
$mail_box = imap_open("{" . $incoming_server . ":" . $port . "/imap/ssl/novalidate-cert}INBOX", $username, $password) or die();
Then I get the header information with the imap_headerinfo function:
$header = imap_headerinfo($mail_box, $num_da_mensagem);
Between these two steps I do not manipulate anything; everything is handled internally by PHP itself.
Difficulty
The problem is that when I call print_r on $header, the subject of some messages comes back as an encoded string like this:
[subject] => =?utf-8?B?UkVTOiBSRVM6IFtFWFRFUk5BTF0gUmU6IEluZm9ybWHDp8O1ZXMgc29icmUg?= =?utf-8?B?YSBBdGl2YcOnw6NvIGRvcyBQcm9kdXRvcyBlIFNlcnZpw6dvcyBDb250cmF0?= =?utf-8?Q?ados_-_WJINTERNET?=
To decode it I tried htmlentities and also a custom function, shown below.
// Convert $string to $to_encoding, detecting the source encoding when none is given.
function convert_encoding ($string, $to_encoding, $from_encoding = '') {
    if ($from_encoding == '')
        $from_encoding = $this->detect_encoding($string);
    if ($from_encoding == $to_encoding)
        return $string;
    return mb_convert_encoding($string, $to_encoding, $from_encoding);
}

// Return 'UTF-8' when the bytes form valid UTF-8, otherwise let mbstring guess.
function detect_encoding($string){
    if (preg_match('%^(?: [\x09\x0A\x0D\x20-\x7E] | [\xC2-\xDF][\x80-\xBF] | \xE0[\xA0-\xBF][\x80-\xBF] | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2} | \xED[\x80-\x9F][\x80-\xBF] | \xF0[\x90-\xBF][\x80-\xBF]{2} | [\xF1-\xF3][\x80-\xBF]{3} | \xF4[\x80-\x8F][\x80-\xBF]{2} )*$%xs', $string))
        return 'UTF-8';
    return mb_detect_encoding($string, array('UTF-8', 'ASCII', 'ISO-8859-1', 'JIS', 'EUC-JP', 'SJIS'));
}
So the call would look like this: convert_encoding($header->subject, 'UTF-8');
But nothing happens. I suspect this is because the string is not in a different character encoding but in some predefined format. I also do not yet understand why some subjects arrive as plain text and others arrive like this.
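The suspicion about a predefined format seems to hold: the string matches the "encoded-word" syntax from RFC 2047, =?charset?encoding?payload?=, where the encoding is B (Base64) or Q (a quoted-printable variant). The sketch below only splits the subject from this post into those parts so the structure is visible. The regular expression and the variable names are my own illustration, not something imap_headerinfo provides.

```php
<?php
// Subject copied verbatim from the print_r output above.
$subject = '=?utf-8?B?UkVTOiBSRVM6IFtFWFRFUk5BTF0gUmU6IEluZm9ybWHDp8O1ZXMgc29icmUg?= '
         . '=?utf-8?B?YSBBdGl2YcOnw6NvIGRvcyBQcm9kdXRvcyBlIFNlcnZpw6dvcyBDb250cmF0?= '
         . '=?utf-8?Q?ados_-_WJINTERNET?=';

// Illustrative pattern for one encoded-word: =?charset?encoding?payload?=
$pattern = '/=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=/';
preg_match_all($pattern, $subject, $words, PREG_SET_ORDER);

// Print the three components of every encoded-word found in the subject.
foreach ($words as [$whole, $charset, $encoding, $payload]) {
    $kind = strtoupper($encoding) === 'B' ? 'Base64' : 'Q (quoted-printable variant)';
    printf("charset=%s | encoding=%s | payload=%s\n", $charset, $kind, $payload);
}
```

This is why mb_convert_encoding has no visible effect here: the raw header is already plain ASCII, and the real text only appears after the Base64 or Q payload has been decoded.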
What I need
I would like to know why some messages arrive with the Subject encoded and others do not. Understanding the root cause may help me find a viable solution.
If it is a purely technical problem, what technique can I use to convert this encoding into something readable?
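As far as I understand it, mail headers are restricted to ASCII, so a mail client has to wrap any subject containing accented or other non-ASCII characters in RFC 2047 encoded-words, while pure ASCII subjects are sent unchanged. That would explain why only some messages look like this. Below is a minimal sketch of a decoder under that assumption. The function name decode_mime_header is hypothetical, and the code assumes the mbstring extension is loaded and recognises whatever non-UTF-8 charset a header declares.

```php
<?php
/**
 * Hypothetical helper: decode RFC 2047 encoded-words in a header value to UTF-8.
 * A sketch only; it does not validate the 75-character limit or malformed input.
 */
function decode_mime_header(string $header): string
{
    // Whitespace between two adjacent encoded-words is a separator, not text.
    $header = preg_replace('/(\?=)\s+(=\?)/', '$1$2', $header);

    return preg_replace_callback(
        '/=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=/',
        function (array $m): string {
            [, $charset, $encoding, $payload] = $m;

            if (strtoupper($encoding) === 'B') {
                $bytes = base64_decode($payload);
            } else {
                // In "Q" encoding an underscore stands for a space.
                $bytes = quoted_printable_decode(str_replace('_', ' ', $payload));
            }

            // Already UTF-8: return the bytes untouched.
            if (strcasecmp($charset, 'UTF-8') === 0) {
                return $bytes;
            }
            // Assumption: mbstring knows the declared charset.
            return mb_convert_encoding($bytes, 'UTF-8', $charset);
        },
        $header
    );
}

// Subject copied from the question.
$subject = '=?utf-8?B?UkVTOiBSRVM6IFtFWFRFUk5BTF0gUmU6IEluZm9ybWHDp8O1ZXMgc29icmUg?= '
         . '=?utf-8?B?YSBBdGl2YcOnw6NvIGRvcyBQcm9kdXRvcyBlIFNlcnZpw6dvcyBDb250cmF0?= '
         . '=?utf-8?Q?ados_-_WJINTERNET?=';

echo decode_mime_header($subject), "\n";
```

PHP also ships built-in decoders for this format, such as imap_utf8, imap_mime_header_decode, iconv_mime_decode and mb_decode_mimeheader. One of those is probably preferable to hand-written code when the corresponding extension is available; the sketch is mainly meant to show what they do.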
Great, I learned something new. Thank you so much for sharing this knowledge.
– Adan Ribeiro