PHP file() returns an unknown character (�) between each letter

I am calling file() on a file, and it returns an unknown character between each letter of the document.

alfa.dic:

Adão
Andy

Code:

$dictionary = file('alfa.dic', FILE_IGNORE_NEW_LINES);

var_dump:

string '�A�d�ã�o�' (length=9)
string '�A�n�d�y�' (length=9)

I thought it was a charset problem, but even utf8_encode() does not remove these characters.

How can I clear each line of these unknown characters?

  • 1

    Do you know what the charset of that file is? Which OS are you running this on?

  • The charset of the file is 'UTF-16 LE with BOM'.

  • 1

    Have you tried using iconv? iconv('UTF-16LE', 'UTF-8', $string_do_file);

  • If not, try mb_convert_encoding: mb_convert_encoding($string_do_file , 'UTF-8' , 'UTF-16LE');

  • Well, anyway, take a look at this question: http://stackoverflow.com/questions/6980068/how-to-convert-utf-16le-to-utf-8-in-php
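
The comments point at the encoding because UTF-16LE stores each character in at least two bytes. A minimal sketch of what that looks like at the byte level, assuming a hand-built sample string in place of the real alfa.dic (the variable names are illustrative, not from the thread):

```php
<?php
// Illustration only: "Adão" hand-encoded as UTF-16LE with a BOM,
// standing in for the first bytes of alfa.dic.
$sample = "\xFF\xFE" . "A\x00" . "d\x00" . "\xE3\x00" . "o\x00";

// Each character here takes two bytes, and the second byte is NUL (00).
// Those NUL bytes are what var_dump shows as unknown characters.
$hex = bin2hex($sample);
echo $hex, PHP_EOL; // fffe41006400e3006f00

// Hypothetical check: a leading FF FE pair is consistent with a
// UTF-16LE byte order mark.
$hasUtf16LeBom = strncmp($sample, "\xFF\xFE", 2) === 0;
var_dump($hasUtf16LeBom); // bool(true)
```

utf8_encode() does not help here because it treats its input as ISO-8859-1, so the NUL bytes pass through unchanged.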

1 answer

2

@gabrielhof's question helped me realize that the problem was the file's charset: it was 'UTF-16 LE with BOM' instead of UTF-8.

In the Sublime Text 3 editor, I went to

File -> Save with Encoding -> UTF-8

and that solved the problem on the PHP side. Thank you.
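
If re-saving the file in an editor is not an option, the conversion suggested in the comments can also be done in PHP before splitting into lines. A rough sketch, assuming the file really is UTF-16LE; the helper name readUtf16LeLines and the temporary sample file are hypothetical, used only so the example runs on its own:

```php
<?php
// Hypothetical helper, not from the original thread: reads a UTF-16LE
// file and returns its lines as UTF-8 strings.
function readUtf16LeLines(string $path): array
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Drop the UTF-16LE byte order mark (bytes FF FE) if present.
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        $raw = substr($raw, 2);
    }

    // Convert the whole buffer first. Splitting raw UTF-16 on "\n",
    // as file() does, is what leaves stray NUL bytes in every line.
    $utf8 = iconv('UTF-16LE', 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException("Conversion failed for $path");
    }

    // Split on any common line ending, ignoring trailing newlines.
    return preg_split('/\r\n|\r|\n/', rtrim($utf8, "\r\n"));
}

// Build a small UTF-16LE sample with a BOM so the sketch is runnable;
// in practice $path would be 'alfa.dic'.
$path = tempnam(sys_get_temp_dir(), 'dic');
file_put_contents($path, "\xFF\xFE" . iconv('UTF-8', 'UTF-16LE', "Adão\nAndy\n"));

$dictionary = readUtf16LeLines($path);
var_dump($dictionary);

unlink($path);
```

mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE') could replace the iconv() call if the mbstring extension is available.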

  • 1

    I'm glad you solved it, but I'd like to take a closer look at this problem anyway. Would it be possible for you to make this .DIC file available?

  • It comes from this site: http://www.winedt.org/Dict/. Simply download any dictionary in UNICODE.
