1
I am passing the Maxmind csv database to mysql and some accented characters are giving problem. In Latin appears a ?
and in UTF nothing appears.
CSV : Östra Frölunda
UTF8 :
LATIN1: ?stra Fr?lunda
I used the code below that I found on a site to import the whole file:
CREATE TABLE `location` (
`locID` int(10) unsigned NOT NULL,
`country` char(2) default NULL,
`region` char(2) default NULL,
`city` varchar(45) default NULL,
`postalCode` char(7) default NULL,
`latitude` double default NULL,
`longitude` double default NULL,
`dmaCode` char(3) default NULL,
`areaCode` char(3) default NULL,
PRIMARY KEY (`locID`),
KEY `Index_Country` (`country`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 ROW_FORMAT=FIXED;
load data infile 'GeoLiteCity-Location.csv'
into table `location`
fields terminated by ',' optionally enclosed by '"'
lines terminated by '\n'
ignore 2 lines;
csv has other cities with illegible characters, but these I won’t even care at the moment, but I wanted to care the way it is to then sanitize.
How did you determine that the CSV was latin1 to choose the charset? One of the first things would be to correctly evaluate the CSV (even before importing).
– Bacco
From what I saw on another site, the file is iso. All examples I saw the charset is latin1... the only one I saw as UTF did the file conversion.
– Papa Charlie
Latin 1 is iso-8859-1 (practically). But without being sure, it gets complicated. The best thing is to look in a code editor to be sure, or even open CSV as text in a browser that has manual coding option for you to test which works. For those who have a hexadecimal editor at hand also just look at the bytes. Without determining the correct charset, everything will be at the base of the kick (and an answer that fits your case may not suit anyone else).
– Bacco
Maxmind’s own PHP code that works with . DAT, defines ISO usage:
mb_internal_encoding('ISO-8859-1')
. So I believe that the other files are iso too, but anyway, then I try to find out more efficiently which charset.– Papa Charlie
Great chance to be, and in this case even if it is, need to see how it is that you are checking the data on the screen (one thing is to save wrong in DB, another is to view)
– Bacco
But if the charset were the same it would not be recorded as csv?
– Papa Charlie
– Peter
@Papacharlie Consider commenting on this question the current situation and maybe even contribute an answer if case already solved.
– Ismael
@The Maxmind csv was incomplete and had some errors - in addition to not converting the characters correctly. I decided to leave the free version.
– Papa Charlie